Guarantees and verification for perfect DOCX round-trip preservation.

1. Round-Trip Guarantee

Uniword achieves 100% round-trip fidelity for DOCX documents through complete OOXML modeling. Every OOXML element is represented as a Ruby object via lutaml-model, ensuring that load-modify-save cycles preserve all original content.

Guarantee               Details
Content preservation    All text, formatting, and structure maintained
Element coverage        All 760 OOXML elements from 22 namespaces modeled
Namespace compliance    All required OOXML namespaces supported
Encoding                UTF-8 encoding maintained throughout
Model-driven            All XML structures represented as Ruby objects

2. Round-Trip Example

# Load document
original = Uniword::Document.open('complex.docx')

# Modify it
original.add_paragraph("New content")

# Save back -- EVERYTHING preserved
original.save('modified.docx')

# Verify: modified.docx has ALL original content + new paragraph
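
The final "Verify" step can be made concrete with a small helper. This is a minimal sketch that works on plain arrays of paragraph text, not on Uniword objects; how the text is extracted from each document is left out, and the name preserved_in_order? is hypothetical. It checks that every original paragraph still appears in the saved document, in the original order.

```ruby
# Sketch only: `before` and `after` are arrays of paragraph strings.
# Extracting them from a Uniword document is assumed to happen elsewhere.
def preserved_in_order?(before, after)
  pos = 0
  before.all? do |text|
    # Search only the part of `after` that has not been matched yet.
    offset = after[pos..].index(text)
    next false if offset.nil?

    pos += offset + 1
    true
  end
end

before = ["Title", "Scope"]
after  = ["Title", "Scope", "New content"]
puts preserved_in_order?(before, after)
```

A check like this only covers text order. It says nothing about formatting or XML-level differences, which is what the uniword verify command and the test results in the later sections address.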

3. Verified Test Results

Test Case                   Size      Nodes   Result
ISO 8601-1:2019/Amd1 DOCX   295 KB    -       0 normative differences
ISO 690:2021 DOCX           4.8 MB    ~130K   0 normative differences
ISO DIS 5878 DOCX           29.4 MB   970K    0 normative differences
MHTML documents             varies    -       Content preserved
Math equations              varies    -       Preserved via m: namespace
Bookmarks                   varies    -       ID preservation

These results were achieved after implementing complete property coverage for paragraph, spacing, table, and section elements.

3.1. Properties Added for ISO Round-Trip Pass

The following properties were added to achieve 0 normative differences on ISO documents:

Element               Properties
ParagraphProperties   autoSpaceDE, autoSpaceDN, adjustRightInd, pageBreakBefore, widowControl (converted to BooleanElement), sectPr
Spacing               beforeAutospacing, afterAutospacing
TableLayout           Converted to wrapper class with type attribute
TableCellProperties   hideMark
TableRowProperties    divId
SectionProperties     rsidSect
PageNumbering         fmt
Paragraph             rsidDel (revision deletion tracking)

Each Boolean property uses a BooleanElement wrapper class that correctly round-trips the XML presence/absence pattern used by OOXML.
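
The presence/absence pattern is easiest to see with a small stand-in class. The sketch below is not Uniword's BooleanElement; OnOffSketch and its methods are hypothetical names used only for illustration. It shows why a plain true/false attribute is not enough: an OOXML on/off element can be absent (the setting is inherited), present with no w:val (on), or present with an explicit w:val.

```ruby
# Illustrative stand-in; NOT Uniword's BooleanElement implementation.
class OnOffSketch
  # Values that switch an OOXML on/off element off.
  FALSE_VALUES = %w[0 false off].freeze

  attr_reader :val

  # val is the raw w:val attribute text, or nil when the attribute is omitted.
  def initialize(val = nil)
    @val = val
  end

  # An element that is present without w:val counts as "on".
  def on?
    !FALSE_VALUES.include?(val)
  end

  # Re-emit exactly what was read so the original spelling survives.
  def to_xml(tag)
    val.nil? ? "<w:#{tag}/>" : %(<w:#{tag} w:val="#{val}"/>)
  end
end

# Three distinct states for a property such as widowControl:
absent   = nil                   # element missing: the setting is inherited
implicit = OnOffSketch.new       # <w:widowControl/>
explicit = OnOffSketch.new("0")  # <w:widowControl w:val="0"/>

puts implicit.to_xml("widowControl")
puts explicit.to_xml("widowControl")
```

Keeping the raw attribute text, instead of collapsing it to true or false, is one way to make sure that a document written with w:val="0" is not silently rewritten as w:val="false" on save.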

4. Verification with the CLI

Use the uniword verify command to validate document structure:

# Full verification (OPC + semantic)
uniword verify document.docx

# Enable XSD schema validation (slower, thorough)
uniword verify document.docx --xsd

# Machine-readable output
uniword verify document.docx --json

The verifier has three layers of checks; the XSD layer runs only when --xsd is given:

  1. OPC Package — ZIP integrity, content types, relationships, part presence

  2. XSD Schema — XML schema validation against 40 bundled XSD schemas

  3. Word Document — 10 built-in semantic rules for cross-references, styles, numbering, footnotes, headers, bookmarks, images, tables, fonts, theme, and settings
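
The --json flag makes the verifier usable from scripts. The following is a minimal sketch of a Ruby wrapper around the command. It assumes that the command prints a single JSON document to standard output and that the exit status reflects the outcome; neither is confirmed here, and the shape of the JSON report is not assumed at all. The helper name verify_json and its base_command parameter are hypothetical.

```ruby
require "json"
require "open3"

# Runs the verifier and returns the parsed report.
# base_command defaults to the command shown above; it is a parameter so the
# sketch can be pointed at another executable.
# Assumption: the command writes one JSON document to stdout.
def verify_json(path, base_command: %w[uniword verify])
  stdout, status = Open3.capture2(*base_command, path, "--json")
  { ok: status.success?, report: JSON.parse(stdout) }
end
```

A call such as verify_json("document.docx") returns a hash with the exit-status flag and the parsed report, which a CI job could then inspect. JSON.parse raises JSON::ParserError if the output is not valid JSON, so a caller may want to rescue that case.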

5. Custom Validation

Register custom validation rules for domain-specific checks:

Uniword::Validation::Rules.register(MyCustomRule)
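
The interface a rule class must implement is not described here. The sketch below therefore uses a stand-in registry (SketchRules) and a hypothetical check(document) method, purely to illustrate the register-then-run pattern; Uniword's real registry and rule interface may differ.

```ruby
# Stand-in for illustration; not Uniword::Validation::Rules.
module SketchRules
  @rules = []

  class << self
    # Remember a rule class once, even if it is registered twice.
    def register(rule_class)
      @rules << rule_class unless @rules.include?(rule_class)
    end

    # Run every registered rule and collect the messages they return.
    def run(document)
      @rules.flat_map { |rule_class| rule_class.new.check(document) }
    end
  end
end

# Hypothetical domain rule: every document needs a non-empty title.
# The document is modeled as a plain Hash for this sketch.
class TitlePresentRule
  def check(document)
    title = document[:title].to_s.strip
    title.empty? ? ["document has no title"] : []
  end
end

SketchRules.register(TitlePresentRule)
puts SketchRules.run({ title: "" }).inspect
```

Registering classes instead of instances lets the registry create a fresh rule object for each run, so rules cannot leak state between documents.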

6. Profile-Driven Round-Trip

When generating new documents or re-saving existing ones, pass a Profile to save so that the output matches the format Microsoft Word expects and does not trigger the repair dialog:

# Load a document and save with a profile
doc = Uniword::Document.open("input.docx")
profile = Uniword::Docx::Profile.load(:word_2024_en)
doc.save("output.docx", profile: profile)

# The Reconciler populates all 11 items Word checks:
# - Content types, relationship ordering
# - Settings defaults (zoom, compat, mathPr, etc.)
# - Font table entries with full metadata
# - Style definitions (docDefaults, latentStyles)
# - Web settings, app/core properties
# - Theme, tracking attributes