Guarantees and verification for perfect DOCX round-trip preservation.
1. Round-Trip Guarantee
Uniword achieves 100% round-trip fidelity for DOCX documents through complete OOXML modeling. Every OOXML element is represented as a Ruby object via lutaml-model, ensuring that load-modify-save cycles preserve all original content.
| Guarantee | Details |
|---|---|
| Content preservation | All text, formatting, and structure maintained |
| Element coverage | All 760 OOXML elements from 22 namespaces modeled |
| Namespace compliance | All required OOXML namespaces supported |
| Encoding | UTF-8 encoding maintained throughout |
| Model-driven | All XML structures represented as Ruby objects |
2. Round-Trip Example
# Load document
original = Uniword::Document.open('complex.docx')
# Modify it
original.add_paragraph("New content")
# Save back -- EVERYTHING preserved
original.save('modified.docx')
# Verify: modified.docx has ALL original content + new paragraph
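One way to check a claim like the one in the final comment is to compare the XML of a package part (for example `word/document.xml`) from both files while ignoring differences that carry no meaning. The sketch below is not part of Uniword. The helper names `canonical_form` and `xml_equivalent?` are hypothetical, and treating attribute order and whitespace-only text nodes as non-normative is an assumption made for illustration. It uses only REXML, which ships with Ruby.

```ruby
require "rexml/document"

# Reduce an element to nested arrays so that two trees can be compared
# with ==. Attribute order is ignored, and whitespace-only text nodes
# are dropped (a simplification: such nodes can be significant inside
# elements marked xml:space="preserve").
def canonical_form(element)
  attrs = []
  element.attributes.each_attribute { |a| attrs << [a.expanded_name, a.value] }

  children = element.children.filter_map do |child|
    case child
    when REXML::Element
      canonical_form(child)
    when REXML::Text
      child.value.strip.empty? ? nil : child.value
    end
  end

  [element.expanded_name, attrs.sort, children]
end

# Hypothetical helper: true when both XML strings have the same
# canonical form under the assumptions above.
def xml_equivalent?(xml_a, xml_b)
  canonical_form(REXML::Document.new(xml_a).root) ==
    canonical_form(REXML::Document.new(xml_b).root)
end
```

Extracting the part from each DOCX requires a ZIP reader, which is left out here. After the edit shown above, the two `word/document.xml` parts would be expected to differ only by the added paragraph.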
3. Verified Test Results
| Test Case | Size | Nodes | Result |
|---|---|---|---|
| ISO 8601-1:2019/Amd1 DOCX | 295 KB | - | 0 normative differences |
| ISO 690:2021 DOCX | 4.8 MB | ~130K | 0 normative differences |
| ISO DIS 5878 DOCX | 29.4 MB | 970K | 0 normative differences |
| MHTML documents | varies | - | Content preserved |
| Math equations | varies | - | Preserved via |
| Bookmarks | varies | - | ID preservation |
These results were achieved after implementing complete property coverage for paragraph, spacing, table, and section elements.
3.1. Properties Added for ISO Round-Trip Pass
The following properties were added to achieve 0 normative differences on ISO documents:
| Element | Properties |
|---|---|
| ParagraphProperties | |
| Spacing | |
| TableLayout | Converted to wrapper class with |
| TableCellProperties | |
| TableRowProperties | |
| SectionProperties | |
| PageNumbering | |
| Paragraph | |
Each Boolean property uses a BooleanElement wrapper class that correctly
round-trips the XML presence/absence pattern used by OOXML.
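The presence/absence pattern can be illustrated with a small tri-state model. In OOXML, an on/off element such as `<w:b/>` means true when present without a value, false when its `w:val` is `0`, `false`, or `off`, and "not specified" when the element is absent. The sketch below is a standalone illustration, not Uniword's `BooleanElement`. The module name `OnOffSketch` and its methods are hypothetical.

```ruby
# Hypothetical tri-state model of an OOXML on/off element.
# nil   -> element absent (value inherited from styles or defaults)
# true  -> element present, e.g. <w:b/>
# false -> element present with an explicit false value, e.g. <w:b w:val="0"/>
module OnOffSketch
  FALSE_VALUES = %w[0 false off].freeze

  # `present` says whether the element exists in the XML;
  # `val` is the content of its w:val attribute, or nil if omitted.
  def self.parse(present:, val: nil)
    return nil unless present

    !FALSE_VALUES.include?(val)
  end

  # Serialize back so that all three states survive a round trip.
  def self.serialize(tag, state)
    case state
    when nil   then ""
    when true  then "<#{tag}/>"
    when false then "<#{tag} w:val=\"0\"/>"
    end
  end
end
```

Collapsing this to a plain Ruby boolean would lose the distinction between "explicitly off" and "not set", which is the kind of difference a round-trip comparison would flag.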
4. Verification with the CLI
Use the uniword verify command to validate document structure:
# Full verification (OPC + semantic)
uniword verify document.docx
# Enable XSD schema validation (slower, thorough)
uniword verify document.docx --xsd
# Machine-readable output
uniword verify document.docx --json
The verifier runs three layers of checks:
- OPC Package: ZIP integrity, content types, relationships, part presence
- XSD Schema: XML schema validation against 40 bundled XSD schemas
- Word Document: 10 built-in semantic rules covering cross-references, styles, numbering, footnotes, headers, bookmarks, images, tables, fonts, theme, and settings
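To give a feel for the simplest part of the first layer, a part-presence check can be sketched over a list of ZIP entry names. This is not how `uniword verify` is implemented. The constant `REQUIRED_PARTS` and the helper `missing_parts` are hypothetical, and `word/document.xml` is only the conventional location of the main document part (the authoritative location comes from the package relationships).

```ruby
# Parts a typical DOCX package is expected to contain (assumed list,
# for illustration only).
REQUIRED_PARTS = [
  "[Content_Types].xml", # content type declarations for every part
  "_rels/.rels",         # package-level relationships
  "word/document.xml"    # conventional path of the main document part
].freeze

# Return the expected parts that are absent from the given entry names.
# Obtaining the entry names from a .docx file needs a ZIP reader,
# which is deliberately left out of this sketch.
def missing_parts(entry_names)
  REQUIRED_PARTS - entry_names
end
```

An empty result means all assumed parts are present. Any returned names point at the kind of structural problem the OPC layer is described as catching.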
5. Custom Validation
Register custom validation rules for domain-specific checks:
Uniword::Validation::Rules.register(MyCustomRule)
6. Profile-Driven Round-Trip
When generating new documents, provide a Profile to ensure the output matches Microsoft Word’s expected format without triggering the repair dialog:
# Load a document and save with a profile
doc = Uniword::Document.open("input.docx")
profile = Uniword::Docx::Profile.load(:word_2024_en)
doc.save("output.docx", profile: profile)
# The Reconciler populates all 11 items Word checks:
# - Content types, relationship ordering
# - Settings defaults (zoom, compat, mathPr, etc.)
# - Font table entries with full metadata
# - Style definitions (docDefaults, latentStyles)
# - Web settings, app/core properties
# - Theme, tracking attributes