Uniword uses a pure object-oriented approach: each XML file in the DOCX ZIP package is represented by a dedicated lutaml-model class. Parts are loaded into typed objects and written back from those same objects, which avoids hand-written serialization and deserialization code and is intended to give round-trip fidelity for every modeled part.

1. Package Class

The Docx::Package class is the top-level container for all parts of a DOCX file:

class Package < Lutaml::Model::Serializable
  # Metadata (fully modeled)
  attribute :core_properties, CoreProperties      # docProps/core.xml
  attribute :app_properties, AppProperties        # docProps/app.xml

  # Theme (fully modeled)
  attribute :theme, Theme                         # word/theme/theme1.xml

  # Document content (in progress)
  attribute :document, Document                   # word/document.xml
  attribute :styles, StylesConfiguration          # word/styles.xml
  # ... other parts

  def self.from_file(path)
    # Load DOCX and deserialize all parts
  end

  def to_file(path)
    # Serialize all parts and package as DOCX
  end
end

2. Key Attributes

Each attribute maps directly to an XML part inside the DOCX ZIP:

Attribute         XML Part                 Description
core_properties   docProps/core.xml        Dublin Core metadata (title, author, dates)
app_properties    docProps/app.xml         Application metadata (pages, words, characters)
theme             word/theme/theme1.xml    Theme definition (colors, fonts, formatting)
document          word/document.xml        Main document body (paragraphs, tables, sections)
styles            word/styles.xml          Style definitions (paragraph, character, table)
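
The attribute-to-part mapping in the table above can be pictured as a plain lookup. The sketch below is illustrative only: `PART_PATHS`, `StubPart`, and `load_parts` are hypothetical names, a Hash stands in for the ZIP archive, and none of this is Uniword's actual loader.

```ruby
# Hypothetical lookup mirroring the table above: attribute name => part path.
PART_PATHS = {
  core_properties: "docProps/core.xml",
  app_properties:  "docProps/app.xml",
  theme:           "word/theme/theme1.xml",
  document:        "word/document.xml",
  styles:          "word/styles.xml"
}.freeze

# Stand-in for a model class; the real parts are lutaml-model classes.
StubPart = Struct.new(:path, :xml)

# Build one stub per known part from an in-memory archive (path => XML string).
# Parts missing from the archive are skipped rather than invented.
def load_parts(entries)
  PART_PATHS.each_with_object({}) do |(attr, path), parts|
    parts[attr] = StubPart.new(path, entries[path]) if entries.key?(path)
  end
end

entries = {
  "word/document.xml" => "<w:document/>",
  "word/styles.xml"   => "<w:styles/>"
}
parts = load_parts(entries)
puts parts.keys.inspect # prints [:document, :styles]
```

Keeping the paths in one table means the loader never names an individual part, which is the same property the per-part attributes give the real Package class.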

3. Loading and Saving

# Load an existing DOCX file
package = Docx::Package.from_file('document.docx')

# Access document content
paragraphs = package.document.body.paragraphs

# Modify content
package.document.body.add_paragraph("New paragraph")

# Save back to file
package.to_file('modified.docx')

3.1. Profile-Driven Saving

Assign a Profile to the package before saving to populate DOCX parts with the content Word expects:

profile = Uniword::Docx::Profile.load(:word_2024_en)
package.profile = profile
package.to_file('output.docx')
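
One way to picture profile-driven saving is as default-filling: before serialization, any part the package has not set is taken from the profile. The sketch below assumes that behavior. `SketchProfile`, `SketchPackage`, and `apply_profile` are hypothetical stand-ins, not Uniword classes, and the real Profile may work differently.

```ruby
# Hypothetical stand-ins; these are not Uniword's Profile or Package classes.
SketchProfile = Struct.new(:defaults)

SketchPackage = Struct.new(:core_properties, :app_properties, :profile) do
  # Fill only the parts that are still nil, so explicitly set values always win.
  def apply_profile
    return self unless profile

    profile.defaults.each do |part, value|
      self[part] = value if self[part].nil?
    end
    self
  end
end

# Illustrative values only.
profile = SketchProfile.new({ app_properties: { application: "Microsoft Office Word" },
                              core_properties: { title: "Untitled" } })
package = SketchPackage.new({ title: "Quarterly report" }, nil, profile)
package.apply_profile

puts package.core_properties[:title]       # prints Quarterly report
puts package.app_properties[:application]  # prints Microsoft Office Word
```

In this sketch the profile never overwrites content the caller supplied; it only fills gaps.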

4. Module Architecture

The Package class is decomposed into focused modules following the single responsibility principle:

PackageDefaults

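Factory methods for constructing minimal DOCX packages (minimal_content_types, minimal_package_rels, minimal_document_rels). Provides the DOCUMENT_TO_PACKAGE_MAPPINGS hash, which drives the copying of document parts into the package, so supporting a new part means adding a mapping entry rather than changing the copy logic (the open/closed principle).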

PackageSerialization

Content type injection and XML serialization for all package parts. Handles image, chart, header, footer, footnote, bibliography, and custom XML injection. Separates the "what to write" phase from the "how to write it" phase.

This decomposition keeps Package itself focused on loading and delegation (~390 lines), while each module handles one responsibility.
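
A minimal sketch of this kind of decomposition, using plain Ruby modules. Every name here (`SketchDefaults`, `SketchSerialization`, `MiniPackage`, `SketchDocument`, `SKETCH_MAPPINGS`, `copy_document_parts`, `planned_parts`) is hypothetical, and the mapping entries are invented for illustration; `SKETCH_MAPPINGS` only stands in for the role DOCUMENT_TO_PACKAGE_MAPPINGS plays.

```ruby
# Hypothetical sketch of the decomposition; names and bodies are illustrative.
module SketchDefaults
  # Document attribute => package attribute. Supporting a new part means
  # adding an entry here, not editing copy_document_parts.
  SKETCH_MAPPINGS = {
    styles_configuration: :styles,
    theme: :theme
  }.freeze

  # Copy every mapped part that the document actually carries.
  def copy_document_parts(document)
    SKETCH_MAPPINGS.each do |doc_attr, pkg_attr|
      value = document.public_send(doc_attr)
      public_send("#{pkg_attr}=", value) unless value.nil?
    end
  end
end

module SketchSerialization
  # "What to write": decide which paths get which content, with no I/O.
  # A separate step (not shown) would handle "how to write it".
  def planned_parts
    plan = {}
    plan["word/styles.xml"] = styles if styles
    plan["word/theme/theme1.xml"] = theme if theme
    plan
  end
end

# The package class itself only wires the modules together.
class MiniPackage
  include SketchDefaults
  include SketchSerialization
  attr_accessor :styles, :theme
end

SketchDocument = Struct.new(:styles_configuration, :theme)

package = MiniPackage.new
package.copy_document_parts(SketchDocument.new("<w:styles/>", nil))
puts package.planned_parts.keys.inspect # prints ["word/styles.xml"]
```

Because `planned_parts` returns plain data, the "what to write" decision can be checked in a test without touching the file system.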

5. Benefits

Zero hardcoding

All XML generation is handled by lutaml-model, not string concatenation.

Type safety

Strong typing for all attributes means type errors are caught early.

Perfect round-trip

Because each part is loaded into a model and written back from that same model, content in fully modeled parts is preserved across load/save cycles.

Easy testing

Each model class is independently testable without needing a full DOCX package.

Maintainability

Changes to OOXML handling are isolated to model definitions, not scattered across the codebase.