Uniword handles large documents efficiently through lazy loading, streaming parsers, efficient XML serialization, and optimized ZIP handling.

1. Performance Features

Lazy loading

The autoload strategy ensures that a class is loaded only when it is first referenced. Opening a simple document therefore does not load all 760 element classes.
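The mechanism behind this is Ruby's built-in Module#autoload. The following is a minimal sketch of that mechanism only, not Uniword's actual file layout: the DemoElements namespace, the class names, and both helper methods are hypothetical and exist purely for illustration.

```ruby
require "tmpdir"

# Hypothetical namespace standing in for a library with many element classes.
module DemoElements; end

# Write each class to its own file and register it with autoload.
# Nothing is required yet; Ruby only records the constant-to-path mapping.
def register_demo_autoloads(dir, names)
  names.each do |name|
    path = File.join(dir, "#{name.downcase}.rb")
    File.write(path, "module DemoElements\n  class #{name}; end\nend\n")
    DemoElements.autoload(name.to_sym, path)
  end
end

# Return the constants that are registered but whose files are still unloaded.
# Module#autoload? returns the path while pending and nil once loaded.
def pending_autoloads(mod, names)
  names.select { |const| mod.autoload?(const) }
end

DEMO_DIR = Dir.mktmpdir("autoload-demo")
register_demo_autoloads(DEMO_DIR, %w[Paragraph Chart Formula])

# The first reference to a constant loads that one file and no others.
LOADED_CLASS = DemoElements::Paragraph
```

After the last line, pending_autoloads(DemoElements, %i[Paragraph Chart Formula]) still reports Chart and Formula, because nothing has referenced them.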

Streaming parsers

Large files are processed using streaming XML parsers that do not load the entire document into memory at once.
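To illustrate what streaming means here, the sketch below counts paragraph elements with REXML's stream listener, which receives events as the parser reads and never builds a document tree. REXML is used only as a readily available example; the text above does not say which streaming parser Uniword uses internally, and ParagraphCounter, count_paragraphs, and SAMPLE_XML are hypothetical names.

```ruby
require "rexml/document"
require "rexml/streamlistener"
require "stringio"

# Counts <w:p> elements as the parser emits them; no tree is retained.
class ParagraphCounter
  include REXML::StreamListener

  attr_reader :count

  def initialize
    @count = 0
  end

  # Called once per opening tag with the qualified element name.
  def tag_start(name, _attrs)
    @count += 1 if name == "w:p"
  end
end

# Accepts any IO-like source, so a file handle works as well as a StringIO.
def count_paragraphs(io)
  listener = ParagraphCounter.new
  REXML::Document.parse_stream(io, listener)
  listener.count
end

# Tiny WordprocessingML fragment used as sample input.
SAMPLE_XML = <<~XML
  <w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
    <w:body>
      <w:p><w:r><w:t>First</w:t></w:r></w:p>
      <w:p><w:r><w:t>Second</w:t></w:r></w:p>
    </w:body>
  </w:document>
XML

PARAGRAPH_COUNT = count_paragraphs(StringIO.new(SAMPLE_XML))
```

Because the listener keeps only a counter, memory use stays flat regardless of how many paragraphs the input contains.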

Efficient XML serialization

Lutaml-model provides optimized XML serialization that minimizes object allocation and string operations.

Optimized ZIP handling

The rubyzip-based ZIP handler reads and writes DOCX packages efficiently, processing parts on demand.

2. Tips for Large Documents

For documents with thousands of paragraphs or large embedded images:

  • When only reading, use Document.open rather than loading the entire document into memory

  • Process paragraphs in batches rather than materializing the entire collection

  • Consider splitting very large documents into sections

# Efficient document processing
doc = Uniword.load('large-document.docx')

# Process paragraphs without creating intermediate arrays
doc.paragraphs.each do |para|
  process(para)
end

# Save with optimized serialization
doc.save('output.docx')
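The batching tip can be sketched with Enumerable#each_slice. This assumes the paragraph collection is Enumerable, which the each call above suggests but does not guarantee; DemoParagraph and word_counts_in_batches are hypothetical stand-ins, not part of Uniword.

```ruby
# Hypothetical stand-in for a paragraph object; only #text is assumed.
DemoParagraph = Struct.new(:text)

# Yields one word-count total per batch, so at most `batch_size`
# paragraphs are held in the intermediate array at any time.
def word_counts_in_batches(paragraphs, batch_size: 100)
  paragraphs.each_slice(batch_size) do |batch|
    yield batch.sum { |para| para.text.split.size }
  end
end

# Sample data: 250 paragraphs of three words each.
DEMO_PARAGRAPHS = Array.new(250) { |i| DemoParagraph.new("paragraph number #{i}") }

BATCH_TOTALS = []
word_counts_in_batches(DEMO_PARAGRAPHS, batch_size: 100) { |total| BATCH_TOTALS << total }
```

With 250 sample paragraphs and a batch size of 100, the block runs three times, and the last batch holds the remaining 50 paragraphs.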

3. Autoload Benefits

Autoloading roughly 95% of the library's classes provides measurable startup improvements:

  • 90% fewer classes loaded at startup compared to eager loading

  • Memory footprint scales with actual document complexity

  • Unused namespaces (math, charts, presentations) stay unloaded

4. See Also