Uniword is optimized for performance with large documents through lazy loading, efficient serialization, and optimized ZIP handling.
1. Performance Features
- Lazy loading: The autoload strategy ensures that only needed classes are loaded. Opening a simple document does not load the 760 element classes.
- Streaming parsers: Large files are processed using streaming XML parsers that do not load the entire document into memory at once.
- Efficient XML serialization: Lutaml-model provides optimized XML serialization that minimizes object allocation and string operations.
- Optimized ZIP handling: The rubyzip-based ZIP handler reads and writes DOCX packages efficiently, processing parts on demand.
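To make the streaming idea concrete, here is a minimal sketch that counts paragraph elements by pulling parser events one at a time instead of building a document tree. It assumes REXML's pull parser purely for illustration; the source does not say which streaming parser Uniword uses, and `SAMPLE_XML` and `count_paragraphs` are hypothetical names, not part of Uniword.

```ruby
require "rexml/parsers/pullparser"

# Hypothetical, tiny stand-in for the main part of a DOCX package.
SAMPLE_XML = <<~XML
  <w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
    <w:body>
      <w:p><w:r><w:t>First</w:t></w:r></w:p>
      <w:p><w:r><w:t>Second</w:t></w:r></w:p>
    </w:body>
  </w:document>
XML

# Counts <w:p> start tags by consuming parser events one by one.
# No element tree is built, so only the current event is held at a time.
def count_paragraphs(source)
  parser = REXML::Parsers::PullParser.new(source)
  count = 0
  while parser.has_next?
    event = parser.pull
    # For a start-element event, index 0 is the qualified tag name.
    count += 1 if event.start_element? && event[0] == "w:p"
  end
  count
end

PARAGRAPH_COUNT = count_paragraphs(SAMPLE_XML)
puts "paragraphs: #{PARAGRAPH_COUNT}"
```

The same loop shape applies to any event-based parser: the handler reacts to each element as it arrives and keeps only the state it needs (here, a single counter).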
2. Tips for Large Documents
For documents with thousands of paragraphs or large embedded images:
- Use Document.open instead of loading the entire document into memory when only reading
- Process paragraphs in batches rather than materializing the entire collection
- Consider splitting very large documents into sections
# Efficient document processing
doc = Uniword.load('large-document.docx')
# Process paragraphs without creating intermediate arrays
doc.paragraphs.each do |para|
  process(para)
end
# Save with optimized serialization
doc.save('output.docx')
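The batching tip above can be sketched with Ruby's standard `Enumerable#each_slice`. This is an illustration under stated assumptions, not Uniword's API: `sample_paragraphs` is a hypothetical stand-in for `doc.paragraphs`, and `process_in_batches` is an invented helper name.

```ruby
# Hypothetical stand-in for doc.paragraphs: yields items one at a time
# rather than returning a fully built array.
def sample_paragraphs(total)
  Enumerator.new do |yielder|
    1.upto(total) { |i| yielder << "Paragraph #{i}" }
  end
end

# Walks the collection in fixed-size batches and records each batch size.
# Only one batch array is alive at any point in the loop.
def process_in_batches(paragraphs, batch_size)
  sizes = []
  paragraphs.each_slice(batch_size) do |batch|
    # Real per-batch work (indexing, export, etc.) would go here.
    sizes << batch.size
  end
  sizes
end

BATCH_SIZES = process_in_batches(sample_paragraphs(10), 4)
puts BATCH_SIZES.inspect
```

Whether this lowers peak memory depends on the collection itself: if the paragraphs are already held in an in-memory array, batching only bounds the intermediate work done per step.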
3. Autoload Benefits
The 95% autoload strategy provides measurable startup improvements:
- 90% fewer classes loaded at startup compared to eager loading
- Memory footprint scales with actual document complexity
- Unused namespaces (math, charts, presentations) stay unloaded
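The mechanism behind these numbers is Ruby's built-in `Module#autoload`, which registers a file for a constant and requires it only on first reference. The sketch below shows that behavior in isolation; `DemoDoc` and `MathElement` are hypothetical names chosen for the example and do not reflect Uniword's actual module layout.

```ruby
require "tmpdir"

# Hypothetical namespace standing in for a library's top-level module.
module DemoDoc
end

# Registers an autoload for DemoDoc::MathElement, then checks whether the
# constant is still pending before and after its first reference.
def autoload_demo
  Dir.mktmpdir do |dir|
    path = File.join(dir, "math_element.rb")
    File.write(path, "module DemoDoc\n  class MathElement; end\nend\n")

    DemoDoc.autoload(:MathElement, path)
    pending_before = DemoDoc.autoload?(:MathElement) # file path while unloaded

    klass = DemoDoc::MathElement                     # first reference loads the file
    pending_after = DemoDoc.autoload?(:MathElement)  # nil once loaded

    { before: pending_before, after: pending_after, name: klass.name }
  end
end

AUTOLOAD_RESULT = autoload_demo
puts "pending before first use: #{!AUTOLOAD_RESULT[:before].nil?}"
puts "pending after first use:  #{!AUTOLOAD_RESULT[:after].nil?}"
```

A constant that is never referenced stays in the pending state for the life of the process, which is why namespaces a document does not touch add no load cost.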
4. See Also
- Autoload Strategy: details on autoload coverage
- Architecture: how layers minimize overhead