Uniword supports processing multiple documents in batch, including bulk conversion, validation, and transformation workflows.
1. Batch Conversion
Convert multiple DOCX files to MHTML (or vice versa):
require 'uniword'

Dir.glob('documents/*.docx').each do |path|
  doc = Uniword.load(path)
  # Anchor the pattern so only the trailing extension is replaced
  output = path.sub(/\.docx\z/, '.mhtml')
  doc.save(output)
  puts "Converted: #{path} -> #{output}"
rescue Uniword::Error => e
  # Report the failure and continue with the next file
  warn "Failed: #{path} - #{e.message}"
end
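The loop above writes each converted file next to its source. If the output should go to a separate directory instead, the target path can be derived from the input path. The following is a minimal sketch; `converted_path` and the `converted` directory are hypothetical names used for illustration and are not part of Uniword.

```ruby
# Hypothetical helper (not part of Uniword): build the output path for a
# converted file inside a separate directory.
def converted_path(input_path, out_dir, new_ext = '.mhtml')
  # Strip the directory and the existing extension, e.g. "report"
  base = File.basename(input_path, File.extname(input_path))
  File.join(out_dir, base + new_ext)
end

puts converted_path('documents/report.docx', 'converted')
```

Inside the conversion loop, the result could be passed to `doc.save` in place of the `path.sub(...)` value, after creating the directory with `FileUtils.mkdir_p`.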
2. Batch Verification
Verify multiple documents and collect results:
require 'uniword'
require 'json'

results = {}

Dir.glob('documents/*.docx').each do |path|
  result = Uniword::Verification.verify(path)
  results[path] = {
    valid: result.valid?,
    errors: result.issues.count { |i| i.severity == :error },
    warnings: result.issues.count { |i| i.severity == :warning }
  }
end

puts JSON.pretty_generate(results)
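For a large collection, a short summary is often easier to read than the full JSON. The sketch below aggregates a hash shaped like the `results` hash built above; `summarize` is a hypothetical helper, and the sample data is made up for illustration.

```ruby
# Hypothetical helper: reduce per-file verification results to totals.
# Expects { path => { valid: Boolean, errors: Integer, warnings: Integer } }.
def summarize(results)
  failed = results.reject { |_path, r| r[:valid] }.keys
  {
    total: results.size,
    failed: failed,
    errors: results.sum { |_path, r| r[:errors] },
    warnings: results.sum { |_path, r| r[:warnings] }
  }
end

# Made-up sample data in the same shape as the collected results
sample = {
  'documents/a.docx' => { valid: true, errors: 0, warnings: 2 },
  'documents/b.docx' => { valid: false, errors: 3, warnings: 1 }
}

summary = summarize(sample)
puts "#{summary[:total]} checked, #{summary[:failed].size} failed"
```

In a CI job, the `failed` list could be used to exit with a non-zero status when any document is invalid.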
3. Parallel Processing
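For large document collections, use Ruby threads to process several files concurrently. On CRuby, the Global VM Lock allows only one thread to execute Ruby code at a time, so threads mainly help by overlapping file I/O; CPU-bound work sees limited speedup.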
require 'uniword'

paths = Dir.glob('documents/*.docx')
mutex = Mutex.new
results = []

paths.each_slice(4) do |batch|
  threads = batch.map do |path|
    Thread.new(path) do |p|
      result = Uniword::Verification.verify(p)
      mutex.synchronize { results << [p, result.valid?] }
    end
  end
  threads.each(&:join)
end

results.each { |path, valid| puts "#{path}: #{valid ? 'OK' : 'FAILED'}" }
Note: Uniword’s thread safety depends on lutaml-model’s thread safety. Each thread should create its own document instances rather than sharing them.
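The slice-based loop above waits for the slowest file in each group of four before starting the next group. An alternative is a fixed pool of worker threads pulling from a shared queue, which keeps all workers busy. The following is a sketch of that technique using Ruby's standard `Queue`; `process_in_pool` is a hypothetical helper, not a Uniword API, and it stores any exception as the result for that job instead of aborting the run.

```ruby
# Hypothetical helper: run each job through a fixed number of worker threads.
# Returns an array of [job, result] pairs in completion order; if the block
# raises, the exception object is stored as the result for that job.
def process_in_pool(jobs, workers: 4, &block)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue.close # once closed and drained, pop returns nil and workers stop

  results = Queue.new # Queue is thread-safe, so no Mutex is needed
  threads = Array.new(workers) do
    Thread.new do
      while (job = queue.pop)
        begin
          results << [job, block.call(job)]
        rescue StandardError => e
          results << [job, e]
        end
      end
    end
  end
  threads.each(&:join)

  Array.new(results.size) { results.pop }
end

# Assumed usage with the verification call shown earlier:
#   process_in_pool(Dir.glob('documents/*.docx')) do |path|
#     Uniword::Verification.verify(path).valid?
#   end
```

Each job is handled entirely within one thread, which matches the advice to avoid sharing document instances across threads.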
4. CLI Batch Processing
Use shell scripting for simple batch operations:
# Verify all DOCX files in a directory
mkdir -p reports  # make sure the report directory exists
for f in documents/*.docx; do
  echo "Verifying: $f"
  uniword verify "$f" --json > "reports/$(basename "$f" .docx).json"
done
# Convert all DOCX files to MHTML
for f in documents/*.docx; do
  # Pass paths as arguments so quotes in file names cannot break the Ruby code
  ruby -runiword -e 'Uniword.load(ARGV[0]).save(ARGV[1])' "$f" "${f%.docx}.mhtml"
done