Uniword supports processing multiple documents in batch, including bulk conversion, validation, and transformation workflows.

1. Batch Conversion

Convert multiple DOCX files to MHTML (or vice versa):

require 'uniword'

Dir.glob('documents/*.docx').each do |path|
  doc = Uniword.load(path)
  # Anchor the match to the end of the path so only the extension is replaced
  output = path.sub(/\.docx\z/, '.mhtml')
  doc.save(output)
  puts "Converted: #{path} -> #{output}"
rescue Uniword::Error => e
  warn "Failed: #{path} - #{e.message}"
end
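The loop above writes each converted file next to its source. If the output should go to a separate directory instead, the path handling could be sketched as follows. `converted_path` and the `converted` directory name are hypothetical and not part of Uniword; only Ruby's standard `File` methods are used.

```ruby
# Hypothetical helper (not part of Uniword): build an output path in out_dir
# with the source file's extension replaced by new_ext.
def converted_path(path, out_dir, new_ext = '.mhtml')
  base = File.basename(path, File.extname(path))
  File.join(out_dir, base + new_ext)
end

converted_path('documents/report.docx', 'converted')
# => "converted/report.mhtml"
```

Inside the conversion loop this would replace the `path.sub` line, for example `doc.save(converted_path(path, 'converted'))`. The target directory has to exist before saving, for instance via `FileUtils.mkdir_p('converted')`.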

2. Batch Verification

Verify multiple documents and collect results:

require 'uniword'
require 'json'

results = {}
Dir.glob('documents/*.docx').each do |path|
  result = Uniword::Verification.verify(path)
  results[path] = {
    valid: result.valid?,
    errors: result.issues.count { |i| i.severity == :error },
    warnings: result.issues.count { |i| i.severity == :warning }
  }
end

puts JSON.pretty_generate(results)
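For use in a CI job or a script, the per-file results are often reduced to a single summary. The following is a minimal sketch that assumes a hash shaped like `results` above; `summarize` is a hypothetical helper, not a Uniword API, and the sample data is made up for illustration.

```ruby
# Hypothetical helper: reduce a results hash of the form
#   { path => { valid: true/false, errors: Integer, warnings: Integer } }
# to overall totals and a list of failing paths.
def summarize(results)
  failed = results.reject { |_path, r| r[:valid] }.keys
  {
    total: results.size,
    failed: failed,
    errors: results.sum { |_path, r| r[:errors] },
    warnings: results.sum { |_path, r| r[:warnings] }
  }
end

# Illustrative data only; real values come from the verification loop.
sample = {
  'documents/a.docx' => { valid: true,  errors: 0, warnings: 2 },
  'documents/b.docx' => { valid: false, errors: 3, warnings: 1 }
}
summary = summarize(sample)
warn "#{summary[:failed].size} of #{summary[:total]} documents failed verification"
```

A script could then finish with `exit 1 unless summary[:failed].empty?` so that a failing document fails the surrounding job.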

3. Parallel Processing

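For large document collections, Ruby threads can process several files concurrently. On CRuby (MRI), the global VM lock allows only one thread to run Ruby code at a time, so threads mostly speed up I/O-bound steps; CPU-bound work generally needs separate processes for true parallelism. A threaded version looks like this: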

require 'uniword'

paths = Dir.glob('documents/*.docx')
mutex = Mutex.new
results = []

# Run four threads at a time; each batch waits for its slowest file to finish
paths.each_slice(4) do |batch|
  threads = batch.map do |path|
    Thread.new(path) do |p|
      result = Uniword::Verification.verify(p)
      mutex.synchronize { results << [p, result.valid?] }
    end
  end
  threads.each(&:join)
end

results.each { |path, valid| puts "#{path}: #{valid ? 'OK' : 'FAILED'}" }

Uniword’s thread safety depends on lutaml-model’s thread safety. Each thread should create its own document instances rather than sharing them.
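An alternative to fixed slices is a small worker pool fed from a queue, so that a slow document does not hold up the rest of its batch. The sketch below uses only Ruby's built-in `Queue` and `Thread`; `process_in_pool` is a hypothetical helper, not part of Uniword, and the per-file work is passed in as a block so that each thread creates its own document instances.

```ruby
# Hypothetical helper: run the block once per item on a fixed pool of threads.
# Returns [item, value] pairs in completion order. An exception raised by the
# block is stored as the value, so one bad file does not abort the whole run.
def process_in_pool(items, workers: 4, &block)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # once drained, a closed queue returns nil from pop

  results = Queue.new # Queue is thread-safe, so no Mutex is needed
  threads = Array.new(workers) do
    Thread.new do
      while (item = queue.pop)
        value = begin
          block.call(item)
        rescue StandardError => e
          e
        end
        results << [item, value]
      end
    end
  end
  threads.each(&:join)
  Array.new(results.size) { results.pop }
end

# Stand-in workload; completion order is not guaranteed, so sort before use.
demo = process_in_pool(%w[a bb ccc], workers: 2) { |s| s.length }.sort
```

With Uniword loaded, the verification example above would become `process_in_pool(paths) { |p| Uniword::Verification.verify(p).valid? }`. Entries whose value is an exception can then be reported separately from documents that simply failed verification.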

4. CLI Batch Processing

Use shell scripting for simple batch operations:

# Verify all DOCX files in a directory
mkdir -p reports  # the redirect below fails if reports/ does not exist
for f in documents/*.docx; do
  echo "Verifying: $f"
  uniword verify "$f" --json > "reports/$(basename "$f" .docx).json"
done

# Convert all DOCX files to MHTML
for f in documents/*.docx; do
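  # Pass paths as arguments so quotes or spaces in file names cannot break the Ruby code
  ruby -runiword -e 'Uniword.load(ARGV[0]).save(ARGV[1])' "$f" "${f%.docx}.mhtml"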
done