Document analysis

Cut the PDF

Task: Build a modular system that takes a PDF of a scanned journal, extracts pictorial and textual data, performs an analysis of the various data types, and saves the results for later statistical analysis.
Preferred tools: Python or Matlab