Task: Build a modular system that takes a PDF of a scanned journal, extracts pictorial and textual data, performs an analysis of the various data types, and saves the results for later statistical analysis. Preferred tools: Python or Matlab