Multimedia information extraction and digital heritage preservation