Steps toward large-scale data integration in the sciences