Extending the scope of corpus-based research