SMS: Multilingual corpora and multilingual corpus analysis