Just as we use machine translation technology to bring speed, consistency, and cost control to large translation projects, language historians are leveraging a program built by computer scientists to help crack some of the big questions about the origins of present-day languages.
Operating on the theory that vocabulary evolves like branches growing from a central trunk, the program uses an algorithm to analyze contemporary Southeast Asian languages such as Javanese, Malay, and Tagalog to find commonalities thousands of years old. This approach essentially scales the brains of historical linguists, transforming what was previously a much slower human process into a high-speed search for proto-languages. Results so far have been impressive: the program's reconstructions compare favorably with manual reconstruction 85 percent of the time.
The human element is still important, though. Certain analytical aspects remain beyond the realm of the computer, such as the impact of sociological and geographical elements. As Leslie Katz details in the original article, what turns “cat” into “kitty cat” can’t quite be nailed down by the machine yet.
Dan Klein, of the computer science department at UC Berkeley, likens the relationship between linguists and this automated historical research to the coexistence of astronomers and computerized telescopes. The program is a tool, but without a human at the helm, the meaning may be lost. We find this is also the case with the machine translation projects we work on for clients at Acclaro. Machine translation with human editorial oversight before and after the automated step still hits the sweet spot between quality and automation.
While the program has (so far) looked only at the origins of languages, much like those on the Rosetta Stone pictured above, there’s hope that it might one day help us glimpse the future. Who knows, maybe one day we’ll be able to localize our time capsules for the English of the next millennium.
Photo attribution: bortescristian