Twitter Gets Help from SLU Prof on How to Deal With Indigenous Tweeters

If you're one of the five remaining speakers of "Yuchi" -- a near-extinct Native American language in Oklahoma -- your tweets will look insane, even to those within your linguistic group.

That's because the "@" character is part of your alphabet, so whenever you type it in, Twitter will wrongly think you're using Twitterese to refer to a different user, such as @Joe_Smith. 

This is the kind of programming problem that Twitter is coming across more and more as it tries to make inroads where minority languages hold sway. And it's exactly the kind of problem that a computational linguist such as Professor Kevin Scannell of Saint Louis University is equipped to solve.
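To make the "@" problem concrete, here is a minimal sketch in Python. It is not Twitter's actual parser: the two patterns and the helper `find_mentions` are illustrative assumptions, and the token `k@la` is a made-up stand-in for a word with "@" inside it, not real Yuchi.

```python
import re

# Naive pattern (an assumption, not Twitter's code): any "@" followed by
# word characters is treated as a mention.
NAIVE_MENTION = re.compile(r"@(\w+)")

# Slightly stricter pattern: the "@" must not directly follow a word character.
BOUNDED_MENTION = re.compile(r"(?<!\w)@(\w+)")


def find_mentions(text, pattern):
    """Return the user names the given pattern thinks are mentioned in text."""
    return pattern.findall(text)


# "k@la" is a hypothetical word containing "@" as a letter.
sample = "k@la hello @Joe_Smith"

naive = find_mentions(sample, NAIVE_MENTION)      # ['la', 'Joe_Smith']
bounded = find_mentions(sample, BOUNDED_MENTION)  # ['Joe_Smith']
```

The stricter pattern stops treating "@" in the middle of a word as a mention, but a word that begins with "@" would still be misread, so a regular expression alone does not solve the problem.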

Since October, Scannell -- on sabbatical from SLU's Department of Math and Computer Science -- has been flying out to Twitter's headquarters in San Francisco one week per month to consult with their international team on stuff like this. Or how about this one:

When folks label their tweets with "hashtags," they type "#" then add text that flows to the right, as in #OccupyWallStreet. But what about Arabic, which flows in the opposite direction? Or what if, in the middle of a tweet in Arabic, the user wants to write "Hillary Clinton"?

Scannell was a member of the Twitter team that rewrote the code to handle such linguistic miscegenation.
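As a rough illustration of the mixed-direction problem (and not the code Twitter's team wrote), the sketch below uses Python's standard `unicodedata` module to find where the writing direction flips inside a string. The helper names `strong_direction`, `direction_runs` and `is_mixed`, and the sample tweet text, are assumptions made for this example.

```python
import unicodedata


def strong_direction(ch):
    """Return 'RTL', 'LTR', or None for a character with no strong direction."""
    bidi = unicodedata.bidirectional(ch)
    if bidi in ("R", "AL"):   # Hebrew-style and Arabic-style letters
        return "RTL"
    if bidi == "L":           # Latin and other left-to-right letters
        return "LTR"
    return None               # digits, spaces, punctuation such as "#"


def direction_runs(text):
    """List the sequence of strong directions, collapsing repeats."""
    runs = []
    for ch in text:
        direction = strong_direction(ch)
        if direction is None:
            continue
        if not runs or runs[-1] != direction:
            runs.append(direction)
    return runs


def is_mixed(text):
    """True when the text contains both right-to-left and left-to-right letters."""
    return len(set(direction_runs(text))) > 1


# A hashtag in Arabic letters (written with escapes) followed by an English name.
sample = "#\u0645\u0635\u0631 Hillary Clinton"
runs = direction_runs(sample)   # ['RTL', 'LTR']
```

Note that "#" itself has no strong direction, which is part of why deciding which side of a right-to-left word it belongs on takes extra care.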

These people tweet in Haitian Creole
The California web company discovered Scannell last summer when they became aware of his pet project, a website called "Indigenous Tweets." The site uses automated processes to trawl the vast ocean of Twitter for obscure tongues. It then groups those users together and tracks their usage.

About a week ago, Scannell was surprised to see his site mentioned by The Economist.

Daily RFT called him and wanted to know: Why did he start the site in the first place?

"There was a personal aspect to the work," says Scannell, who in addition to his native English also speaks Gaelic, used by only 20,000 people or so in western Ireland. "One of the things we've been encouraging in Irish is the use of social media, but on Twitter, we were having trouble finding other speakers. So this was me personally trying to find other people who spoke my language. Then, that approach to Irish we took to other languages."

Scannell's site is now tracking 129 indigenous languages on Twitter. The five most common, by number of users, are:

1) Haitian Creole (14,259 users)
2) Basque (7,063 users)
3) Welsh (4,808 users)
4) Irish Gaelic (2,712 users)
5) Frisian (2,034 users)

Of course, at the other end of the list are 28 languages with only one lonely tweeter, such as Gamilaraay (in southeastern Australia) and Wayuunaikai (in northeastern Colombia).

But Scannell says there are plenty of indigenous languages on Twitter he hasn't even tracked yet, including Yuchi, the language we mentioned first that uses the "@" in its alphabet. (Yuchi does boast at least one tweeter.)

"I mentioned [Yuchi] to the people at Twitter yesterday," says Scannell, who has just returned from a trip out west. "I jokingly said they should change the way they do user names just to accommodate the Yuchi community."

He concludes: "I don't think they're gonna do it."

2 comments
Guest

There are more than 20,000 Irish speakers in Ireland. Also, it is not confined to the west. There are Gaeltacht (Irish speaking regions) in the North, South, East and West of Ireland as well as many speakers in more urban areas.

Kevin has done some great work for the Irish language and for many other minority languages.

Reader

Did you mean to write miscegenation?
