Achieving Auto-Pronounce

One of the joys I’ve found in the last couple of days working on our dictionary project at YorubaName.com is realizing that I could be contributing significantly to the future of African language use on the web. Last week, I finally achieved a breakthrough on something that had worried me since work on the dictionary project started: how do we get each word or name in our dictionary pronounced without breaking the bank or spending too much time? From my experience of using small dictionary apps on mobile phones, I have always known that it was not feasible to pronounce ALL the names. There would have to be a way to use technology to make the process smoother and less exacting.

 

 

 

I found that way during the week, and just wrote a blog post about it on the YorubaName blog page where I’m now spending most of my online time. Read it here.

Two crucial factors made this possible, working in collaboration: my knowledge of Yoruba phonology and my partner’s intimate knowledge of software programming. I see a future in which linguists from other African language groups will collaborate with software geeks to create more projects in this direction. In my case, I can’t wait to see this used for a lexical Yoruba dictionary as well in the future.

 

 

Kola Tubosun joins “WE THE HUMANITIES” Platform as Curator

PRESS RELEASE

25/02/2015, Lagos, Nigeria—Kola Tubosun: linguist, teacher and writer at Whitesands School, Lagos, Nigeria, will be joining the international humanities Twitter account, We the Humanities, as next week’s guest curator.

@wethehumanities is a rotation-curation account which offers a central platform for discussion and news of the humanities in all its forms. It is open to anyone working in or with the humanities in any form, and hopes to follow the success achieved by the science platform, @realscientists.

Kola is a linguist and aspiring lexicographer with years of work in language teaching and language documentation under his belt. He will be tweeting about mother tongue use, language endangerment, and particularly a subject that is dear to him: Yoruba language use in Nigeria. He is currently building a multimedia dictionary of Yoruba names at www.yorubaname.com. Kola can be found on Twitter at @baroka, and on his blog at KTravula.com.

Co-founder Jessica Sage (@academicjess) comments: “The We the Humanities project engages with people from around the world, exposing them to humanities research, experience and ideas they perhaps didn’t know existed.  Each week a new academic or practitioner takes over the account, tweeting about their work and provoking conversations about the diversity and importance of the humanities.  We really look forward to Kola Tubosun running the account this coming week, and in particular his take on linguistics, second language use, and Yoruba language.”

We the Humanities, which is now in its second year, has attracted tweeters from across six continents, ranging from professors to Masters students and from museum curators to musicians.  The discussions engage with more than 2400 followers from across the world, including everyone from lifetime specialists to the mildly curious. The account has developed to include a blog and events listings, housed at http://www.wethehumanities.org.

Kristina West (@krisreadsbooks), co-founder of WtH, adds: “We encourage anyone working within the humanities who might be interested in curating the account to get in touch through the website. We aim to create a vibrant, international community to raise awareness of the diversity, relevance and challenges that encompass what is called the humanities.”

~ends~

Contact details:

Jessica Sage, We the Humanities: tel: +44 (0)7731840380, e-mail: j.sage@reading.ac.uk

Kristina West, We the Humanities: tel: +44 (0) 7525 009744, email: k.j.west@pgr.reading.ac.uk

 

About We The Humanities:

 

A rotation-curation Twitter account showcasing the creativity and diversity of the humanities and reiterating the fact that the humanities are more widely important than current public funding suggests.

 

Like most things on Twitter, it began with a seemingly innocuous tweet.  On 5th January 2014 @academicjess asked a few people “Does anyone know of a humanities equivalent to @realscientists & if not would you be interested?” and it snowballed from there.

 

It is currently administered by Jessica, Kristina and Emma Butcher (@EmmaButcher_).

 

Disclaimer: The views expressed at the @wethehumanities Twitter account are those of the weekly guest editor and not those of the administrator or previous/subsequent curators. The views expressed on the blog are the views of named posters and not those of the administrators.  

 

 

 

Why I Believe Almost All African Languages are Endangered

Guest Post by Luis Morais

 

Languages, as cultural and social institutions of their peoples, can either flourish, evolve and thrive or stagnate, degrade and die. Several causes and factors contribute to the death of a language, but we can simplify things by stating that languages die either by the force of oppression (from man or nature) or by assimilation: speakers start using a foreign language more than their own, up to the point that the dominant language swallows the regional language.

UNESCO rates how close a language is to extinction by quantifying the number of new young speakers learning and using it actively both inside and outside their homes. It is a straight-to-the-point system, but it evaluates language “health” exclusively from the perspective of spoken use. It does not deeply consider how actively the regional language, rather than the dominant foreign language, is used as a tool for knowledge creation, or how motivated regional speakers feel to preserve the knowledge acquired from previous generations.

Most African languages nowadays are safe from violent repression and technically considered alive due to the numbers of their many speakers. Nevertheless, since the world trend is that we will grow more and more dependent on digital technologies, we must consider the growing pervasiveness of the digital and online world in Africa either as:

  1. An opportunity for African languages to gain their space in the online world and thrive in the digital age;
  2. Or, where languages fail to establish an online and digital presence, the speeding up of the assimilation process where the local language is pushed out of yet another environment by the dominant language with the side effect of further eroding local speakers’ opinion of the language as useful and relevant for the digital age.

Knowledge is much wider than gadgets, corporations and factory plants

This article expands the focus of language health and longevity from new speakers learning the language and number of oral speakers to how actively the local language is used to create and preserve knowledge in the digital age. So that we are clear, whilst some view knowledge as exclusively what corporations, academic institutions and factories churn out every year, we base our argument on the fact that knowledge is much wider than that.

Within every language a trove of knowledge is to be found in the form of myths, poetry and literature, gastronomy, spirituality, herbs and natural medicines, philosophy and untranslatable concepts, music and new sounds, art, child upbringing methods, moral concepts, ways to govern and live in society, forms to express feelings and everything else.

The oral tradition of Africa potentially holds a) the basis of the advances we enjoy today, b) the leads to future breakthroughs and, most importantly, c) the blueprints of a sustainable way of life on this broken planet. This knowledge, essential and valuable as it is, must be preserved for the sake of the hidden lessons we might still learn from it.

If there is one thing, and one thing only, that Africa should learn from Europe, it is fostering one’s own local languages

Although Africa is the home of the oral tradition, the historical evidence shows that Africans from all over the continent have also been writing and recording their knowledge for centuries. Nevertheless, be it in Arabic, Medieval Latin, French or English, Africans past and present seem to have produced more physically recorded knowledge in the crusader’s and colonist’s languages than in their own local languages.

In Europe, local languages stand a better chance to co-exist (instead of competing) with dominant official languages. Although the pressures of assimilation from other majoritarian European languages are still present, local languages and minority national languages in Europe have possessed a localised digital infrastructure compatible with knowledge creation in the local language for some time now.

The results are clear:

  • Catalonia, an autonomous region in Spain with 7 million Catalan speakers, boasts a book catalogue of 56,000 titles. Despite having suffered government repression in the past, nowadays the Catalan language is digitally accessible and content can be easily found online.
  • As another notable example, Iceland, with a population of a bit more than 300,000 people, publishes 1,500 books in Icelandic every year.
  • In the case of Yiddish, one can easily find online book repositories with more than 10,000 free titles.
  • In the UK, there is plenty of literature produced in Welsh (430,000 speakers), Irish (2.5 million speakers) and Gaelic (87,000 speakers).

Regional and linguistic identity has always been a European theme, backed by a lucrative regional cultural industry generating millions of Euros. In the African context, it is difficult to foresee how regional languages expect to thrive in the digital age unless regional speakers find the localised tools and motivation to use their local languages in the digital space.

Why aren’t we producing more digital content in African languages?

We contemplate a future where our descendants will learn more of their identity and culture from digital and online sources. These future generations will likely listen to their oral tradition on YouTube, dance to their drums on iTunes and read about their myths on Wikipedia. They will be more and more engaged in living a digital life and used to accessing and sharing information from digital sources.

In order to create local knowledge for these future “digital & online” generations, we first must allow regional language speakers to use their languages whenever and wherever they want. This is hardly possible in Africa today simply because the basic tools such as local keyboards and local digital language features are hard to find if not nonexistent.

Most African languages have poor access to digital aids such as language glossaries, text-to-speech databases, and auto-correction and auto-suggestion features, amongst other incredibly simple digital advances such as OCR, the technology that allows words in a picture of a book page to be recognised as words.

In simple terms, one can’t create knowledge in their local language without fully accurate and functional language tools.
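
To make the idea of a localised input tool concrete, here is a minimal sketch in Python of a "dead key" scheme, in which a prefix key typed before a letter adds a tone mark or under-dot to it. The key assignments (`` ` ``, `/` and `;`) and the function name `compose` are assumptions chosen for illustration; they do not describe any existing keyboard layout.

```python
import unicodedata

# Hypothetical dead keys: each one queues a combining mark for the next letter.
DEAD_KEYS = {
    "`": "\u0300",  # combining grave accent (low tone)
    "/": "\u0301",  # combining acute accent (high tone)
    ";": "\u0323",  # combining dot below (as in ẹ, ọ, ṣ)
}


def compose(keystrokes: str) -> str:
    """Turn a sequence of plain keystrokes into correctly marked text."""
    out = []
    pending = []  # combining marks waiting for a base letter
    for key in keystrokes:
        if key in DEAD_KEYS:
            pending.append(DEAD_KEYS[key])
        else:
            # Attach any queued marks to this character.
            out.append(key + "".join(pending))
            pending.clear()
    # A dead key with no following character is silently dropped.
    # NFC gives one canonical form regardless of the order the marks were typed in.
    return unicodedata.normalize("NFC", "".join(out))


print(compose("Yor`ub/a"))  # prints the word with its tone marks
```

With a mapping like this, a writer on a standard UK or US keyboard can type fully marked text with one extra keystroke per mark, instead of cutting and pasting characters from elsewhere.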

As a personal example, being a speaker and writer of Brazilian Portuguese, I battled with keyboards to write in my native language when I came to the UK. Portuguese characters such as “ç” and accented letters such as “ã”, “à”, “ê”, “ó”, “ü” were impossible to type on a UK keyboard, and for months I didn’t create much in my own language, reluctant to write it in a way that was orthographically wrong and open to misunderstanding.

In order to write in my own language, I tried several workarounds in the first years. I spent time cutting and pasting the accented characters from Brazilian websites into my writings; I installed intrusive virtual on-screen keyboards; and at one point I imported a European Portuguese keyboard, which took up a lot of physical space since my family still needed the UK keyboard around.

Only when I started using customised keyboard layouts on my UK keyboards was I able to write from my coração (heart). Only then did I feel intellectually liberated and start creating knowledge in my own language.

Conclusion

Whilst typewriters epitomised the colonist’s obliviousness (if not plain hostility) to local language and culture by locking knowledge production to a keypad in the colonist’s language, soft keyboards, computer keyboard layouts and digital language aids give the local speaker the freedom to produce knowledge in as many languages as they want or need.

The technology is here and has been available for some time. Nevertheless, to give regional African languages a fighting chance in the digital age, more needs to be done to create easily accessible regional language tools that allow people to exercise their regional language and culture with full digital language support.

I link the low usage of local languages to produce digital and online knowledge to both technological and social reasons:

  1. There are insufficient localised input tools that allow one to write in their local language correctly.
  2. Attempts to write the language with an incompatible input tool generate imperfect language content without the right characters, accents and tonal marks.
  3. Content created without the right orthography dilutes the language. Because it is not standard, it is harder to find in search engines.
  4. The lack of digital language tools such as glossaries, dictionaries, OCR, auto-correct and auto-suggest features, amongst others, demotivates less experienced speakers of the local African language.
  5. By seeing their local language under- and misrepresented in the digital and online world, speakers don’t feel their regional language is relevant for the digital age.
  6. By being forced to make more effort to write in their regional language correctly, speakers decide not to use it as often as the dominant language to avoid the extra work.
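
Point 3 in the list above can be illustrated with a short Python sketch. It uses only the standard `unicodedata` module; the helper names `canonical` and `strip_marks` are assumptions made for this example, and real search engines handle these cases in their own ways that are not described here.

```python
import unicodedata


def canonical(text: str) -> str:
    """Return the NFC form, so visually identical spellings compare equal."""
    return unicodedata.normalize("NFC", text)


def strip_marks(text: str) -> str:
    """Remove all combining marks, imitating text typed without the right keys."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Two ways of storing the same word: precomposed letters vs. base letter + mark.
precomposed = "Yor\u00f9b\u00e1"
decomposed = "Yoru\u0300ba\u0301"

print(precomposed == decomposed)                        # False: raw strings differ
print(canonical(precomposed) == canonical(decomposed))  # True after normalisation

# Text typed without accents no longer matches the correctly spelled word.
print(strip_marks("coração"))               # coracao
print(strip_marks("coração") == "coração")  # False: an exact-match search misses it
```

The sketch shows two separate problems: the same correctly written word can be stored in different ways unless it is normalised, and a word written without its marks is a different string altogether, so a simple exact-match lookup will not connect the two.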

Once the barriers above are removed and local digital language tools are created, part of the mission to allow African languages to represent themselves in the digital world will start to be accomplished in the form of the unblocked production of regional language content. In other words, make the tools for localised knowledge production accessible and everything else will follow.

________

Luis Morais writes from Brazil. This article was first published on LinkedIn here.

________

Bibliography/Further reading

Google’s Vint Cerf warns of ‘digital Dark Age’
http://www.bbc.co.uk/news/science-environment-31450389

How do I address you? Forms of address in Oko – Uchenna Oyali
http://academicjournals.org/article/article1379410255_Oyali%20pdf.pdf

Is Yorùbá an Endangered Language? – Felix Abidemi Fabunmi & Akeem Segun Salawu
http://www.njas.helsinki.fi/pdf-files/vol14num3/fabunmi.pdf

An Endangered Nigerian Indigenous Language: The Case of Yorùbá Language – Temitope Abiodun Balogun
http://nobleworld.biz/images/6-Balogun_s_Paper.pdf

African Languages in a Digital Age: Challenges and Opportunities for Indigenous Language Computing – Don Osborn, HSRC Press

 

Raising Money: The Dictionary Experience

There are many ways to fund a project, I’ve realized. One can work hard, save up for many months, and then put all those savings into the chosen project, ignoring family and other more important commitments in the process; or one can ask friends and family for a raise, promising that the money will not all go down the drain of sometimes unrealistic dreams. This is usually a good idea if they are not, at the moment, committed to something else more important themselves, which is very rare. Or one can apply for a number of grants in the world, promising to make one’s dreams come true.

Most grants, however, are specific. I got a MacArthur-sponsored grant in 2005, for instance. It came with a stipend of $600 for all of six weeks, with a paid trip to Moi University in Eldoret, Kenya, for a “Sociocultural Exchange”. The Fulbright of four years later came with a monthly stipend of about $1200, but one had to pay for lodging and feeding in the United States (usually totaling around $800), such that by the end of the program there was just enough to buy an iPod Classic, a hand-held camera, and a few gifts for hordes of friends and family back home.

Some grants require that the grantee do a couple of things (like write a book, for instance), stay in a particular location for a period of time, or do work in a certain area for a period of time. In most cases, unless one is already established in that field, it’s hard to find a grant that fits conveniently. That was why, sometime last year, while pondering a way to continue and expand a project I started as an undergraduate at the University of Ibadan, “A Multimedia Dictionary of Yoruba Names”, I constantly ran into a wall of doubt as to the possibility of raising enough funds (and finding enough interested people) to get the project moving. The model I had submitted as an undergraduate project was of just a thousand names borne by Yoruba children, with their meanings and (for the first time) audio pronunciations done by Yoruba speakers. For the Department of Linguistics at the University of Ibadan in 2005, it was an impressive work. For a 2015 adult with access to more efficient technology and crowd-sourcing, it was less than the tip of the iceberg.

I didn’t have enough savings to start the project on the scale that I envisioned, and I couldn’t think of any grants that could fund it. Even the Fulbright Alumni Innovative Fund (for past Fulbrighters), as diverse as it is, was limited to a number of categories that don’t accommodate a project focused on lexicography and language documentation. There is the MacArthur Genius Grant, a suitable and appropriate grant that makes no demands on the grantees but rewards them (with $650,000 over five years) so that they can achieve their dreams without the drag of a 9-5 job in a busy city. Problem was, one needed to be nominated, and the folks who nominate are usually not known to anyone but the MacArthur folks. Finally, when I was out of options, the idea of crowd-funding struck me, just as quickly as the imperative to use 2015 as the year to proceed with the dictionary idea in the first place. I’d had some contact with Indiegogo before, but only through friends who had asked me to donate to their projects. I’d also heard of Kickstarter, GoFundMe, GlobalGiving, and a couple of other crowd-funding sites. I did a little search on all of them and found Indiegogo most appropriate. Unlike Kickstarter, they don’t send all the pledged amount back to the owners if the goal is not reached. They do take 5%-9% of all the funds raised though, which makes sense when we realize that they’re also in business to make a profit.

So, on January 6 (a not-so-smart date to start a fundraising drive, when one considers the expense that usually goes into the Christmas holiday period), I launched the Indiegogo campaign, open for 60 days. Yet, in spite of the inauspicious beginning, the idea resonated with a lot of friends, family and colleagues with whom I shared it, and they gave, surpassing my expectations. It may also have had something to do with how obnoxiously I pestered a couple of them who promised to donate and then promptly went AWOL :). More importantly, word about the project got out, and many people who had nurtured similar ideas about documenting the Yoruba experience but lacked the means or network to do so wrote to me to volunteer their time and services. That has been the best part of the whole experience. There have also been other not-so-encouraging ones: colleagues who matter-of-factly expressed their unwillingness to support it, either because I’d never supported their projects in the past (even without my knowing it) or because they had their own projects that also needed financial attention. In all, I learned lessons in human relations, fundraising (I wonder how politicians do it; it explains why I’d never be one), drive, and persistence.

There are now about 15 days to go until the fundraising effort is over. But yesterday, I realized that this is only a start. Yes, I do want to create a Yoruba Dictionary of Names, and the dream is now more real than ever, thanks to a number of known and unknown people. However, I also want to create a Lexical Dictionary of Yoruba containing all the words in the language, also crowd-sourced, and also multimedia and internet based. There is no excuse for the absence of such a document online and such an app on the mobile phones of interested people all around the world. I want to translate more work from English into Yoruba (I’ve still not completed the one I’ve been working on for years), and render more work from Yoruba into English, and into audio. I want to work with as many people as are willing to make Yoruba relevant to the next century in information technology. The industry for mother tongue education and documentation is one that is huge and waiting to be tapped. Yes, we are translating Twitter into Yoruba, but that can’t be all. Where’s Facebook? Instagram? Google? Where are machine translations? Where is Siri Yoruba? And to do all of these will take more than the $5000 that we are now on the path to raising. We need more.

Yesterday, I applied for the TED Prize 2015, a prize worth a million dollars to support any dream from anywhere in the world. A total stranger had sent me a link to it via Facebook, believing that I had a shot. I scoffed for all of one second and then sobered up. If life has taught me anything, it’s that more than hard work and persistence (which usually pay), taking a chance on oneself is also usually a good idea. I have also begun to look for any other grants that can support a dream of creating a thriving ecosystem of mother tongue education and use in Nigeria. Not just limited to Yoruba, by the way, but the over 500 languages in the country. It might happen, or it might not, but it will not be for lack of trying. There is a future worth pursuing. From the kind of enthusiastic support I’ve seen from the Dictionary fundraising, one also within reach.

Documenting African Names

I’ve always been fascinated by names, and I can’t say since when. I’ve also always been fascinated with technology. It was no coincidence then when, while researching topics for my undergraduate project at the University of Ibadan, I settled on creating A Multimedia Dictionary of Yoruba Names. It was the first undergraduate project in the department (and, I hear, in the university) which made use of only electronic materials. There was no hardbound copy of any written material. Everything was hypertext and audio, burnt onto a compact disc. For audio, I had the help of friends and colleagues to find fluent Yoruba speakers willing to pronounce the names that formed the bulk of the audio database. Then, with my little knowledge of HTML, I designed a reference interface to access the sounds. Users could click to hear a name pronounced, or they could merely look up a name to discover the meaning.

I found the project stimulating to work on, hard as the challenge of audio recording was, sometimes across two continents. It gave me great pleasure, and something worthwhile to work on at a time when academic endeavour as a student looked like a chore with no silver linings. It helped to have had a couple of published materials to use, but researching the meaning of names proved interesting enough to mitigate the boredom of the final years of school. I got my degree, and left the university, with a niggling desire at the back of my mind to one day return to the project in a larger form. The choice was either to find money, call up some of the original collaborators and do it again, this time without the constraints of a university environment, or to simply pursue it as a solo side project. The reality turned out to be that, because of other commitments, I would never be able to pursue it as a solo project.

Time and chance have kept me in the orbit of projects and vocations that relate to African (particularly Yoruba) languages and culture, and my master’s thesis focused on the problems and peculiarities of learning Yoruba tone as a second language learner/speaker. Growing up as a monolingual speaker of English in Europe or America, how easy is it to learn Yoruba (or any other tone language, for that matter: Mandarin, Vietnamese, Igbo, etc.)? My research yielded fascinating insights into second language learning and acquisition, and I have sworn to return to that research as well at a later date. Scholars who have dismissed the possibility of monolingual English speakers learning and mastering tone at a second language level will be disappointed by the challenge posed to their premature conclusion.

The reason I have been interested and involved in these things (besides the obvious one of it being in the orbit of my profession as a linguist) is my consternation at the absence of enough cultural materials online from this (Yoruba) and many other African cultures. In this century when most of what is knowable (particularly about Western culture and civilisation) is online and accessible to everyone, it is appalling that Africa seems to be left out. True, most of our cultural information is oral and thus based in the memory of griots and other living libraries scattered around our hamlets. This however can’t be an excuse to shy away from the tools of new technology to document it for future generations. Foreign media who need to pronounce names of African celebrities resort to Anglicizing them without consequences. If a Nigerian can pronounce “Krauthammer” or “Schwarzenegger” or “Spielberg” or “Reagan”, why would folks like Chiwetalu Ejiofor (as the Igbo will spell and pronounce the name) need to change their names to “Chiwetel” in order to get by in Hollywood? In this video from The Tonight Show, British actor of Nigerian origin, David Oyelowo, tries to teach Jimmy Fallon how to correctly pronounce his Yoruba name.

In a world where linguists and cultural practitioners from Africa do their jobs well, Jimmy Fallon would have had to consult a dictionary of African names online and learn the correct pronunciation before his guest came on the show. He would do that for a Swedish guest, after all, no?

My life’s work, then (as I’ve found myself gravitating towards it day after day), is to find all the ways possible to make the African experience part of the world experience, using tools provided by information technology. But not just that: it will also include making technology friendly and accessible to Africans who would otherwise have been put off by its alien language and lack of user-friendliness. Between 2012 and 2014, we successfully petitioned Twitter to allow the platform to be translated into Yoruba. This was a huge victory for language survival and a testament to the open-mindedness of the folks at Twitter, who recognized the ability of the platform to be even more efficient in the hands of more people, and in more languages.

My current project is to document ALL Yoruba names, by crowdsourcing, along with their etymology, meaning, phonetic/morphological properties, and all other stories and cultural dimensions to them. This time, we’re trying to make not just a dictionary, but a resource centre with dictionary, encyclopedic, and linguistic/multimedia information. I am raising funds on Indiegogo to meet the goal of creating the software backbone of the project, and we have got a number of volunteers and goodwill. The larger aim is to kickstart a process that will lead to an awakening, and an eventual movement by all concerned, to put more effort into the documentation of our cultural experience in spite of the onslaught of a type of monocultural globalisation that only leaves us bereft of any signpost of our identity and place in the world. Not just for Yoruba, however, but for all African languages. But we have to start somewhere.

If you believe in this dream, please go to the project page on Indiegogo to donate whatever you can. Every dollar counts.