Vive la eRevolution (Seconde Partie)

10 Feb

My post a couple of weeks ago on the eLanguage corpus, CANELC (Cambridge and Nottingham eLanguage Corpus), prompted a few questions on Twitter and other social media sites about word frequencies. I’d made the observation that despite the very direct nature of Twitter and other eLanguage, ‘thank you’ was the second most frequent two-word item in the CANELC corpus, suggesting that politeness isn’t being lost, even though economy of space means that we’re reducing the hedging and softening in our communications.

However, I didn’t want to dwell too much on the frequency tables because out of context, and without analysis, they’re pretty meaningless. That said, Professor Carter has kindly provided the slides from his talk so I can now give at least a few insights into what these frequency lists might tell us. Plus, I like lists, and it seems other people do too, so here are the top 50 most frequent single words from CANELC:

[Image: top 50 most frequent single words in CANELC]

In his talk, Professor Carter noted a high frequency of pronouns, which is particularly interesting when you compare the CANELC list to the 100 million word BNC (British National Corpus). Pronoun use in the spoken BNC is 1:38; in the written BNC, 1:200; and in CANELC, 1:43. That is a clear indication that eLanguage has far more in common with spoken language than with ‘traditional’ written language. The demonstratives ‘this’ and ‘that’ are also high up on the list, which Professor Carter feels “underlines the personal nature of most e-communication, with significant pointing to referents in the most immediate environments.”
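For anyone curious how a ratio like this might be worked out, here is a minimal Python sketch. It rests on two assumptions of mine, neither taken from the talk: that a figure such as 1:38 means roughly one pronoun for every 38 words, and that a short hand-picked set of personal pronouns (the hypothetical `PRONOUNS` below) is good enough for illustration. The real corpus work will have used proper linguistic tagging rather than a simple word list.

```python
import re

# Assumption: a small illustrative set of personal pronouns,
# not the list used in the CANELC or BNC analysis.
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they",
            "me", "him", "her", "us", "them"}

def words_per_pronoun(text):
    """Return how many words there are for each pronoun, or None if there are no pronouns."""
    # Lowercase the text and treat runs of letters/apostrophes as words
    tokens = re.findall(r"[a-z']+", text.lower())
    pronoun_count = sum(1 for token in tokens if token in PRONOUNS)
    if pronoun_count == 0:
        return None
    return len(tokens) / pronoun_count

ratio = words_per_pronoun("I will see you next week if we finish this report on time")
print(f"1:{ratio:.1f}")
```

The lower the second number, the more pronoun-heavy the text, which is the sense in which CANELC sits much closer to the spoken BNC than to the written BNC.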

And here are the top 50 most frequent two-word units:

[Image: top 50 most frequent two-word units in CANELC]

With this list we see what Professor Carter refers to as “the frequent use of temporal referents” which “allows for an immediate or near-immediate information exchange in near real-time.” So phrases like ‘next week’, ‘next year’, ‘this morning’ and so on are helping to create “a shared digital space rather than physical space, within which the social, physical and temporal context is frequently changeable.” And in this way, eLanguage as a ‘genre’ “behaves like synchronous communication.” So although this language is written, many exchanges are taking place in real time and, perhaps inevitably, the language is more like that we are used to hearing in spoken communication.
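If you fancy building frequency lists like these from your own messages, the basic counting is simple. The Python sketch below is only an illustration under my own assumptions: the `sample` messages are invented, the tokenizer is deliberately crude, and `top_ngrams` is a hypothetical helper, not the method used for CANELC. It counts single words and two-word units within each message, so a pair is never formed across the boundary between two messages.

```python
import re
from collections import Counter

def tokenize(text):
    # Lowercase the text and treat runs of letters/apostrophes as words
    return re.findall(r"[a-z']+", text.lower())

def top_ngrams(messages, n, k):
    """Return the k most frequent n-word units across a list of messages."""
    counts = Counter()
    for message in messages:
        tokens = tokenize(message)
        # Slide a window of n words along the message
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)

# Invented sample messages, for illustration only
sample = [
    "thank you so much x",
    "see you next week, thank you xx",
    "running late this morning, see you soon x",
]

top_words = top_ngrams(sample, 1, 3)
top_pairs = top_ngrams(sample, 2, 3)
print(top_words)
print(top_pairs)
```

As the post says, a raw list like this is pretty meaningless without context and analysis, but it does show where items such as ‘thank you’ and the humble ‘x’ come from.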

Going back to the single word list, you will see that the humble ‘x’ is sitting at number 38. This is because the most common closings in eLanguage are: x, xx and xxx. In his talk, Professor Carter referred to a Daisy Goodwin column in the Sunday Times from August 2012. The article highlights how written business communications, which were once “carefully calibrated and deeply unexciting”, now leave us “floundering in semantic uncertainties.” It seems an ‘x’ or two (or three) at the end of an email is not just for friends and family; kisses are creeping into business emails too, leaving some of us unsure whether flirtation has moved out of the stationery cupboard and on to email, or if we’re causing offence by omitting an x, xx or xxx from our own business communications.

These are just a few observations which barely scratch the surface of Professor Carter’s and Dr Knight’s talk, and the pilot project they conducted. We are likely to see CANELC, and similar projects, soon having a significant impact on written corpora data, and this will undoubtedly have a knock-on effect on our understanding of eLanguage and have implications for language learning and teaching in the future. In fact, CANELC data has already been added to the Cambridge Written English Corpus, the web pages for which explain in more detail how corpora are used in ELT materials writing.


Want to find out more about eLanguage corpus research? Below is a table, again kindly supplied by Professor Carter, of other projects that have been initiated around the world. What’s different about these is that unlike CANELC, which took samples from a wide variety of e-communication, the corpora listed below are more bespoke, each focusing on one very specific variety of eLanguage:

[Image: table of other eLanguage corpus projects]
Interesting to see that one person’s junk mail is another person’s research project. Should anyone take the time to seek out information on these projects, please do post any insights. xxx 😉

Vive la eRevolution!

25 Jan

“We are in the middle of a syntactical and discursive revolution.”

Ron Carter (2013)
Research Professor of Modern English Language, University of Nottingham, UK

You’d be hard pushed these days to find decent, up-to-date ELT course materials that don’t claim to be informed, in one way or another, by corpora. Digital technology allows us to gather and analyse all kinds of language data which in turn helps to inform language teaching and materials development. A couple of short blog posts from Professor Ron Carter (2011) provide a gentle and very readable introduction to corpora and corpus linguistics for anyone new to this: http://dictionaryblog.cambridge.org/2011/12/12/a-few-words-on-corpus-linguistics/

Some corpora are bigger than others

The true value of corpus-informed ELT materials depends on two key elements: the nature of the corpus or corpora that have been used, and the way in which the information they reveal has been practically applied. And it’s not just about size. For example, if you have a 100 million word corpus consisting primarily of samples from written academic work by native English speakers, that’s not going to be of much use for informing an ELT book on speaking skills. Ideally, a corpus should represent data from a balanced range of ages, nationalities, genders, occupations and so on, and we need to be very aware of its limitations: some corpora are general and some are very specific.

Texts and tweets

I was therefore curious when I heard about a ‘texts and tweets’ project being led by Professor Carter, which has been addressing the (some would say long overdue) need for a corpus that gathers, for want of a better word, ‘eLanguage’ data. It’s a pilot research project called CANELC (Cambridge and Nottingham eLanguage Corpus) and some of the initial findings were presented by Professor Carter and Dr Dawn Knight to staff at Cambridge University Press this week.

CANELC is a one million word corpus of digitally-based communication in English. Data has been gathered from UK message boards, blogs, tweets, emails and SMS messages, between the years of 2006 and 2011, though with the majority of data coming from 2010 and 2011. The eagle-eyed amongst you may have noticed a few limitations already: UK only, no 2012 data and no Facebook. In the words of Ferris Bueller, “Life moves pretty fast”. So fast these days, in fact, that by the time you think you’ve got to grips with the OMGs and LOLs, those pesky kids have invented a whole new way of communicating: LOL is only used with a sarcastic tone, OMG is lame, and you’re an old dinosaur. The issue with no Facebook data relates to consent. The chaining effect on Facebook and distribution amongst friends makes it close to impossible to obtain consent to use this data in a corpus. Similar problems exist on other social media sites.

Let’s talk about syntax

So those are some of the problems, but this pilot project has provided some fascinating insights into the ways in which eLanguage is changing the way we communicate in English, and what I felt was central to Professor Carter’s and Dr Knight’s findings was that it’s not all about new words / acronyms, or even new meanings for ‘old’ words (net, surf, windows, etc.). It’s syntax and discourse that are changing, and we’re in the middle of a revolution.

Here’s just a small sample of what the analysis of this corpus revealed:

We are seeing much more informality and ‘spoken-ness’ in written language. This is not limited to SMS and Twitter. Emails are becoming just as informal.

Politeness, softening and hedging are becoming much less common. Perhaps due to the economy of space, Twitter communication tends to be very direct. Though I did notice that ‘thank you’ is at number two on the two-word lexical item frequency list.

There’s a big increase in personal pronouns, when compared with other corpora.

Kisses are a pervasive feature of e-communication. The presence, number or absence of kisses at the end of a message is an aspect of ‘netiquette’ that leaves many people floundering.

The rules of punctuation are pretty much suspended.

Pictures and emoticons are an essential part of e-communication, with the visual overriding the linguistic.

Haptic communication ‘((hugz))’ seems to be a way of bringing a physical presence to the digital world. We are also obsessed with saying where we are and what we’re doing. Or even what we’re not doing.

Modal verbs are starting to slip out of usage.

Banter, play and creativity with language are all very common.

This really is just the tip of the iceberg and opens far more questions than it answers. In essence we are seeing language that is a hybrid of written and spoken communication, one that’s constantly evolving, and one that doesn’t just exist in digital form but which is creeping into the way we speak as well.

Implications for ELT

Professor Carter made it very clear that eLanguage adds another layer (possibly even layers) of complication to the world of the English language learner, but in his opinion it complicates it for the better. Awareness of the way we use language in the digital world is becoming essential and it’s fascinating to see how this is affecting all aspects of communication. Whilst the ideas of ‘netiquette’ and eLanguage are starting to appear in some course materials, it’s unlikely that we’re going to see marginalisation of modal verbs any time soon, particularly when these things are still being tested in national and international language examinations. As far as I’m aware, replying to a celebrity tweet is not yet a core element of Cambridge’s First Certificate exam.

And on the topic of celebrity tweets, I was interested to hear Professor Carter cite another piece of research (I didn’t catch the source) that had analysed the language of celebrities on Twitter. It will perhaps come as no surprise to many Twitter users that choice language is prevalent amongst celebrity Tweeters, and the UK’s own Lily Allen (now Lily Rose Cooper) came top of the swearers. That’s @lilyrosecooper for any language learners who want to learn how to curse like a sailor.


For more information on Cambridge corpora: http://www.cambridge.org/elt/catalogue/subject/item2701617/Cambridge-International-Corpus/

For an insight into data gathering for smaller corpora projects: https://www.youtube.com/watch?v=IBAREv9ZrxA

For highlights from Professor Carter’s and Dr Knight’s CANELC talk: #CANELC live tweeted by @ericbaber (24 January 2013)

Death of the print dictionary?

12 Jan

“Like maps and encyclopedias – but unlike novels or newspapers – dictionaries are things you consult (while you’re doing something else) rather than things you read. For any kind of reference enquiry, the book really can be improved upon, and at Macmillan, we’ve taken the decision to phase out printed dictionaries and focus on our rich and expanding collection of digital resources.”

Rundell, M. (2012) Stop the presses – the end of the print dictionary www.macmillandictionaryblog.com/bye-print-dictionary

Macmillan Dictionary announced towards the end of 2012 that they would no longer be printing dictionaries. They were going 100% digital. Many dictionary users around the world shrugged their shoulders. If there has ever been a print product in need of regular updating, and benefitting from a digital format, then it’s the dictionary. Digital dictionaries enable easy searching, audio (and therefore pronunciations you can hear as well as read) and portability. For many years dictionary users have been able to load content on to their PCs from CD-ROMs and we now have eBooks and online products with integrated dictionaries, plus a wealth of free online options and low-cost (and high-cost) mobile apps. Since 2010, winners of the popular UK TV show Countdown have received a laptop and lifetime subscription to Oxford Online, replacing the long-standing traditional prize of a leather-bound, 20-volume Oxford English Dictionary. Macmillan may have been the first to make a formal announcement, but all dictionary publishers have been going digital for years.

So the print dictionary is dead, right? Or are we getting it dead wrong?

Last week I presented details of our 2013 plans for ELT dictionary publishing at the Cambridge University Press ELT winter sales conference in Athens. These plans include a new (fourth) PRINT edition of the Cambridge Advanced Learner’s Dictionary, due to be published in Spring 2013. And the reason we’re doing this? In simple terms, because there’s still demand from our customers. Whilst digital immigrants are becoming more accepting of content in digital format, and digital natives expect it, given the choice there is still a solid market for print, and this includes dictionaries. A presentation at the same conference from a hugely successful Greek wholesaler illustrated that this is particularly true in Greece where, despite the country’s economic woes, the ELT industry continues to provide a good living for print book sellers and distributors who work effectively with publishers, and who understand what their customers want. For me, it underlined the importance of taking an objective look at the markets, listening to customers and analysing data objectively, rather than making dangerous assumptions or predictions and then applying them globally, based on my own digital preferences or customer behaviours in the UK.

So if it’s generally accepted that digital dictionaries are a good idea, why this continued interest in print? When the Macmillan announcement appeared online I put a question about the value of print dictionaries to a business English LinkedIn group. Opinions were mixed, but here are some of the responses in favour of print:

“I like using print versions with my younger students, because it helps them with learning the alphabet and how and where to find a word in a dictionary. It may be old-fashioned but I still think it is an essential skill to learn.”

“I still hand students monolingual dictionaries in class as their phones only have bi-lingual ones. And sometimes just looking at a whole page (especially those with wonderful illustrations) is better than checking online.”

“There is no doubt that searching for a word in a book has a value that cannot be found in pressing a key. The imprint it makes on a person and his memory is different perhaps because it involves his human faculties. An online dictionary puts the meaning in front of you as a piece of dry information without any sense of accomplishment (effort).”

“A dictionary is a great resource … once opened, it’s difficult for a motivated student to close. I find students use printed dictionaries both at home and in the classroom – they feel comfortable with them …”

Clearly then, not everyone is in favour of dictionaries only being available digitally, and perhaps because of the particular ways dictionaries are used when learning another language, it seems the ELT community will continue to embrace print, as long as it’s available. You’ll find similar comments on the Macmillan blog.

However, the process of publishing planning is a detailed and complex one, heavily dependent on financial viability and therefore not nearly as simple as just looking at sales figures and speaking to customers and distributors. There is no guarantee that Cambridge, or any other publisher, will continue to print ELT dictionaries in the coming years and our digital alternatives (online and mobile) are already making a huge impact. For example, it is thought that the free dictionaries at dictionary.cambridge.org are the world’s most popular online ELT dictionaries. Monthly visits are in the millions, and by allowing advertising on the site Cambridge can offer users a range of quality, up-to-date dictionaries completely free of charge. There’s also an API developer hub.

For the time being though, Cambridge will continue to offer learners and teachers of English a choice – and many are choosing to access our dictionaries in more than one way. As we stride into 2013, it seems that the reported death of the print dictionary in 2012 was an exaggeration.

Find out more about Cambridge print dictionaries and mobile apps at www.selfstudy.cambridge.org

Free Cambridge ELT dictionaries can be accessed at dictionary.cambridge.org

Follow Cambridge ELT dictionaries, including ‘Word of the Day’:

Facebook www.facebook.com/pages/Cambridge-Dictionaries-Online

WordPress dictionaryblog.cambridge.org

Twitter @CambridgeWords


Let’s make Lists

6 Jan


I love lists. Both at work and in my day-to-day life, writing ‘to do’ lists means that I get things done. Without lists I’d flap around this world like a moth in a room full of lightbulbs. I would go as far as to say that lists changed my life. There are three main reasons for this (listed below):

1. Nothing is forgotten. Lists help me to get things done simply by reminding me of what needs doing and when.

2. Prioritisation. Urgent tasks are dealt with when they need to be, and non-urgent tasks can always be juggled or moved to a later date.

3. Focus. Perhaps most importantly, lists help me to think clearly, to focus on the job at hand. They take tasks out of my head and to a place that I know I can refer to at any time, meaning that I only ever need to concern myself with the present.

I used to write lists on scraps of paper and post-it notes, but the funny little Astrid character below – looks like a thumb torn off by a piece of agricultural machinery but I think he’s supposed to be an octopus – has changed all that. With this free app, all my personal lists are now stored on my Android phone, complete with dates and reminders.

[Image: the Astrid app mascot]

At work I have Lotus Notes and another set of lists in the Task Manager, which also sit in my Lotus Calendar. And to top it off, because I couldn’t live without lovely stationery, I usually carry a notebook and pencil, not just for list making but also for noting down books I come across, recommended films and new music, plus useful quotes, articles, websites – essentially anything that I would previously have stored in my head and then forgotten about within 24 hours.

None of this, I’m sure, will come as news to any other list advocates out there. Lists rock.

Taking it up a notch is the checklist. I’ve never felt the need for checklists in my personal life. For me, it’s sufficient to have a ‘go shopping’ reminder; there’s no need to then break it down into ‘write a shopping list, check supermarket opening times, pick up wallet, pick up car keys, prepare topic for till-based small talk, etc’. However, checklists at work are another matter. Marketing, like many other professions, often involves following set procedures. This is not to say that flexibility, agility and creative thinking aren’t important – they very much are – but there are basics in the lifetime of a project that must always be covered, and checklists help to ensure nothing gets missed. They’re not just for new staff either. Experience can lead to over-confidence and things getting overlooked. People working in teams can easily skip tasks by making assumptions about what others are doing. Checklists – if implemented and followed correctly – ensure that everything is covered.

I didn’t realise the true value of checklists until a friend gave me a book in a pub. It was handed over with words something along the lines of “I’m halfway through this and I’m done. It’s boring and repetitive. You’ll probably like it.” I read it and loved it. It was Atul Gawande’s The Checklist Manifesto.

http://www.amazon.co.uk/The-Checklist-Manifesto-Things-Right/dp/1846683149/ref=sr_1_1?ie=UTF8&qid=1357419031&sr=8-1

To be fair to my friend, the book could safely have been reduced to half the number of pages and still got its point across. However, Gawande deals primarily with the value of checklists in the medical profession, breaking complex procedures into simple steps, and if convincing surgeons that checklists save lives means hammering the point home somewhat repetitively, then who am I (or my dismissive friend) to object? The book also brings in examples from the aviation and construction industries, and whilst marketing of ELT materials doesn’t get a mention, I’ve no doubt Gawande could rattle out a chapter or two on that if called upon to do so. Gawande found that even when presented with the evidence of improved performance, people are often resistant to implementing checklists in the workplace because they’re considered time-consuming, patronising and/or impractical. So, it’s all about careful implementation and adaptation, trial and error, monitoring and improving. A bottom-up rather than top-down approach. And it’s worth it. Checklists save lives.

As a marketer, I’m rarely called upon to save lives, but that hasn’t dampened my enthusiasm for lists and checklists, and the value they can bring. If you’re not a list lover already, I recommend you grab a pen and paper, your phone, your tablet or whatever takes your fancy and start writing lists. I hope you’ll be converted and find a new level of productivity. Or at the very least, learn to savour the joy of ticking off completed tasks and feeling that you’ve achieved something each and every day.
