Sunday, December 22, 2013

What's Your Accent?

If you are a citizen of the United States, it may interest you to know that The New York Times has published an interactive quiz in conjunction with the Harvard Dialect Survey. It is the most accurate test I have encountered. It pinned my accent to within 5 miles of my hometown.

If you are interested in trying it for yourself, click here.

Monday, December 16, 2013

Breaking down Creationist bad linguistics

I never suspected this blog would deal with Creationism, but it seems I can't permit a certain pseudo-scientific article to go unchecked. This post will dismantle the bad linguistics behind "The Tower of Babel account affirmed by linguistics," an essay by a certain K. J. Duursma.

Duursma's piece has received some circulation in Christian apologist circles, such as the Theopedia. This article is not meant as an attack on Christianity or religion in general. This is an attack on bad linguistics. The article is characterized by quote mining of reputable linguists (Crowley et al; R. L. Trask; etc...), less reputable ones (Greenberg; Ruhlen), and ones of no repute (Steel). The factual mistakes and historical distortions betray the ignorance of the author and the foundationlessness of his assertions.

Secular linguists are puzzled by the existence of twenty or so language families in the world today.

Wrong. It's the very first sentence and you are wrong. There are roughly 136 language families according to Ethnologue. Saying 20 language families is grossly ignorant of the facts. The only way you could arrive at a number like 20 is if you mistook cluster labels like "Papuan" and "Paleo-Siberian" for single language families. They are not. Names like Khoisan are just labels linguists use to refer to dozens or more unrelated language families that exist in the same geographic region. We're off to a bad start.

The languages within each family (and the people that speak them) have been shown to be genetically related, but few genetic links have been observed between families.

Wrong. Sentence two and wrong again. There are two possible interpretations of this terrible sentence and either interpretation is incorrect. The first interpretation is that Duursma is referring to two subfamilies within a larger macro-family (such as the Italo-Faliscan and Germanic families being siblings within the larger Indo-European family). That's patently false, as macro-families must have incredibly strong evidence in order to be granted the designation of a genetic relationship. The second interpretation is that Duursma is referring to any two macro-families like Uralic and Indo-European. This is probably what Duursma meant and it's even dumber. There aren't genetic links observed between separate families. By definition, a genetic relationship between two language families means they are part of the same family.

But still, if speech did evolve somewhere, somehow, we would expect to find that all languages are genetically related.

World languages may be related; world languages may not be. The rate of language change exceeds the amount of time that humans have been speaking. In other words, languages that are related have mutated past the point of proving a genetic relationship. Furthermore, we needn't expect that all languages must be genetically related even if we evolved. Sometimes languages are birthed spontaneously, as in the case of Nicaraguan Sign Language. The mind is capable of spontaneously creating impressively complex modes of communication even without the influence of language, and the mind is capable of making modes more complex in order to fit one's needs.

Some have therefore suggested that man evolved speech simultaneously in more than one place. This suggestion is beyond belief, considering the dangers involved in the supposed evolution of speech.

No, it's not beyond belief. As I mentioned above, people are capable of creating language ex nihilo.

Only Genesis provides a credible explanation.

Of all the theories for how speech originated and whence our language families came, Genesis is the only incredible explanation.

Central Asia and the rest of northern Asia host the Altaic family, which also contains Turkish.

There are few Altaicists today and since the 1990s they no longer represent a majority view of linguists. The simple fact is that the Japonic and Koreanic families do not have even close to enough evidence to support genetic relatedness. Tungusic has been strongly questioned as well and its position is no longer certain. Duursma is forgetting that, even if we accept the putative Altaic family, there are many, many other north Asian language families like Nivkh, Ainu, and Yeniseian.

The Pacific is host to three or four families.

The Pacific is home to as few as 20 language families but probably many, many more. The only way this sentence makes sense is if the author is counting by Greenberg's typology, which no credible linguist believes.

...the Khosian languages are spoken in the south-west of Africa.

Remember when I predicted that Duursma must be counting language families by confusing 'convenience families' for genetically-related language families? Prediction confirmed. The Khoisan language family is a grouping of convenience. Only a few of the languages within the group are actually related. Nearly all are isolates. We just lump them into a big group because they're close to each other and utilize clicks.

The result shows that 19.5% of the core vocabulary changes every 1,000 years.

No. Some lexicostatisticians had found that the Swadesh vocabulary changes at a rate of 19.5% every 1,000 years. Duursma cites Crowley et al. to "prove" his point, but that very same book demonstrates the problem with that figure just a few pages later. The number was arrived at by testing just 13 of the world's 7000 languages (11 of them were Indo-European). There is a serious problem of statistical significance and contamination due to relatedness and proximity. The simple fact is that the only thing we can definitively say about language change is that change is not constant and can accelerate or decelerate rapidly and unexpectedly.

If this is the same for all languages, it means that statistically all words in a language should be replaced within a period of about 10,000 years. 

This is actually close to the truth, but - funnily enough - it dismantles Duursma's whole point. If genetic links cannot be found past 10,000 years because of natural language evolution, then the fact that there are unrelated language families is not troublesome at all. It certainly is no evidence for the Tower of Babel. There are any number of more parsimonious explanations for the existence of 136 language families in the world. Sorry, I meant twenty.
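
For anyone who wants to check the arithmetic, here is a rough Python sketch. It assumes exactly what I just said we cannot assume, a constant 19.5% loss per 1,000 years. The function name, and the idea of squaring the figure for two daughter languages that change independently, are my own illustration and not anything taken from Duursma or Crowley.

```python
def retained_fraction(years, loss_per_millennium=0.195):
    """Share of a core word list still retained after `years`.

    Assumes a constant replacement rate compounded per 1,000 years,
    which is a simplification used only for illustration.
    """
    return (1.0 - loss_per_millennium) ** (years / 1000.0)


if __name__ == "__main__":
    for years in (1000, 4000, 7000, 10000):
        one_language = retained_fraction(years)
        # Assumption: two daughter languages lose words independently,
        # so the share they still have in common is the product.
        shared = one_language ** 2
        print(f"{years:>6} years: {one_language:.1%} retained, "
              f"{shared:.1%} shared by two daughters")
```

Under those assumptions a single language keeps roughly 11% of its core list after 10,000 years, and two daughters would share only about 1%, which is hard to tell apart from chance resemblance.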

Trask remains unsure as to how and when this change occurred.

Well, yeah. Trask wasn't an evolutionary biologist and never pretended to be. He was a very, very fine Vasconist - a linguist of the Basque language - but by no means a biologist. We need better sources than Trask to learn about the evolution of the vocal tract and speech.

Again, there is no evidence to back their view that speech evolved.

Demonstrating that speech evolved was never Trask's intention. Trask's book is an introduction to the way language works. It's meant to introduce the public to how societies and minds work with language. Duursma should not pick up books on Topic A (general linguistics) and blame them for not proving Topic B (the evolution of the vocal tract).

...scholars supporting monogenesis or the relatability of all languages run the risk of being branded Creationists and of therefore having their work disregarded by colleagues.

I completely agree. There is no evidence for a single ancestral tongue of all world languages and those that support such an idea are groundless.

It seems that there is little evidence to support the view that all languages evolved from one or more proto-languages.

A lack of evidence for one theory is not positive evidence for another.

The Babel account suggests that several languages came into existence on that day.

Well, unless only a few people were building the Tower in the Plain of Shinar, there should be several thousand languages created that day, at least. Right? One language for each builder? I guess Duursma could just say that God split them up into three or four language families and then they started fighting. There doesn't seem to be much Biblical evidence for that either. Oh, well, my point here isn't to debate Biblical interpretations, just the linguistic facts, so I'll move on. I just thought it was funny.

The unitary state of Indo-European languages ... [is dated at 3000 BCE].

NO. That is far too young a date. Proto-Indo-European, the ancestral tongue of all IE languages, was last spoken just north of the Caucasus somewhere around 5000 BCE at least. 3000 BCE is an extremely fringe opinion and not even close to representative of the majority viewpoint. I would say that a date of 3000 BCE is about as fringe among linguists as those who believe PIE dates to 9,000 BCE.

Wieland points out that ‘to have such close correlation’s still existing makes little sense if the migrations were as much as 11,000 years ago, as is commonly believed. From the biblical record, they would have been less than some 4,000 years ago’

Umm... okay? Well, you haven't demonstrated that a Tower of Babel event occurred 4,000 years ago, so all of this is evidence-less speculation. And don't tell me that evidence for the Tower of Babel is not the point of the article. You blamed Trask for not proving the evolution of the vocal tract when his intention was to show how people work with language. Regardless, I'm still waiting for this "evidence" I keep hearing about.

Crowley carries on to share how languages can change from sophisticated to simpler versions, and from simpler to more complex systems. He distinguishes between, ‘isolating’, ‘agglutinating’ and ‘inflecting’ languages and shows how languages change in circular patterns.

If languages evolve in a circular pattern then this implies they either don't lose complexity or they regain complexity when they come full circle. Never mind that none of this is true. Languages do not change in overall complexity; languages become "simpler" for the native speaker and listener as a method of reducing the effort to convey messages, but the languages gain complexity with the production of novel shortcuts that are invisible to the communicants. The point to take home here is that complexity is a constant over time.

Classical Greek was a highly inflected language; it used five cases, as well as Active, Middle and Passive voice. Koine Greek was almost reduced to four cases, and the Middle voice was used rather inconsistently. Modern Greek distinguishes only three cases, but many endings have disappeared. It is a good example of van der Tuuk’s Ruin, as it is slowly becoming an isolated language.

Wow. Duursma confuses the pathway from agglutination to isolation for simplification. This says nothing about complexity. Isolating languages are incredibly complex; it just so happens that they are less complex in terms of inflection. English, for example, reduced the number of cases over time but replaced this with a complex system of word order, idiomatic phrasing, light verbs, and novel case marking (like the fearsome triple genitive that most world languages cannot easily translate in a concise manner). Duursma has no idea what he's talking about.

However, this model cannot be used to explain the origin of highly sophisticated language systems like Sanskrit and Greek.

Yes, it can. The origin of both is Proto-Indo-European and we've known about it for 200 years. In fact, it was first hypothesized by the same Sir William Jones mentioned at the head of this article. Golly.

Language change, as Crowley’s model shows, would be unlikely to produce consistent endings for the whole of the Inflecting Language.

This does not make sense.

The fact remains that the Greek/Sanskrit parent was utterly consistent...

No, it wasn't. What are you talking about, Duursma? Ever hear of Early and Late PIE? Narrow PIE? The S-mobile? The isogloss mysteries and wave theory? The shift from animate/inanimate-neuter to masculine/feminine/neuter? This article is a bad joke that keeps getting worse.

If chance, then, did not make this Proto Language, where did it get its consistency from?

First of all, PIE is a reconstructed language. That means we can only reconstruct what is shared among its daughter languages. Part of what that means is that we invariably fail to capture the full nuance of PIE, including its inconsistencies. Regardless, we already know PIE was far from consistent. Why wasn't it consistent? Because it was a normal language just like Etruscan or Tagalog: constantly in flux. If it were consistent (which is a stupid idea to begin with), it would be a conlang.

It suggests a Designer.

It doesn't suggest anything other than it was a ~7000 year old language with a number of differing dialects.

In Babel one of the groups was given the sophisticated, and utterly consistent, Proto Indo-European language.

This implies that PIE was a first language at the Tower of Babel event. Does Duursma realize that there are fragments of Pre-Proto-Indo-European that can be reconstructed which point to an even earlier stage in the language's history?

Sadly, as people in a fallen world began to use this language, it slowly began to lose some its consistency, as grammatical mistakes became fashionable.

I already pointed out that PIE speakers frequently made mistakes, such as the S-mobile, which arose from a confusion of the inflectional ending *-(V/C)s with the start of the next word. Much like a napron was confused for an apron in English, PIE confused words like -os teuros "the bull" for -os steuros. Thus we get tauros in Greek but steer in English.

The facts we observe today are consistent with the Tower of Babel account in Genesis 11, but this does not prove the correctness of the account.

He talks a lot but says nothing.

Since the history of languages cannot be reconstructed beyond 10,000 years, evidence for (and against) alternative views is limited.

I agree. But this undermines the point of the entire article because nothing affirming of the Tower of Babel was presented. That is what you were trying to do when you titled your piece "The Tower of Babel account affirmed by linguistics," right Duursma?

...if we take an objective look at the facts at our disposal we cannot but draw the conclusion that the Bible account has far more going for it than the alternatives, for which there is little, if any, evidence.

You didn't discuss linguistic dating and the reason why is obvious: that would be evidence against a putative Tower of Babel. Some language families pre-date a possible Tower of Babel event. Afroasiatic is roughly 10,000 years old; PIE is 7000; Algic is roughly 8-9000.

We therefore wholeheartedly believe that the findings of historical and comparative linguistics have served indeed to affirm the Tower of Babel account recorded in Genesis 11, beyond reasonable doubt

Wow, that's hilarious. Duursma failed to provide any evidence in favor of Genesis 11. All he said was that there is no evidence against his theory. Beyond reasonable doubt? Let's hear some reasonable evidence first.

Believing this account, however, requires believing in God, and the denial of the evolution theory, which suggests that all animals, humans, and even human language, arose by chance. For many, this might prove too big a price to pay, despite the evidence.

What a capstone to a paper laden with errors and inconsistencies, hilarious mistakes and ridiculous assertions.

Friday, October 18, 2013

Evaluating the Difficulty of the State Department's Critical Languages

Ranking the "best" language to learn (usefulness vs. difficulty)
  1. Hindi: 6/10 difficulty; 44 weeks; 8/10 usefulness. The lingua franca of India and a globally important tongue. It's not the easiest language on here, but compared to Korean, it's a breeze.
  2. Mandarin Chinese: 8/10 difficulty; 88 weeks; 10/10 usefulness. What can I say? Mandarin is tomorrow's language superpower, second to English. The difficulty of the language is daunting, if overstated. There are certainly more difficult languages to learn, but the orthography is a nightmare for English speakers.
  3. Urdu: 6.5/10 difficulty; 44 weeks; 8/10 usefulness. Hindi and Urdu are both dialects of the same Hindustani language, but Urdu got the shaft because its writing system is more difficult to master than Hindi's.
  4. Persian (Farsi & Dari dialects): 5/10 difficulty; 44 weeks; 6/10 usefulness. Two dialects of the same Persian tongue. It will get you through Iran, parts of Afghanistan, and Tajikistan - not to mention you may find speaking communities in other parts of the Middle East. It's a good one to know, for sure.
  5. Arabic: 8/10 difficulty; 88 weeks; 8/10 usefulness. A very useful language but very difficult. If you have the motivation, I say go for it; but for those of you who are trying to pick up another language for the moolah, I would advise looking elsewhere.
  6. Korean: 9/10 difficulty; 88 weeks; 7/10 usefulness. As explained below, it is probably the most challenging language on the list. It is a useful tongue, but do the rewards justify the amount of time and effort required to master it?
  7. Pashto: 7/10 difficulty; 44 weeks; 3/10 usefulness. Pashto is for the Afghan/Pakistan enthusiasts and the language geeks. 
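
A quick sanity check on that ordering, as a Python sketch. The scoring rule here is my own assumption and not an official formula: usefulness minus difficulty, with ties going to the easier language. The numbers are copied from the list above, and the names `scores` and `net_score` are just illustrative. It happens to reproduce the same order.

```python
# (language, difficulty out of 10, usefulness out of 10), copied from the ranking above
scores = [
    ("Arabic", 8, 8),
    ("Hindi", 6, 8),
    ("Korean", 9, 7),
    ("Mandarin Chinese", 8, 10),
    ("Pashto", 7, 3),
    ("Persian", 5, 6),
    ("Urdu", 6.5, 8),
]


def net_score(entry):
    """Assumed rule: usefulness minus difficulty (higher is better)."""
    _, difficulty, usefulness = entry
    return usefulness - difficulty


# Sort by net score, best first; break ties in favor of lower difficulty.
ranked = sorted(scores, key=lambda entry: (-net_score(entry), entry[1]))

if __name__ == "__main__":
    for position, entry in enumerate(ranked, start=1):
        print(f"{position}. {entry[0]} (net {net_score(entry):+g})")
```

If you weight usefulness or difficulty differently, swap in your own `net_score` and the order will shift accordingly.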
For those interested in becoming a diplomat on behalf of the United States (a Foreign Service Officer), bonuses are granted to the applications of those who know foreign languages with political importance (generally meaning they are the official language of a state or region). Not all languages are evaluated equally. Knowledge of any language gives you a .17 point bonus - small but certainly an edge over other candidates - and the required speaking/reading score is 3/3 (non-native, fluency), which is difficult to attain but not impossible. Based on anecdotal evidence, it is believed that Spanish is the most common second language of FSO applicants.

Many applicants want to learn one of the critical languages, a short list of languages with significantly higher bonuses. Applicants are further encouraged by the laxer standard for receiving the bonus: the minimum score is only 2/2 (a professional but non-fluent ability). Unfortunately for many interested applicants without solid language knowledge, there is a good deal of myth and hogwash around each language learning experience: namely, myths concerning the difficulty of learning any of them as a native speaker of American English. This post is intended to provide a myth-free review of each of the languages for anyone interested.

Note #1: Learning any language is difficult. Just because Spanish or Frisian or Scots would be listed as "easy" does not mean learning them is an easy experience. It simply means that they are some of the easiest you could select, relative to other world languages.

Note #2: There is no such thing as an objectively "easy" language. Dispel that myth at once. Some languages are easier than others for an English-speaking adult. All languages are equally easy for a baby. For example, perhaps the most difficult languages for an English speaker, the polysynthetic Eskimo-Aleut languages of North America, are just as simple for a child to learn as English or Vietnamese or Afrikaans. This list is NOT an objective list of language difficulty and there is no such thing!

Note #3: The "Time" numeral is the number of weeks required, on average, to attain necessary proficiency. These averages are maintained by the State Department.

Note #4: Difficulty is, and always will be, a subjective thing. Some speakers will find themselves unusually adept at learning and employing languages using case declensions, others may find themselves better at tongues with enormous verbal complexity. The result is that a language rated 9/10 may be closer to a 4/10 for some. Take these with a grain of salt.

Note #5: In 2012, point bonuses for critical languages were amended. While non-critical languages continue to receive a .17 boost, I do not know what the point values have been set to. I will report the pre-2012 bonuses which should grant some degree of certainty.

Arabic (Afrasian - Semitic)

Pre-2012 point bonus: .5
Time: 88 weeks
Difficulty: 8/10
Usefulness: 8/10
An important problem with Arabic is the enormous dialect diversity within the language so that two speakers from opposite ends of the Arab world may find themselves unable to converse with ease. Arabic has a lot of political pull and can be great for a career outside of the Foreign Service. But Arabic is a tough language. Why?

  • Phonology: 5/10. On the one hand, its numbers of consonants and vowels are average, and the vocalic system is much simpler than English. On the other hand, there are a few tough consonants. Arabic distinguishes between velar, uvular, pharyngeal, and glottal plosives (and some dialect varieties have epiglottal plosives - cue jaw drop), which can make mastering the phonotactics of the language a surmountable challenge. The difficulty of Arabic's sounds has been greatly overrated in the past.
  • Vocabulary: 8/10. The triconsonantal root structure of Arabic is strange but by no means peculiar. A vowel ablaut exists in fragmented form in English (sing, sang, sung, song) and the consonant roots make learning new words easier than normal. As the language is not Indo-European, thus unrelated to English, learning the roots is going to be a challenge.
  • Grammar: 7/10. Literary Arabic boasts a small number of noun cases (3) and genders (2) and declines for three numbers, but spoken Arabic no longer utilizes case or the dual form. Verbs have a normal degree of conjugation compared to world languages (including 5 moods), but significantly higher than English.
  • Suprasegments: 3/10. Arabic is a mora language where the meaning of a word is determined by the length of a phoneme; this is not especially difficult for an English speaker. Stress exists but its placement is non-random and limited: not a problem.
  • Script: 9/10. This is nearly as tough as it gets. 

Mandarin Chinese (Sino-Tibetan - Sinitic)

Pre-2012 point bonus: .4
Time: 88 weeks
Difficulty: 8/10
Usefulness: 10/10
A terribly difficult orthography with a somewhat simple spoken form. The difficulty of Chinese is famous, if greatly exaggerated. Learning Chinese is a great skill outside of State. If you never get in but you learned the language, it was time well spent.
  • Phonology: 4/10. Strange to an English mouth, but entirely palatable. Remember that tone is a suprasegmental characteristic, so don't jump to conclusions just yet.
  • Vocabulary: 8/10. Non-Indo-European so don't expect to find cognates with English, except in loanwords. 
  • Grammar: 3/10. English and Chinese have something very special in common: they are both isolating languages relatively free of inflection. Because of that, English grammar maps quite well onto Chinese.
  • Suprasegments: 8/10. Tonality over a single word modifies the meaning and can make the difference between saying "cow" and "mother." If you do some travelling, you may find other Chinese speakers using different tones, but knowing Standard Chinese tones will get you anywhere you need to be.
  • Script: 10/10. This is as tough as it gets and Chinese is famous for it. Thousands of unique symbols requiring memorization. The aid of radicals, small marks within the symbols that hint at sound and meaning, is of use but will not save you.

Hindi & Urdu (Indo-European - Indo-Aryan)

Pre-2012 point bonus: .4
Time: 44 weeks
Difficulty: 6/10 for Hindi, 6.5/10 for Urdu
Usefulness: 8/10
Hindi is the standard Indian dialect of the Hindustani language while Urdu is the official Hindustani dialect of Pakistan. The difference between the two is primarily rooted in vocabulary and the script. Like Chinese, it has an enormous number of speakers. Lots of Indians know the tongue and non-Indians too. Related to English but distantly. Very distantly. Close to Bengali and Gujarati.
  • Phonology: 4/10. An average number of phonemes. There is a distinction between retroflex and dental plosives but a dedicated learner would find that more fun than difficult.
  • Vocabulary: 6/10. Indo-European roots but extremely divergent from English, thanks to 5000 years of separation. Colored by cultural stratification.
  • Grammar: 6/10. A case system that has been reduced from Proto-Indo-European but never easy for an English speaker. Three cases with two declension classes. Conjugation by gender, tense, number, and aspect. Split ergativity. 
  • Suprasegments: 2/10. Stress accents that can be predicted.
  • Script: 7/10 for Hindi; 9/10 for Urdu. Devanagari script makes learning Hindi difficult but fortunately the letters correspond fairly accurately to consonants and vowels. Urdu is written in the Persian alphabet, based around the Arabic script.

Korean (Isolate? - Koreanic)

Pre-2012 point bonus: .4
Time: 88 weeks
Difficulty: 9/10
Usefulness: 7/10
A major economy with a sizable number of speakers. Not to mention that North Korea overhead means there will always be a few security analyst positions available for someone with knowledge of Korean. 
  • Phonology: 7/10. The inventory is short and simple but Korean employs stiff voice, a narrowing of the glottal opening, and hollow voice, a distortion of the larynx's position and constriction of the glottis. I have been told this is very difficult for English speakers to master as they often think they are doing it correctly as they cannot hear their mistakes.
  • Vocabulary: 10/10. Unrelated to English and distinguishes between honorifics, speech level (where the status of who you speak to/of demands a particular set of vocabulary be used), and gender.
  • Grammar: 9/10. 7 cases but not defined by gender and optionally defined by number. Verbs are relatively complicated for an English speaker, able to tack on up to eight affixes simultaneously (!). While the nouns are fairly easy fare, the verbal system is a monstrosity.
  • Suprasegments: 0/10. No significant stress system, no tonality, no pitch accent. There are several non-standard pitch accents found in dialects outside of the capital. 
  • Script: 7/10. A different orthography but one that makes sense.

Pashto (Indo-European - Indo-Iranian)

Pre-2012 point bonus: .4
Time: 44 weeks
Difficulty: 7/10
Usefulness: 3/10
The national language of Afghanistan and parts of Pakistan. Strongly latched on to the identity of the Pashtun tribes. Expect to be working in Central Asia.
  • Phonology: 5/10. Nothing too crazy except for a retroflex lateral flap (a funky 'l' sound) that can be mastered with practice.
  • Vocabulary: 6/10. Like Hindi and Urdu, it is Indo-European but very distant from English.
  • Grammar: 7/10. Four cases defined by gender (2) and number (2). Split ergativity. Moderate degree of complexity to the conjugation of its verbs.
  • Suprasegments: 2/10. Some argue there is a free pitch to add emphasis to a word. Nothing that can't be learned.
  • Script: 9/10. Pashto variation on the Persian alphabet.

Persian - Dari & Farsi dialects (Indo-European - Indo-Iranian)

Pre-2012 point bonus: .4
Time: 44 weeks
Difficulty: 5/10
Usefulness: 6/10
By learning Persian you could pick up either dialect and test in both. I'm not sure if you can do that. I'm pretty sure they only give you a bonus one time. In addition to having some currency in Eastern Iran, Dari Persian (which is not the Dari language of central Iran) is a co-official language of Afghanistan. Farsi Persian is the official language of Iran, and the language has many speakers in Central Asia and parts of the Middle East.
  • Phonology: 2/10. 22 consonants, 6 vowels. Only strange feature for an English speaker is an allophonic [g ɢ] (think a 'g' further back in the throat) which is simple enough.
  • Vocabulary: 6/10. Like Hindustani and Pashto, relatedness to English is very remote.
  • Grammar: 5/10. No grammatical gender. 3 cases marked by adpositions. The role of cases has been greatly reduced since Old Persian. Verbs are conjugated inflectionally or aspectually with light verbs. Present tense verbs are highly irregular, but overall there are few tenses to master. 
  • Suprasegments: 2/10. Stress accents that can be predicted.
  • Script: 9/10. The Persian alphabet is derived from the Arabic.

Saturday, October 5, 2013

An Etymological Map of the District

The greater DC area needed an etymological map.

Okay, so it probably didn't need an etymological map, but it sure could use one. I took 30 minutes out of my morning today to have a bit of fun: mapping the greater Washington, DC area not with its current names, but names built around their historical meanings. For example takoma, as part of Takoma Park, derives from a Native American word meaning "snow-capped mountain." Some of the results were surprising (check out Suitland and Carrollton).

Actual names of cities - Etymological names of cities:

Alexandria – Defender of Men. From Greek alexein "to defend" + -andr- "man" (related to andros and anthro-).

Anacostia – Village Trading Center. From Nacotchatank Algonquin anacost- "trading village."

Arlington – Hygered's Farm. Named after Arlington, England. Anglo-Saxon records from the 800s CE list the name as Hygered-ing-tun. The suffix -ing- is a possessive marker analogous to 's in Modern English. -tun is a suffix for town but in the 800s CE it meant a stately house with farmland (Modern English -ton).

Bethesda – House of Mercy. From Aramaic beth "house" (whence the second morpheme of the word alphabet) + hesda "mercy."

Bladensburg – Sword's Fort. From a Germanic source, possibly Anglo-Saxon. Bladen "sword" or "knife" (contrast Modern English blade) + -s a possessive genitive marker (Modern English 's) + -burg "fort" (though today it means something more like city or borough).

Glenarden – Great Forest Narrow Valley. From English glen "narrow valley" + Latin arden "great forest."

Greenbelt – Rural Land Outside the City. English analogy: a belt of green land that wraps around a city. A bit dated, as nearly all cities are surrounded by suburbs today.

Holmes Run – Holly Tree Run. Holme and holmes have many meanings. The one I figured was most likely is a common Middle English name meaning a place of a holly tree. The last name of Sherlock Holmes, for instance, meant "man of the holly tree."

Hyattsville – Village of the Man of High Gate. From Middle English hyatt a dialect shortening of high + gate with a 's genitive and -ville "village."

Lake Barcroft – Farm by the River Bank. From Scotch-Irish English bar- "river bank" + croft "farm."

Langley – Long Meadow. From Old English lang "long" + -lea "meadow" or "woods clearing" (contrast Ashley "meadow between the ash trees").

Marlow – Marsh Hill. From English mar- "marsh" + low "hill."

McLean – Celtic shorthand for (Saint) John's Servant

Mount Rainier – General's Adviser. A rarely employed term. A rainier was once a common position to Frankish and German armies.

New Carrollton – Town of the Slaughter Champion. From older Irish carroll "slaughter champion" + -ton "town."

Potomac River – River of Swans. Disputed etymology, so I chose the prettier one. It could also be from Algonquin patowmack "something brought," signifying a trading post.

Pimmit Hills – Unknown

Suitland – Senator Samuel Taylor Suit's Land

Takoma Park – Snow-covered mountain. From Lushootseed [təqʷúʔbəʔ] "mother of the waters."

Washington – Wassa's Estate. Contrast Arlington. From Old English Wassa personal name + -ing- possessive suffix + -tun "estate," "farm house."

Thursday, October 3, 2013

Fantasy Tropes

While I'm waiting for the results of a small survey on the accents and vocabulary of middle-aged and elderly Iowans, I think we can take a moment's respite to talk about the accents used in the fantasy genre.

It's long been noticed that when directors want to demonstrate the "foreign-ness" of a character, they utilize British accents. TV Tropes calls this The Queen's Latin phenomenon. They are right on the money. British accents are an over-used technique. Is the guy from Rome circa 300 CE? Well, he's speaking in Received Pronunciation now. Sometimes directors go so far as to have American and Australian actors adopt British accents, rather than cast a British actor.

But the stereotypes go deeper than just British accents in a period film. The fantasy genre is one of the worst purveyors of language stereotyping. Let's get frank, here:

Strong, brutish, and well-intentioned but not particularly intelligent
Result: Scottish accents

Cultured, sophisticated, pompous, usually good but sometimes evil or mischievous
Result: Highfalutin English accents

Human (heroes)
Balanced, smart, lovable, closest to a perfect character
Result: Common English accents, General American

Human (lower class)
Unafraid to get their hands dirty in morally ambiguous situations but not necessarily evil
Result: Cockney, Irish

Human (antagonists)
Super powerful and super evil and super smart. They stand for chaos, destruction, deception, powerlust, and selfishness.
Result: General American, Common English accents

Hobbits and Simpleton Races
Generally good, down-home folk.
Result: Common English accents

Humans (assassins and mercenaries)
Strange and mysterious. Usually from a foreign land. They may speak slowly but their wits are sharp and their skills unmatched by everyone but the heroes and the main antagonist.
Result: Foreign accent, usually Mediterranean or Middle Eastern

Trolls and Ogres
Stupid, evil, easily tricked and hateful
Result: Cockney, Scottish

So what we learn from today's venture is that American accents are generally ignored but Australian, New Zealand, and Canadian accents really get the boot.

Tuesday, October 1, 2013

Regional Accent Test

Review the questions, then please begin recording. You can record your answers at Vocaroo. Post a link to your Vocaroo file in the comment section and I'll reply with your "score." Remember, there are no right or wrong answers; your answers are only indications of regional influences upon your idiolect.

First name:
Did you move anywhere before age 18? If so, where:
[OPTIONAL] Area of college(s) attended, if attended:
[OPTIONAL] Places you lived after graduation:

Multiple choice.

How do you address a group of people? You may pick multiple.

  1. You
  2. You guys
  3. You all
  4. Y'all
  5. Yous
  6. Yinz
  7. Other: [please list]

Paul says, “I'm going to the store. Do you want to come with?” Does “do you want to come with” sound grammatical or not? Would you ask that question in the same wording?

  1. Yes
  2. No
  3. Don't know

Yolanda asks, "I called you yesterday. Why you ain't call?" Does "why you ain't call" sound grammatical or not? Would you ask that question in the same wording?

  1. Yes
  2. No
  3. Don't know

What do you call a free-standing municipal water dispenser?

  1. Water fountain
  2. Fountain
  3. Drinking fountain
  4. Bubbler
  5. Other: [please list]

What do you call your casual shoes?

  1. Sneakers
  2. Tennis shoes
  3. Gym shoes
  4. Other: [please list]

In your kitchen, what does water come out of?

  1. Spigot
  2. Faucet
  3. Tap
  4. Other: [please list]

What do you call a generic hard candy on a short, white stick?

  1. Sucker
  2. Lollipop

Short answer.

What do you call rain while the sun is shining? If you do not have a word for it, say you do not.

What do you call the largest meal of the day?

What do you call your third meal of the day?

What does a farmer milk the cow's milk into?

When people say “I'm going to the city.” What city jumps to mind?

What do you call a flavored carbonated beverage?

What's a poke of potato chips? If you don't know, say "Don't know."

Say the following at a steady, moderate speed.


Read the following sentences aloud in a natural speaking rhythm. Try not to over-enunciate and try not to speak with unusual clarity. Speak naturally as if you were talking with a friend at a restaurant.

“I had to run to catch the bus and when I got on I realized I didn't know the route it was taking.”

“My cat has several kittens to the litter. She loves the nook by the pots and pans.”

"I think the tourists will go for an orange or apple during their tour."

"The aliens understood why humans love juice but could not understand the role of a father and mother."

"You can buy mirrors from the shop on 33rd Street if you hurry."

What is your American accent?

Traditionally, linguists would say that most Americans speak with a General American accent. What is meant by General American is a very particular dialect: one of the old, boundary-less dialects of the English language that Americans picked up in order to be better understood. Most Americans describe the accent as “plain,” “boring,” and “featureless.” It's very common for General American speakers to say “they don't have an accent” (this is a myth of language peculiar to American English speakers as everyone has an accent, but that's a tale for a different time).

Calling General American a single, concise accent may not be such a good idea. Which region of America speaks the real General American? Instead, General American may be more appropriately described as a collection of habits and traits of speaking that people choose to adopt on top of their regional dialect. Put more simply, people with regional dialects often speak with “flatter” accents but never truly lose their regionalisms.

In Massachusetts, for instance, despite the fact that many Southies of Boston now speak in an accent closer to something outside of New England, they still retain their regional words like the adverbial use of wicked (as in, “that movie was wicked awesome”), casual r-dropping (think “pahk the car in hahvahd yahd,” which is formally called non-rhoticism), and a sporadic intrusive r ("sodir" not "soda").

So what does that mean for you and me? Well, many people incorrectly assume they speak a “pure General American.” This is not really true. First, as outlined above, there is no ironclad definition of General American, as every interpretation of General American is subject to regional idiosyncrasies. Second, contact through travel and pop culture can color your speech. Third, dialects change over time, even within a lifetime. The third point is often a bit surprising, but language change is happening all the time, everywhere. No exceptions. If you travel to Ohio, for example, you will see that the English language is already in flux: speakers under 30 years of age will pronounce words like strength as shtrength. Ohio's new pronunciation, in which [s] becomes 'sh' before [t], is a common kind of language change that linguists call lenition or consonant weakening. It may interest you that this same thing is happening in parts of England.
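
For the programmers in the audience, the Ohio shift can be written down as a one-line rewrite rule. What follows is only a rough sketch in Python: the function name retract_s is my own invention, and it works on ordinary spelling rather than on a real phonetic transcription, so treat it as an illustration of the rule and not as a model of actual speech.

```python
import re


def retract_s(word: str) -> str:
    """Rewrite a spelled 's' as 'sh' whenever it comes directly before 't'.

    Assumption: this works on plain spelling, so it is only a stand-in
    for the real sound change, which applies to the sound [s], not the letter.
    """
    # (?=t) is a lookahead: the 't' must follow, but it is left untouched.
    return re.sub(r"s(?=t)", "sh", word)


print(retract_s("strength"))  # shtrength
```

Words without an s directly before a t, such as sea, pass through unchanged.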

I thought it would be fun to play an accent game of sorts. Granted, it won't be as exciting as a real game, but it may be fun to hear your regionalisms. Because most readers will have traveled in their lives, I am guessing none of you will have a particularly distinct accent. My hypothesis is that most of you will have General American accents with various habits and features picked up from the places you moved to and the people you have interacted with.

Below will be a series of questions to answer and things to speak. To complete this test, you will need to record yourself saying the answers. I believe Vocaroo allows you to record and upload audio messages. You can reply with a link to the audio message for your answers. Because I might get the Vocaroo links mixed up, I will ask you to state your name at the beginning, as well as some brief biographical information.

The overall goal is that I'll tell you what regional accent you have, as well as point out any inconsistent or unusual features to your speech. 

I have arranged a fun test for you. There are many tests online, including some very good ones, but each has its particular problems. I have designed this test for American English speakers (English speakers from the United States). So my apologies to the rest of the Anglophonic world.

Thursday, September 26, 2013

Language isolates in focus: P'urhépecha

Language isolates are languages without any other related languages on earth. Roughly a quarter of the world's language families are isolates. Probably the most famous language isolate is Basque, spoken in Spain and France,* but there are over a hundred other language isolates in the world. Today we take a look at one of them, P'urhépecha, a language spoken in Michoacan, Mexico.

Mesoamerican languages tend to share similar features, forming a geographic clade where the languages tended to rub off on each other. The result is that the languages in Central America, even when genetically unrelated, resemble each other. We call this location of mutual feedback the Mesoamerican Linguistic Area (MLA).

Traditional folk: A P'urhépechan musician plays the Pirekua, a folk style of music, on the violin.
Image credit: UNESCO

P'urhépecha is strange because it is centered in the Mesoamerican Linguistic Area yet it clearly shares none of the features and is seemingly immune from outside influence. The language did not recently migrate to the region, so we know there was plenty of time for P'urhépecha to pick up the mutual features of the MLA.

Like the languages of northern Canada and Alaska, P'urhépecha is polysynthetic (I wrote more about what polysynthesis is, and how the northern languages use polysynthesis, here), yet unlike those languages, P'urhépecha cannot compound nouns. P'urhépecha involves double marking like Spanish. Most languages use cases to modify the meaning of a noun (the 's in Paul's house, for example, is a possessive genitive case marker) and many languages use adpositions to do that task (such as prepositions in English), but P'urhépecha uses both case markings and postpositions simultaneously. The verbs of P'urhépecha can be suffixed according to shape, position, or body part.

The language is spoken by roughly 250,000 people in Mexico, so we are fortunate that this is one isolate not in immediate danger of extinction.


* = Korean is often classified as a language isolate, easily making it the most famous isolate in the world, but its place is too controversial to casually classify in a post. First, some linguists include it in the Altaic language family and, second, some consider the Jeju dialect to be divergent enough to be a separate language, making two Korean languages under a Koreanic language family.

Monday, September 23, 2013


Let's keep tonight's update brief and light-hearted. 

Deep in the jungles of South America is the now-famous language of Pirahã. The Pirahã people and their language have been championed this last decade by Daniel Everett. The language is famous for having the smallest inventory of sounds in the world, just 11 phonemes, tying Rotokas of Papua New Guinea.

Leisure in eleven sounds or less: Daniel Everett and a Pirahã man enjoying a dip.
Image credit: The New Yorker.

At the other extreme is !Kung, spoken in the deserts of Namibia, with a whopping 141 phonemes. No wonder the language gets an exclamation point in its name! (Actually, the exclamation point stands for a click phoneme resembling a cork being pulled from a bottle). For comparative purposes, English dialects range from 38 to 44 phonemes, which is above average but by no means extraordinary.

Labor in one hundred and forty-one sounds: 18,000 people speak !Kung today but their way of life
is endangered due to pollution of their water sources.
Image credit: Documentary Educational Resources.

From Old Chinese to Modern

It's no secret that tonality in Mandarin is a fairly recent phenomenon. Old Chinese, the ancestor to Modern Mandarin, had no tones. Instead, suffixes at the ends of words were reduced into rising and falling tones, and then diversified into several different kinds.

This tends to surprise most, as people tend to assume that tonality is a very complex feature - and that complex features are probably older, right? The reverse is true. The creation of tone changes was a labor saving device that made clear communication possible in less time.

Excitingly, a linguist of Mandarin has taken a sentence and mapped its evolution in pronunciation, starting with Old Chinese of 1200 BCE and progressing step-wise with each sound change till the modern era. Take a listen:

Thursday, September 19, 2013

Who wants some PIE?

What did Proto-Indo-European sound like? It's a fascinating question. The reconstructed language is one of the most important languages in human pre-history - yet we are able to hear what it sounded like with some accuracy roughly 7000 years later! Historical linguistics enables us to hear with our ears the world of yesteryear.

Happily, linguist Andrew Byrd has made several audio readings of fables in Proto-Indo-European (PIE). PIE is probably the most fully reconstructed PL2 today (or PL3 if you get really pedantic, since Latin < Proto-Italic < Proto-Italo-Faliscan < PIE). Anyway, enough of my yakking.

Wednesday, September 18, 2013

A List of the Ancient Languages of the British Isles

We begin today's post not in the tomes of linguistics, but in the history of economics. We take a step back to over 150 years ago when, in 1846, the British parliament repealed the corn laws. The corn laws established steep tariffs on corn imports from cheaper foreign sources, and were significant barriers to trade. British citizens had to purchase corn from more expensive domestic sources. But why would the crown impose such a heavy burden upon its own subjects?

In truth, the corn laws were the product of the economics of the Late Middle Ages, when wealth was conceived as a limited thing. If you get a dollar that means your enemy did not get that dollar. The theory is fundamentally flawed. Why define wealth as a limited quantity and why define it solely as money/gold? Wealth can be dollars, sailboats, and sheep, and just because I bake 10 loaves of bread does not mean my enemy magically does not get 10 loaves. Fabrication of goods creates an absolutely greater quantity of wealth in the world where everyone wins.

The repeal of the corn laws signalled the end of the antiquated philosophy of wealth in Britain and the start of economic liberalism. Economic liberalism viewed the market as driven by supply and demand, and looked favorably upon trade for goods and services. Generally speaking, barriers like taxes on trading goods and services harm a nation, not help it. When you force Britons to spend more on corn in order to keep your money within the British empire, you are wasting productivity - which ultimately costs you more wealth than you saved. When you calculate wealth as more than just money, it's clear that the corn laws forced Britons to spend time growing corn when they could have been making products in which they had a comparative advantage over other nations.

The repeal of the corn laws also began a new era of immigration for Britain. Limiting the movement of incoming and outgoing workers tended to impede the production of goods. With the rise of immigration and the movement of laborers, however, we see the end of British languages and dialects. Many regional speeches of the British Isles were replaced with standardized speech. We also see the number of Celtic-language speakers drop like a brick in the 19th century.

Today's post addresses the languages of Britain before the 19th century (meaning we include Ireland). There were many of them, but I have never seen a full list in a single place. This list may not be exhaustive. There is no single source for all the historical languages in one place, and compiling the first of such lists risks incompletion. But without further ado, let's begin in chronological order:

Pre-Irish: The language of Ireland before the Celtic invasion. We have fragments of their forgotten tongue in the loanwords and toponyms of Irish (Gaeilge).

The Insular Celtic Languages: Celtic tribes moved into the British Isles some time after 2000 BCE. Their languages became
  • Common Brittonic: An ancient form of Celtic, it was spoken in England and Wales during the Roman Empire. As all languages will do, given time and geographic separation, Common Brittonic split up into Welsh, Cumbric, and possibly Pictish. It is better to think of Common Brittonic as an ancestral version of those languages, not as completely distinct, much like Latin is the ancestral version of the Spanish, Italian, and French languages.
    • Welsh: Spoken in Wales. Very much alive but not thriving.
    • Cumbric: Spoken in southern Scotland and Cumberland and related to Welsh. Extinct by the 1100s. 
    • Pictish (?): Controversial status as a Celtic language. The Picts were one of the chief reasons for Hadrian's Wall. Pictish was probably Celtic, but sparse records make the job difficult. Spoken mostly in Scotland and possibly northernmost England. It would have been replaced by English, Scots, and Scottish.
    • Kaale: Spoken in Wales by the local Gypsies, it was a divergent form of Welsh with heavy influences from Romani, but has since gone extinct.
  • Irish: The language of Ireland. Ancient Irishmen also settled Scotland at an early date and their speech became...
  • Scottish Gaelic: Spoken in Scotland. Scottish and Irish are closely related to each other, but distantly related to Welsh.
  • Cornish: Spoken in Cornwall, England, by about 600 - 3000 speakers.
  • Manx: Spoken in the Isle of Man. A close cousin of Irish and Scottish. Near-dead by the 20th century, there are about 100 native speakers left.
  • Shelta: Spoken in Ireland by roughly 86,000. Its foundation is the Irish language with heavy influence from an unknown source.
  • Beurla-reagaird: Like Kaale, it is a gypsy language. Its foundation is Irish with heavy influence from Romani. Extinct or near-extinct.
The West Germanic Languages: Germanic tribes invaded the British Isles beginning in 450 CE and immediately began displacing and replacing the local Celtic languages. At this juncture, Pre-Irish had been fully eradicated by Irish. The largest of the tribes to arrive were the Jutes, Angles, and Saxons. They spoke dialectal forms of a very late West Germanic tongue we sometimes call Anglo-Frisian. The Anglo-Frisian dialect in Britain evolved over time. It became
  • English: Need one say more?
  • Yola: Spoken in Ireland until the 19th century. Now extinct.
  • Fingalian: English and Fingalian diverged at a late date sometime after the Norman invasion, thus it was closely related. Spoken in Ireland. Extinct by the 19th century.
  • Scots: About 1.6 million speakers, including 100,000 native speakers. Spoken mostly in the lowlands of Scotland. 
Norn: After the Angles, Jutes, and Saxons came and formed the Anglo-Saxons, the Vikings began to pillage and settle Scotland, Ireland, and northern England. Unlike the Anglo-Saxons, who spoke a West Germanic language, the Vikings spoke a North Germanic language (both North and West Germanic had split thousands of years prior). The Scottish Isles were so thoroughly conquered that their tongues were replaced by a Viking language, which became Norn. It was closest to Norwegian and Icelandic. Norn was replaced by Scots, Scottish, and English after the 15th century. Extinct.

The Romance Languages: Following the Norman invasion, Latin-based languages began to heavily influence England through the court. The royal elite spoke what the King and Queen spoke, which were, mainly, Romance languages until the 15th century. They were
  • Anglo-Norman: When the Normans conquered England, the Normans spoke a variety of Norman that was heavily influenced by French dialects like Picardy. Norman itself is closely related to French. The result was a unique amalgam.
  • Norman: Within the United Kingdom it survives on Alderney island, part of Guernsey, and on Sark.
  • Sercquiais: Thanks to tax loopholes, the English wealthy have all but demolished this language through migration and displacement. It is spoken on the isle of Sark by about 15 people. 
  • French: Gradually replaced Anglo-Norman due to France's shift of power to Paris, which spoke Parisian French.
  • Occitan: Briefly spoken in England thanks to the influence of Edward the Black Prince, Eleanor of Aquitaine, and others. It is distantly related to French and Anglo-Norman and is closer in heritage to Catalan. It is now an endangered language, spoken in France and Spain.
  • Guernesiais: A variant of the Norman language that survived on the island of Guernsey. About 1,000 speakers left and almost all are over the age of 65. Destined to die within 40 years, I reckon, unless there is a fundamental shift in the culture.
  • Alderney: Spoken in Alderney, part of the Guernsey islands. Last speaker died in 1960. 
Yiddish: A West Germanic language with heavy influence from Hebrew. It arrived significantly later than the Anglo-Saxon invasion, with the influx of Jewish immigrants. The number of speakers is in significant decline with the return of Hebrew and the influence of national languages.

Anglo-Romani: The language of the Gypsies, who left India some time after 1000 CE and arrived in Western Europe sometime in the 16th century, was Romani. The version of Romani spoken in England became Anglo-Romani. Number of speakers is unknown.

Thus concludes the list. We began at a vague date, some time after 2000 BCE with Pre-Irish, and ended with the introduction of Anglo-Romani in the 1500s CE. I count 24 languages of the United Kingdom and Ireland, not counting Common Brittonic. There are 11 or 12 languages no longer spoken in the British Isles, depending on whether a speaker of Beurla-reagaird is discovered; of them, six or seven are extinct, again with the possible exception of Beurla-reagaird.

This list also underlines the need for language preservation efforts and a change in language attitudes. Guernesiais, for example, has a robust number of speakers, but only 0.1% of the Guernsey young know the language. Thus, it is nearly moribund, destined to die with the elderly. According to the United Nations' World's Languages in Danger project, many of the above languages are threatened by extinction in the next 100 years.

Not in Danger: English, French, Occitan, Norman
Vulnerable: Scottish Gaelic, Scots, Welsh
Endangered: Romani, Irish, Yiddish
Endangered, Severely: Guernesiais, Guernsey Norman
Endangered, Critically: Manx
Unclassified: Cornish, Sercquiais, Shelta

Thank you for the adventure. I had a lot of fun compiling the list.

Thursday, September 12, 2013

How American became the official language of Illinois

Washington J. McCormick, a former lawyer with degrees from Harvard and Columbia, was elected to the United States Congress in 1921 to represent the great state of Montana. But when McCormick failed to win re-election in 1922, he decided to spend his lame duck months proposing unpopular legislation, and as a Republican with a strong independent streak, Representative McCormick proposed that these United States of America adopt American as its official language. As quoted in The Nation, he argued:
Drop your swagger-sticks: Representative
W. J. McCormick of Montana's 1st District is in town.
Image credit: Wikipedia and Library of Congress
I might say I would supplement the political emancipation of '76 by the mental emancipation of '23. America has lost much in literature by not thinking its own thoughts and speaking them boldly in a language unadorned with gold braid. It was only when Cooper, Irving, Mark Twain, Whitman, and O. Henry dropped the Order of the Garter and began to write American that their wings of immortality sprouted. Had Noah Webster, instead of styling his monumental work the "American Dictionary of the English Language," written a "Dictionary of the American Language," he would have become a founder instead of a compiler. Let our writers drop their top-coats, spats, and swagger-sticks, and assume occasionally their buckskin, moccasins, and tomahawks.
The bill failed.

Nevertheless, McCormick's fiery passion stoked some independence coals in Illinois' congressmen. By the end of 1923, Illinois had hastily adopted "American" as the official state language. Thus it was secured that the official language of Illinois is American, and not English or any other foreign tongue.

In many blue states, it is contentious to even adopt an official language. Many liberals find it a bit hypocritical, very ignorant, and terribly prideful, considering none of us learn the real American languages like Oneida, Algonquin, or Navajo. But Illinois found a way to look past that.

Sadly, the bill was amended in 1969 and the nomenclature was changed to "English." Kind of funny and sad, in a weird sort of way, that Illinois lost a very... how shall I put this... unique legal perspective on language.

Thank you, Representative McCormick.

Read the 1923 Draft of the bill at the Language Policy archive!

Read further discussion of Illinois' policy in the PBS transcript of Do You Speak American?

Wednesday, September 11, 2013

A linguistic survey of India

Thanks to Language Hat for pointing this out. The New Linguistic Survey of India, a massive six-year, $100 million project, has finally concluded. The task documented 780 languages and will be published in 50 volumes. Whoa. Click the link to read more.

Language and Economics: A Review of Chen's Tense and Savings Theory

Part I. What this is all about.

Several months ago, an economist and professor at UCLA, M. Keith Chen, published a paper in the American Economic Review, which argued that countries with languages that are future tense dependent (where they necessarily must speak about the future using tense) save less money. Or in simpler words, if you speak a language that talks about the future and the present in the same tense, you're more likely to save your money. English is a language that tends to invoke a future tense. You can say I will go to the bank later but not I go to the bank later.

Here is an article in The Atlantic about this.

Here is a video for non-linguists and non-economists:

Here is Chen's article (pdf warning).

Part II. Reactions.

Chen's study involved 39 languages, only nine of which were categorized as non-future tense dependent. Many of the nine have interconnected economies. More frustratingly, Chen categorizes languages based on popular consensus, not linguistic consensus. Danish and Swedish are dialects of an East Scandinavian language clade. Basque was categorized as one language when there are definitely two language divisions between east and west, and in reality there are probably about 20 Basque languages. The idea that the Basque language consists of simply different dialects is not a reflection of the state of Basque today; two speakers of very divergent Basque dialects (read: languages) will have a much harder time communicating than a Danish speaker and a Swedish speaker will.

But back to the problem of interconnected economies. Danish, Swedish, and Finnish utilize the Scandinavian economic and fiscal models while Danish and Swedish are dialects of the same language. This artificially inflates the N-sample and makes the report look statistically more significant than it should be.

Chen's characterization of language was awfully simplistic. English can say I am going to the store (later), in which case the sentence is free from a present-future distinction. Chen's paper acknowledges these nuances in language but makes no attempt to distinguish them. Such a feat would be Herculean.

Some have pointed to Chen's paper and said, "Well this was true even within countries! A French speaker in Switzerland saves less than a Swiss German." While Chen did find that was true, his findings were not statistically significant, so the point is moot.

This was a surface examination of Chen's piece based on a single read-through. Subsequent reads might uncover more problems or they may justify Chen.

Part III. Last comments.

I was fairly disappointed with Jason Merchant's comment on Language Log. Merchant says,
Because Chen did not control for cultural factors though, it remains at best a supposition that language, and not the cultures of the people using them, are responsible for the savings and other behavioral differences found.
Wow. Did he read Chen's article? Control variables included legal inheritance and Family Values survey findings. You may say that the control variables were poorly chosen (what does legal origin have to do with the savings rate?) but you can't say that he wasn't controlling for cultural factors.

Wednesday, September 4, 2013

The meaning of names from mythology

I thought a fun, easy read for today was in order. Myths often contain hidden metaphors in the names of characters. However, as names tend to change more slowly than the rest of language, the meaning of a name is forgotten: lying dormant in wait of a linguist's re-discovery. Today I'm going to list the meanings of many names of people and beasts from popular legends. This list was compiled with help from the Online Etymology Dictionary.

Hercules: Meant "Hera's glory." Kind of odd since in the stories Hera is the enemy of Hercules.

Eve: "A living being." Douglas Harper quotes the linguist Robert Alter as suggesting the name Eve may have been an ancient play on words, as the name "sounds suspiciously like the Aramaic word for serpent."

Icarus & Daedalus: The meaning of Icarus' name is lost but Daedalus meant "the cunning worker." Daedalus built the horrifying labyrinth of King Minos that housed the minotaur. He tried to escape imprisonment with his son, Icarus, by creating artificial wings. In their flight out of the prison, Icarus' pride led him to fly too high. The sun's heat melted the wings' glue and he plummeted to his death.

Moses: Unknown but probably a Hebraization of the Egyptian mes "child." The explanation that the name means "drawn from water" is not tenable; the semantic confusion probably reflects an ancient similarity in sound between mes and Hebrew mashah "he drew out."

Mercury: "Merchandise." Mercury was originally the god of tradesmen.

Beowulf: "Bear." Literally "bee-wolf."

Kriss Kringle: Originally the name referred to baby Jesus, not Santa Claus. Literally "Christ child."

Lazarus: "God has helped." A very metaphorical name indeed.

Mimir: The name of the Norse giant Mimir comes from a Germanic element meaning "memory." Mimir guarded the Well of Wisdom.

Friday, August 23, 2013

Update, 23 August

I apologize for my delinquency. I am currently writing an FAQ for historical linguistics over at Reddit which may answer a number of your common language questions - aside from being interesting in itself. Between that and the daily grind, I haven't had much of an opportunity to post here. Here are some facts about demonyms, names for people groups:
  • The word Jew is a corruption of a Greek word for Judean (person from Judah).
  • The original English word for any Germanic person (including Anglo-Saxon people) was dutch. Today it only means a particular Germanic culture in the Netherlands but its original sense is preserved in Pennsylvania Dutch and in the loanword Deutschland (Deutsch obviously being a cognate with dutch).
  • Spanish gringo is a distortion of Spanish Griego "Greek," which several hundred years ago was a Spanish epithet for any foreigner.
  • Welsh originally meant "foreigner" or "stranger" (a word originally for any Celtic speaker). It gradually acquired a sense of inferiority which is preserved in words like Welsh rabbit, also called Welsh rarebit (a lower-class dish of melted cheese on toast that contains no rabbit at all).
  • A person from the Isle of Man is called a Manx.
  • A Hittite to a linguist was a person who lived in Anatolia and spoke Nešili (but we usually just call the language Hittite for convention). But surprisingly, no one knows if the Hittites of Anatolia were the same Hittites we find in the Bible. When the Bible lists the names of Hittites, like Uriah, husband of Bathsheba, the names do not read as Hebrew or Nešili.

Monday, August 19, 2013

Four Features We Don't Have in English

This video is pretty educational. It points out four aspects of world languages that tend to blow the minds of English speakers. Two of my favorites:
  • He points out that English is necessarily time-sensitive: the time at which something happens must be marked in every sentence.
  • Our directions are relative to the speaker rather than absolute (left/right vs. north/south).
He does get a few things wrong (for instance, there are clusive languages in Europe, like Chechen), but the problems are minor quibbles. Enjoy.

Thursday, August 15, 2013

Myths of language 2: Shakespeare's English

Thanks to his theater friends and the publication of a First Folio, William Shakespeare's wit was preserved for future generations and for the benefit of world literature. But how did Shakespeare speak? Close your eyes and imagine what a Shakespeare production sounds like. A sing-song quality? A cultivated accent? The thin refined speech of a London aristocrat in a Disney movie?

Nay. It may surprise you that the accent of Shakespeare was rougher and thicker, imbued with the chopping-block phonology of Early Modern English. Upon hearing Early Modern English accents, American ears may mistake them for a Northern English or Scottish accent, whereas English ears may pick up American speech patterns. It is a myth, perpetuated by the theater and literature communities, that Shakespeare spoke in what is today considered a modern, cultured British accent.

The myth is somewhat absurd if you think about the simple facts of language evolution. All accents in the English language descend from older English accents that had diverged over time. When settlers came to the colonies in the future United States, Canada, Australia, etc..., they took with them their own accents. Some of those accents did not survive back home in England; others did not survive in the colonies. After 400 years of divergence, it is more suitable to think of most accents as various children of Shakespeare's speech, each with their own peculiar innovations and conservative features.  

Here is the movie Shakespeare in Love, which uses Received Pronunciation (RP) and Modern London accents. The film never attempts the actual speech of the year 1600; Shakespeare would never have spoken like this. This is the common accent of Shakespearean theater today and likely the only theatrical Shakespearean accent you've come across (unless you're dealing with a play like Macbeth, which would be performed in a Scottish accent anyway).

And here is a short video on producing theater in Original Pronunciation (OP) - the accent that most closely approximates Shakespeare's.

So why is it so hard to find a performance in OP? One of the main reasons is that an OP production is difficult every way you look at it. It requires that the actors practice a long-forgotten phonology, and it requires that theatergoers attune their ears, since OP speech can be difficult to understand at first. The accent is foreign and strange, and there does not seem to be much demand for an Othello in OP. And yet, in my opinion, it's those hidden qualities that make OP so much more interesting.

Tuesday, August 13, 2013

Rare Varieties of North American English

General American is the name for the generalized American accent spoken by most people in the United States and by a good many Canadians. There's no single General American because nearly all speakers have colored their speech with their regional accents and word choices. A General American speaker from Chicago may say gym shoes for all casual shoes; another from Boston may say a great movie was wicked awesome. But there are many dialects of North American English that are rare and unusual, featuring a prosody and word choice that will surprise you. Let's take a look at some of the more peculiar speeches of North America, shall we?

Appalachian English

Spoken among the Appalachian Mountain settlers. It is a unique "Southern" descendant of the speech of Scotch-Irish hill people. The tongue is often parodied (quite poorly) by Kenneth in 30 Rock.


Tangier, Virginia, is a tiny island out in the Chesapeake Bay. It was settled during the Restoration period of England, and its dialect has taken a unique turn. The island is a popular target of language tourists, but the dialect is dying fast.

Boston Brahmin

Everyone knows about the typical Boston accent, which in my opinion is more appropriately described as an accent clade along the eastern Massachusetts coast. The famous Boston accent is usually stereotyped as a Southie or lower class accent. There remains among the very elderly of Boston's elite the Boston Brahmin accent and it's quite different.


The accent of Newfoundland, Canada, is one of the most divergent widely-spoken accents of North America.

West Jamaican

Monday, August 12, 2013

Myths of language, the Eskimo and their snow

Today let's have some fun. This will be the first in a series of posts on language myths.

Eskimos have 30, 40, 50, n words for snow. 

This one is pretty endemic to Canada and the United States and it's plagued with problems. There is no single Eskimo language; there is an Eskimo-Aleut family of languages. None of the languages have an unusual number of words for snow, and most have just a few. Unlike other myths we'll look at, which are usually rooted in racism or social bigotry, this myth was probably an innocent misunderstanding of how Eskimo-Aleut languages behave. Eskimo-Aleut tongues are highly polysynthetic.

In linguistics, synthesis is the ability of a word to change meaning by taking on morphemes. The word dog can undergo synthesis and pluralize, becoming dogs thanks to the morpheme -s. Synthesis is rare in languages like English and Mandarin, so we call them isolating languages. German and Japanese are mildly synthetic, meaning they are near the world average in their use of morphemes. Georgian and Hungarian are highly synthetic. Eskimo languages are polysynthetic, meaning they put Georgian and Hungarian to shame. Polysynthetic languages boast the ability to compound enormous numbers of morphemes onto a single word, so that one word can express what would normally take an entire sentence in English! Suddenly the Eskimo words for snow just got a lot more interesting.

A Yupik family. In truth, the word eskimo is a racist epithet borrowed from the Algonquin. Eskimo is an Algonquinism for "raw fish eater," an untrue insult among North American tribes... but a compliment in a Japanese sushi bar. Photo credit: AmazingRadio.

A classic example from Eskimo-Aleut languages is Yupik's single word tuntussurqatarniksaitengqiggteuq "he/she had not yet said again that he was going to hunt reindeer" (credit to Wikipedia on how to write that out). It is composed of the stem tuntu- "reindeer" and a series of bound morphemes (much like the English plural -s): -ssur- "hunt," -qatar- "going to," -ni- "say," -ksaite- "did not," -ngqiggte- "again," -uq "he/she." Only the stem tuntu- has any meaning on its own, much as dog has a meaning apart from -s while -s is dependent upon dog.
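
To see how the pieces stack up, here is a minimal Python sketch that glues the morphemes listed above back into the full word. The forms and glosses are copied from this post; the list layout and the names MORPHEMES, build_word, and gloss_line are my own illustrative assumptions, and plain concatenation ignores any sound changes that real Yupik morphology may apply at morpheme boundaries.

```python
# Forms and glosses exactly as given in the post.
# The data layout and function names are hypothetical, for illustration only.
MORPHEMES = [
    ("tuntu", "reindeer"),   # the stem, the only piece that stands alone
    ("ssur", "hunt"),
    ("qatar", "going to"),
    ("ni", "say"),
    ("ksaite", "did not"),
    ("ngqiggte", "again"),
    ("uq", "he/she"),
]


def build_word(morphemes):
    """Concatenate the stem and its bound morphemes into one word."""
    return "".join(form for form, _ in morphemes)


def gloss_line(morphemes):
    """Return hyphen-separated forms and glosses, one gloss per morpheme."""
    forms = "-".join(form for form, _ in morphemes)
    # Multi-word glosses are joined with periods so each gloss stays one unit.
    glosses = "-".join(gloss.replace(" ", ".") for _, gloss in morphemes)
    return forms, glosses


if __name__ == "__main__":
    print(build_word(MORPHEMES))
    forms, glosses = gloss_line(MORPHEMES)
    print(forms)
    print(glosses)
```

Seven morphemes, one word: that is the whole trick behind polysynthesis, and it is why counting "words" for anything in these languages is a slippery business.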

Polysynthesis is not exclusive to Eskimo-Aleut languages. The phenomenon can be found in Australia, Asia, North and South America, Europe, and the Pacific Islands (Oceania). Only in Africa are we missing a polysynthetic tongue, and to be honest, there are many African languages that have not yet been adequately studied.

I hope this journey into myth and synthesis was as interesting for you as it was for me. Picard out.

Saturday, August 10, 2013

Blevins, Basque, and Proto-Indo-European. Part II.

"Extraordinary claims require extraordinary evidence." ~ Carl Sagan 
So by a happy accident I ran across some cursory notes of Blevins' Basque-PIE talk. I discussed elements of the talk here. A number of days ago, linguist Juliette Blevins made a surprising argument: Basque is related to Proto-Indo-European. Not that Basque descends from PIE, but that Basque and PIE share a common ancestor, and that Proto-Basque sound correspondences with PIE reveal a Proto-Indo-Vasconic ancestor. You can acquaint yourself with the notes here (scroll down a ways).

Because the notes are simple summaries, probably done in haste, they contain a sketch of the argument but lack any real evidence or substance. A shame, because some of these proposals are so radical that they would need significant evidence to convince me. I proffer my own thoughts.
  1. "Five problems with this proto-inventory that Juliette Blevins is going to fix." Why are there problems with Michelena's or Lakarra's reconstructions? Why can we not be content with the reconstructed phonological inventories thus far?
  2. "Problem 2: VhV sequences. (where vowels are identical) e.g. behe, bihi, lahar, luhur, mahatz, zahar." In many of these lexemes, the question should not be "Why are there VhV sequences?" but rather "Why are there VnV sequences?" An intervocalic */n/ was nasalized, then lost, unless the /n/ was assimilated (assimilation occurred frequently in words with alternating vocalics). A more interesting theory could be that nasal */n/ became /h/ if bookended by identical vowels and became /ɲ/ if the vowels were different. A word like mahai would come from *manai.
    The ruins of the Proto-Basque. Ancient Basque cromlechs can
    be found throughout the hills and valleys of Navarre. This
    picture has no purpose. I just thought an image would be
    a welcome break.
  3. "To our knowledge, no one has attempted to explain why h is so frequent and why it has wider word-internal distribution than most of the oral stops. We will solve this problem by positing two new historical sources for h." Well, first, Michelena accounts for the presence of intervocalic /h/ as the product of a loss of the intervocalic /n/ (above). Second, it is true that aspiration is very common in some Basque dialects, and was probably very common in Proto-Basque. Because many of these h-words have forms with /g/ and /k/, I've often wondered about an earlier system of *k, *kC, *kV that can no longer be recovered (such as PB *harri and its possible relationship with European *kar- words for stone). Third, it's not true that no one has attempted to explain the frequency of aspirates; Michelena believed them to have arisen from suprasegmental accent changes.
  4. "We propose only a single *s in PB." Proto-Basque's phonology was likely influenced by Iberian (both lacked an /m/; both boasted laminal, apical s; identical 5-vowel inventories; etc...). To what extent the sibilants were the product of Iberian influence or were an in-house movement is unknown.
  5. "We [are] proposing *m." Sorry, what? You already scared me by positing no sibilant contrast. Now you're saying Basque had an /m/? I'm beginning to lose the faith, father. You're going to need some powerfully good reasons. If there was no *b > m evolution, then why are there relics of an earlier *b in toponyms (mendi "mountain" but Auzpendi)?
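
As a toy illustration of the alternative rule floated in point 2 (intervocalic *n becomes h between identical vowels and ɲ between different ones), here is a minimal Python sketch. The function name apply_rule and the input "ani" are made up for illustration; only *manai > mahai comes from the discussion above, and the rule itself is speculation on my part, not an established sound law.

```python
# The five-vowel inventory mentioned in point 4.
VOWELS = "aeiou"


def apply_rule(form: str) -> str:
    """Apply the speculative rule to a reconstructed form (no asterisk).

    An n between two vowels becomes "h" if the vowels are identical
    and "ɲ" if they differ. Any other segment is left untouched.
    """
    out = []
    for i, ch in enumerate(form):
        intervocalic_n = (
            ch == "n"
            and 0 < i < len(form) - 1
            and form[i - 1] in VOWELS
            and form[i + 1] in VOWELS
        )
        if not intervocalic_n:
            out.append(ch)
        elif form[i - 1] == form[i + 1]:
            out.append("h")   # identical flanking vowels
        else:
            out.append("ɲ")   # different flanking vowels
    return "".join(out)


if __name__ == "__main__":
    print(apply_rule("manai"))  # the example from point 2
    print(apply_rule("ani"))    # invented string, not a real reconstruction
    print(apply_rule("mendi"))  # n before a consonant is not affected
```

A rule this crude says nothing about ordering with other changes or about dialect variation; it only shows what the conditioning environment would have to look like.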
From what we have so far, this isn't even close to enough to convince me. On the other hand, it's just a summary from a talk, so I'll still give her the benefit of the doubt. If she writes on the matter, I hope for - nay, expect - a significant body of supporting evidence.