Thursday, May 23, 2013

Let's map Urheimats!

Above is something I've never seen before: a map of all the ancient language families of Europe and Asia Minor. What does that mean for the layman and woman? Simply put, it shows the ancestral homeland of every language family as far back as we can trace it. So, for instance, on the map we see Hurro-Urartian. In antiquity, there were two kingdoms - the Hurrians and the Urartians - that spoke two related languages. By comparing the two languages we can reconstruct a "proto" language. But sometimes the daughter languages migrate away from where their ancestral proto-language was spoken. In the case of Hurrian and Urartian, both moved away from their homeland. This map shows their original linguistic land, back in the Armenian mountains, and not where they ended up, which was further south and east. We can do this with a lot of the world's languages. Unfortunately, most language maps show where living languages are currently spoken or where extinct languages were last spoken, kind of like a burial marker. But these languages came from somewhere, right? Well, I put together a map of all of those homelands. How do we know how to do this?

Sometimes a group of languages is demonstrably related, and linguists can compare the languages and reconstruct the original tongue the group diverged from (like Proto-Indo-European, Proto-Uralic, Proto-Hurro-Urartian, Proto-Northwest-Caucasian, Proto-Northeast-Caucasian, Proto-Kartvelian and Proto-Tyrrhenian). Other times, a language is an isolate - that means it is unrelated to any others - but linguists are still able to compare the internal differences among its dialects to reach a proto-language (like Proto-Basque). Other times, we are simply stuck with a single language whose earlier versions we cannot recover (Iberian). Finally, sometimes we can look into living languages and discover that they contain traces of ancient dead languages that whisper to us like echoes of a time long past (as in the case of Pre-Germanic, which was spoken by a people before the Germans arrived, as well as Pre-Irish and Helladic).

For the regular Joe, what do the language families represent?
  • Proto-Indo-European: Their nation(s) was called the Kurgan civilization (5000 BCE or so). They broke into a lot of tribes. If you speak a language in one of the following sub-families, then your language comes from Proto-Indo-European: Germanic (including English), Italic (including the Romance languages, such as Romanian), Indic (like Hindi and Urdu), Iranian (like Pashto and Persian, also known as Farsi), Anatolian (Hittite, Luwian), Greek, Albanian, Armenian, Tocharian, Balto-Slavic (Russian, Lithuanian, Latvian, Slovak...), and Celtic.
  • Proto-Basque: Ancestral language of Basque, the last living paleo-European tongue. When the Indo-European and Uralic tribes invaded, all of the aboriginal languages disappeared except Basque.
  • Proto-Uralic: Ancestral language of the Finno-Ugric languages (Finnish, Hungarian...) and the Samoyedic languages.
  • Iberian: Dead language. Language isolate in Spain. It neighbored Basque and the two were closely linked by trade and culture, but their languages were unrelated.
  • Pre-British: Dead language. When the Celts came to the British Isles, they came to a land already inhabited. Some of this "Pre-British" language survives in the Celtic languages.
  • Pre-Irish: Dead language. When the Celts continued on to Ireland, the settlers there had their own tongue as well. It may have been related to Pre-British, but the traces of their language in Old Irish are distinct enough that we can't say it was the same language.
  • Pre-Germanic: The Germans split from the Kurgan civilization and settled the Scandinavian terrain, intermarrying with the aborigines, who taught them various maritime techniques and agricultural terms. The name isn't exactly fair, because the language apparently also existed in areas that were conquered by the Proto-Balto-Slavs and the Proto-Celts, so we see cognates in their languages as well. In fact, we may be seeing some loans appear in Finno-Ugric too (but that's debatable). Pre-Germanic simply spanned a greater geographic area than the German tribes who settled the majority of its land.
  • Helladic: Dead language. More than 1500 words exist in Greek from a predecessor language that was not necessarily Tyrrhenian (the other language group that interacted heavily with Greek).
  • Proto-Tyrrhenian: Extinct family. Its homeland in westernmost Turkey disappeared when its speakers were conquered in the Phrygian invasion of 1200 BCE. The survivors sailed west to Greece and Italy, becoming the Etruscans and Raetians.
  • Minoan: A language written in the Linear A script. It died along with the civilization and was replaced by the language of the Mycenaeans, an early Greek tribe. Some aspects of Minoan hint at a relationship with Tyrrhenian and Anatolian, but it's generally thought of as an isolate.
  • Sumerian: Dead language. Language of Sumer. One of the first written languages.
  • Proto-Northwest-Caucasian: About 1,700,000 total speakers of all its offspring languages, but none of the languages are well-known.
  • Proto-Northeast-Caucasian: Famous for being the family of Chechen, among others. Some linguists believe it to be related to Proto-Northwest-Caucasian, with which it shares an extremely large phonological inventory.
  • Proto-Kartvelian: Language family of Georgian.
  • Proto-Hurro-Urartian: A language family that died out a looong time ago.
  • Proto-Afro-Asiatic: Its homeland is barely seen here (the bottom-most part of Egypt). Ancestral language of Egyptian, Akkadian, Hebrew, Phoenician, Chadic, among many, many, many others.
  • Paleo-Sardinian: Dead language. A language that would have been unknown had traces not survived in the dialect of Italian spoken on the northern half of the island of Sardinia. It was also spoken in Corsica.
  • Elamite: Dead language. An isolate, though possibly a Dravidian language, which would place its homeland further south and outside of the map. It was the primary language of Iran from 2800 to 550 BCE.
  • Pictish: Dead language. A language once spoken in present-day Scotland. It's weakly attested, meaning that its classification is still uncertain. It may get dropped from the map.
  • Tartessian: Dead language. Another candidate to be cut. It may be Indo-European but displays enough mystifying features to keep linguists guessing.
  • Hattic: Dead language. The language of Anatolia before the Hittites (an Indo-European tribe) came. Swallowed up by Hittite.
  • Kaskian: Dead language. A language spoken along the Black Sea's southern coastal mountain range. Possibly related to Hattic; it suffered the same fate.
  • The Tree Language: Dead language. Tricky language to map! It is so called because it gave the incoming Kurgans the names of plants. I had to do some guesswork, as no one has mapped it before (to my knowledge). It had to be close to Pre-Proto-Celtic, because a few of its words crop up in that language; it had to be directly in the line of the Proto-Italic migration, because a huge number appear in Proto-Italic; and it needed to be central enough that words appear in Proto-Germanic. In truth, while the "tree language" is an acknowledged substrate in Indo-European languages, its geographic location is uncertain.
  • The Bird Language: Dead language. A pre-Indo-European language in the North Balkans that gave us words for avians and helped us name the exotic birds we were unfamiliar with.
  • Pre-Proto-Celtic: Dead language. Distinct from the other languages that lent words to Proto-Celtic, this language is responsible for a small but important share of the Proto-Celtic lexicon. The topics range from war to agriculture. Like Pre-Germanic, it has a high frequency of geminates ("long" consonant sounds).
  • North Picene: Dead language. A sparsely attested language in Italy.
  • South Picene: Dead language. Called Picene because it was so close to its northern neighbor, but there's little reason to believe the two were related apart from proximity. Again, sparsely attested.
Not all languages have been mapped yet. I decided to map Pictish and Tartessian... for now, but I have my reservations about including them. Their linguistic heads could be on the chopping block at any moment. I have no idea if the bird and tree languages will stay. Their impact upon living languages makes their existence near-certain, but their location in Europe is anything but certain.

Bear in mind: this map is not a snapshot in time. There can be thousands of years between two language families on this map. The Kurgan civilization, which spoke Proto-Indo-European, is about 1000 years older than Sumerian, 3500 years older than Proto-Tyrrhenian, and 5000 years older than Proto-Basque and Pre-Irish. This is a snapshot of the ancient language homelands (each one called an Urheimat).

This map is a work in progress. Things may be inaccurate. Many things need to be added. Some regional boundaries may be off. If you have knowledge to contribute, please feel free to chime in!

By the way, here is the same map of language families as they stand today. As you can see, Indo-European and Afro-Asiatic pretty much dominate. Uralic's hanging in there. The white areas are a new language family, Altaic (mostly Turkish in this map), that was introduced via the Mongol and Turkic invasions. The Caucasian languages are hanging on for dear life, and Basque (its territory greatly reduced) now has to compete with Spanish and French.


  1. I'm sorry, what is that dot in the middle of Russia? Also, what happened to the Semitic languages?

    1. Semitic is a branch of Afro-Asiatic, so in this context the Semitic languages are covered by Afro-Asiatic.