toki! nimi mi li jan Elik.
What I just said there was “Hi! My name is Erik,” in a language called Toki Pona. Toki Pona is, by all definitions, a language, with its own words, grammar, phonology, and common sayings.
What you may be surprised to learn (or maybe not, because articles have titles that reveal too much) is that Toki Pona is not spoken in any particular nation or country. It was created completely from scratch by Toronto-based linguist and translator Sonja Lang in 2001. Right now, you’re either thinking “What! That’s crazy! How?!? How do you make all those words!?” or “Oh yeah, it’s just like Esperanto, Lojban, Dothraki, High Valyrian, Sindarin, or Klingon.”
The truth is that building a language, or “conlanging” (derived from the word conlang, meaning a constructed language), can be a fun and interesting hobby once you know what you’re doing. As such, I have joined forces with Isaiah (who has never made a conlang in his life) to create a step-by-step beginner’s guide to help you construct your very own language. (You get the first three steps this year and more at the beginning of next year. This article is already way too long.)
Step 1: Pick sounds
But from where? The International Phonetic Alphabet (IPA) of course! What’s that? Well, to answer that question, let’s go back to Paris in 1886.
The field of linguistics was a mess. People who spoke different languages had no way of writing down how to pronounce anything! The poor linguists of that age had to rely on messy, unstandardized, ad-hoc systems that were at best understood by a handful of people who only spoke Indo-European languages anyway, and sometimes by nobody at all! Some French and British linguists and language teachers decided that someone had to fix this mess and created a system where one symbol would ALWAYS correspond to one sound, no matter what. They chose to use symbols from the Roman and Greek alphabets (perhaps a bit Eurocentric, but whatever), and flipped them upside down or added little tails to them to represent every single sound that humans make in language. Here’s what they came up with:
Before you slam your laptop shut from just looking at those charts, I promise they’re simpler than they look, so just bear with me for a second.
Let’s start with the consonants for now, as they are relatively easy to understand. The columns across the top represent the place of articulation (where you make the sound) and the rows down the left represent the manner of articulation (how you make the sound). Places of articulation are arranged from the front of the mouth to the back of the throat, and most of them are pretty self-explanatory:
- Bilabial = between the lips
- Labiodental = between the top row of teeth and bottom lip
- Dental = tongue pressed against the teeth
- Glottal = in the glottis (throat)
The sounds that might be harder to get are the alveolar and velar sounds. Alveolar sounds are made on the alveolar ridge, the bump on the roof of your mouth just behind your upper front teeth, which is where your tongue sits when you say the letter S. Velar sounds are made with the back of the tongue pressed against the velum (the soft palate), like when you say the letter K.
Manners of articulation come in 6 main variants.
- Plosives (also called stops) are made by blocking up air and then suddenly releasing it. Examples: Party, about or card.
- Nasals are made by diverting airflow into your nasal cavity (humming is a nasal sound!). Examples: Mountain or nice.
- Trills are made by rapidly vibrating some loose part of your mouth, like rolling an r in Spanish. Taps and flaps are like trills but only flapped/tapped once. Examples: The words carro (trill) or caro (tap) when pronounced in Spanish. (Most dialects of English have no trills, though many American English speakers tap the tt in butter.)
- Fricatives are made by forcing air through a narrow gap. Examples: Fever or sneeze.
- Approximants are where two parts of your mouth almost touch but not quite, and are kind of more about mouth shape. If a sound is lateral, it is made by pushing air around the tongue. Examples: Water or red.
- Affricates are combinations of a plosive and then a fricative in the same place of articulation. Examples: Cheese or juice.
Sounds on the left side of their boxes are “unvoiced”, meaning you do not engage your vocal cords, and sounds on the right are “voiced”, meaning you do.
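If you happen to like programming, one way to get comfortable with the chart is to treat every consonant as three facts: voicing, place, and manner. Here is a rough Python sketch of that idea. The `CHART` table and the `describe` function are names I made up for illustration, and the table only covers a small hand-picked slice of the real chart.

```python
from typing import NamedTuple

class Consonant(NamedTuple):
    voiced: bool   # right side of its box if True, left side if False
    place: str     # column of the chart
    manner: str    # row of the chart

# A small, hand-picked slice of the consonant chart (not the full IPA).
CHART = {
    "p": Consonant(False, "bilabial", "plosive"),
    "b": Consonant(True, "bilabial", "plosive"),
    "m": Consonant(True, "bilabial", "nasal"),
    "f": Consonant(False, "labiodental", "fricative"),
    "v": Consonant(True, "labiodental", "fricative"),
    "s": Consonant(False, "alveolar", "fricative"),
    "z": Consonant(True, "alveolar", "fricative"),
    "k": Consonant(False, "velar", "plosive"),
    "g": Consonant(True, "velar", "plosive"),
}

def describe(symbol):
    """Spell out a symbol's chart position as 'voicing place manner'."""
    c = CHART[symbol]
    voicing = "voiced" if c.voiced else "unvoiced"
    return f"{voicing} {c.place} {c.manner}"

print(describe("z"))  # prints voiced alveolar fricative
print(describe("k"))  # prints unvoiced velar plosive
```

Reading a symbol off the chart is the same lookup done by eye: find the box, note its column and row, then check which side of the box the symbol sits on.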
As for vowels, they are quite a bit more complex, since they’re more of a gradient than a strict set of sounds. The vertical axis represents how open your mouth is (how far your tongue sits from the roof of your mouth), and the horizontal axis represents how far forward or back in the mouth your tongue is. Sounds to the left of dots are made with unrounded lips and sounds to the right of dots are rounded. www.ipachart.com is pretty useful for listening to how different symbols sound, and which ones you can already pronounce.
If you know other languages like Spanish or French, you can also look at IPA charts for those languages and compare them to the full IPA chart to become more familiar with the symbols. A nice way to keep a few key reference points is by knowing that the letters A, E, I, O, and U make roughly the same sounds in the IPA as they do in Spanish or Italian.
As for which ones to select for your language, I would pick any sounds you can already pronounce, or sounds you can learn to pronounce without trouble.
Not sure where to begin? Here are a few types you should include to get you started:
- Every known language contains plosives, so it's almost mandatory that you have some too. Generally speaking, the unvoiced /p/, /t/, and /k/ are quite common and a good starting point. You can branch out into others later on.
- Most languages contain fricatives and nasals too. To keep your language balanced, I would recommend you pick roughly as many fricatives as plosives.
- You can also include 1-4 nasals. /m/ and /n/ are the most common nasal sounds and are great choices to include in a conlang.
- One more sound to consider is a rhotic of some sort. Rhotics are “r-like sounds” (alveolar trills or flaps, retroflex taps, uvular trills, voiced uvular fricatives, and labiodental, alveolar, or retroflex approximants).
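If you like to keep notes in code, the rules of thumb above can be turned into a quick self-check. This is only a sketch: the starter inventory, the category labels, and the `check_inventory` function are all my own assumptions, and the "balanced" test (plosives and fricatives within one of each other) is just one way to read "roughly as many".

```python
from collections import Counter

# Hypothetical starter inventory: IPA symbol -> broad category.
# The particular sounds are only an example, not a recommendation.
INVENTORY = {
    "p": "plosive", "t": "plosive", "k": "plosive",
    "f": "fricative", "s": "fricative", "h": "fricative",
    "m": "nasal", "n": "nasal",
    "r": "rhotic",
}

def check_inventory(inventory):
    """Return a list of warnings based on the rules of thumb above."""
    counts = Counter(inventory.values())
    warnings = []
    if counts["plosive"] == 0:
        warnings.append("no plosives: every known language has some")
    # Assumed reading of "roughly as many": counts differ by at most one.
    if abs(counts["plosive"] - counts["fricative"]) > 1:
        warnings.append("plosives and fricatives are unbalanced")
    if not 1 <= counts["nasal"] <= 4:
        warnings.append("aim for 1-4 nasals")
    if counts["rhotic"] == 0:
        warnings.append("consider adding a rhotic")
    return warnings

print(check_inventory(INVENTORY))  # prints []
```

An empty list just means none of these guidelines were tripped. They are suggestions, so feel free to ignore a warning if you like how your language sounds.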
Step 2: Spelling and Root Words
Congratulations! Now you have all the tools you’ll need to start building your language. But before you go any further, it’d be nice to have a way to actually write your language down. (Fair warning: making a unique writing system is INCREDIBLY hard, so if you really want to make one, save it for later!) For now, let’s just work within the confines of the regular alphabet. The symbols /p/, /t/, /k/, /b/, /d/, /g/, /h/, /s/, /z/, /f/, /v/, /m/, and /n/ usually make the same sounds as the letters do in English. For sounds like /ʃ/, /ʒ/, and /tʃ/, which in English would be spelled as a combination of letters, you can choose to write them one of two ways. You can either use digraphs (the most common ones to use for those sounds would be sh, zh, and ch respectively) or diacritics (š, ž, and č could all work in this case).
Most digraphs are merely a familiar consonant with the letter “h” after it to signal some sort of phonological change:
- In “sh”, the s marks the sound /s/, and the h makes it shift from alveolar to postalveolar.
- In the digraph th, representing the sound /θ/, the t marks the plosive sound /t/, and the h marks the shift from a plosive to a fricative.
- In “dh” (representing /ð/), d represents the plosive and h represents the fricative, just like in “th”.
Diacritics work much like digraphs, in that a mark above the letter signifies a phonological change. For example, the caron (the little mark in š) could mark a shift from alveolar to postalveolar, or maybe a dot above t (ṫ) marks a shift from a plosive to a fricative. I tend to prefer diacritics over digraphs, but either one will work as a great addition to your language.
Another advantage of diacritics is that they can be used more easily with vowels. Vowel digraphs do exist (“oo” in English represents /u/, like in pool), but they could be accidentally mistaken for other sounds (oo in many other languages represents a long /o/ sound), so many conlangers use diacritics above vowels to represent different kinds of vowels. Since vowels are more fluid, diacritic assignments can be pretty arbitrary. Rather than marking a specific phonological change, they are more of a way of adding more letters.
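To see the two spelling styles side by side, here is a small Python sketch that spells the same word both ways. The two tables and the `romanize` function are illustrative names of my own. The ṫ spelling comes from the dot example above, while ḋ for /ð/ is my own assumption by analogy.

```python
# Hypothetical spelling tables: IPA symbol -> spelling in each style.
DIGRAPHS = {"ʃ": "sh", "ʒ": "zh", "tʃ": "ch", "θ": "th", "ð": "dh"}
DIACRITICS = {"ʃ": "š", "ʒ": "ž", "tʃ": "č", "θ": "ṫ", "ð": "ḋ"}

def romanize(phonemes, table):
    """Spell a list of IPA phonemes; sounds missing from the table keep their IPA symbol."""
    return "".join(table.get(p, p) for p in phonemes)

# A made-up word, written as a list so that /tʃ/ counts as one sound.
word = ["tʃ", "a", "ʃ", "i"]
print(romanize(word, DIGRAPHS))    # prints chashi
print(romanize(word, DIACRITICS))  # prints čaši
```

Keeping the word as a list of sounds rather than a string is deliberate: it means an affricate like /tʃ/ is never confused with a /t/ followed by a /ʃ/.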
Et voilà, you now have a working way to write down your language! Now it’s time to put it to use, so let’s start by building a few very basic root words. Try to make these words the most basic concepts you can think of. Words representing “person”, “possession”, “emotion”, or “interaction” are great starters, as they are used in everyday conversation, and can be modified to make nouns like “object” or “feeling” as well as verbs like “to have” or “to talk”. Your goal here is to make a general-purpose toolbox for making more complex words later on. (For example, your word for “carpenter” could be a cross between “person” or “worker”, and “tree” or “wood”.)
Now that you have a list of sounds you can use, it's time to make some very basic morphemes (the smallest units of meaning, which cannot be broken down any further). Try to make words that are so simple you cannot divide them into smaller parts (house, person, love, death, earth, water, life, etc.). Try not to make an extensive list; 100-200 words is a good goal. This is a good time to decide if you are making:
- An International Auxiliary Language, or IAL (a language meant for ease of communication between people who speak different languages, which usually borrows morphemes from commonly spoken languages around the world).
- Or an artlang (almost completely your design).
Don't stress over the benefits or drawbacks of either. In fact, having a mixture of both is not only acceptable, but encouraged!
For an IAL, try to diversify your language selection. Don’t put all your eggs in the same basket! Experiment with combinations. Try mixing in some Chinese with some Polish, and a dash of Finnish or Portuguese.
If you’re making an artlang, the makeup of each word is completely up to you! Try to keep things like syllable structure, phonotactics, and common sounds consistent, however, so that your language develops its own unique sound.
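If you are making an artlang and get stuck staring at a blank page, a tiny word generator can help you brainstorm roots that all follow the same syllable structure. The sketch below is just one possible setup and every choice in it is an assumption of mine: the sound lists, the CV-or-CVn syllable shape, the 30% chance of a final n, and the function names.

```python
import random

CONSONANTS = ["p", "t", "k", "s", "m", "n", "l"]
VOWELS = ["a", "e", "i", "o", "u"]

def make_syllable(rng):
    """Build one syllable of shape CV, with an optional final n (CVn)."""
    syllable = rng.choice(CONSONANTS) + rng.choice(VOWELS)
    if rng.random() < 0.3:  # assumed: about 30% of syllables end in n
        syllable += "n"
    return syllable

def make_root(rng, max_syllables=2):
    """Join one to max_syllables syllables into a candidate root word."""
    return "".join(make_syllable(rng) for _ in range(rng.randint(1, max_syllables)))

def make_lexicon(meanings, seed=0):
    """Pair each meaning with a unique generated root."""
    rng = random.Random(seed)  # fixed seed so the same list gives the same words
    lexicon, used = {}, set()
    for meaning in meanings:
        root = make_root(rng)
        while root in used:  # reroll duplicates so no two meanings share a root
            root = make_root(rng)
        used.add(root)
        lexicon[meaning] = root
    return lexicon

print(make_lexicon(["person", "water", "tree", "love"]))
```

Treat the output as suggestions rather than final answers. Swap out any root you don't like, and change the sound lists and syllable shape to match the choices you made in Step 1.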
Step 3: Syntax
Wake her morning him up out get the his and bed. Doesn’t make much sense, right? Well, that’s what language is without syntax: random words compiled together with no rhyme or reason, and no way to communicate complex ideas in an effective manner! Unfortunately, syntax is as difficult to make as it is important, so just like with everything else in this guide, let's simplify.
The first step in devising a syntax system is picking a basic word order. Sentences are composed of 3 basic parts: the subject, the verb, and the object. Here are the orders they can come in:
- SOV - The most common word order, employed by about half the world’s languages. “Sam oranges ate.” (Hindi, Japanese)
- SVO - The second most common word order, and the one English uses. “Sam ate oranges.” (Mandarin, English)
- VSO - The third most common word order, used by about a fifteenth of the world’s languages. “Ate Sam oranges.” (Irish, Tagalog)
- The rest of the word orders (VOS, OVS, OSV) are very uncommon, each used by fewer than 2% of the world's languages.
- About 13% of languages have a word order that’s more free form, and use other means to discern who does what to whom.
And yes, your language can totally have multiple word orders, especially when it comes to different types of sentences. If you’re picking multiple word orders, the first one you choose should reflect the structure of a basic declarative sentence (a sentence that makes a statement, e.g. “Pizza is delicious!”). For sentences with both a direct and an indirect object, the indirect object usually precedes the direct object, with any prepositions attached.
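To get a feel for how the same sentence changes under each word order, here is a short Python sketch. The `arrange` function is a name I invented for illustration, and it only shuffles whole words around, nothing fancier.

```python
def arrange(subject, verb, obj, order="SVO"):
    """Arrange the three parts of a sentence according to a word-order code."""
    parts = {"S": subject, "V": verb, "O": obj}
    # The code must use each of S, V, and O exactly once.
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError(f"unknown word order: {order!r}")
    return " ".join(parts[letter] for letter in order)

for order in ("SOV", "SVO", "VSO"):
    print(order, "->", arrange("Sam", "ate", "oranges", order))
# SOV -> Sam oranges ate
# SVO -> Sam ate oranges
# VSO -> ate Sam oranges
```

Try running your own test sentences through all six orders and see which one feels right for your language.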
Another complex topic you will need to understand is morphosyntactic alignment. Morphosyntactic alignment is basically how things in a sentence are grouped together. To understand that, you’ll need to know the difference between transitive and intransitive sentences (thankfully, it’s pretty simple). Intransitive sentences have only one thing (noun or pronoun) in the sentence, and that thing does a single action. For example:
The dog slept.
In this sentence, the dog is the only thing in the sentence. The one thing in an intransitive sentence is called the “sole”. Transitive sentences, however, contain two things: an agent, who carries out the action, and a patient, who has the action done to them. For example:
He listened to her.
In this sentence, “He” would be the agent, and “her” would be the patient. Morphosyntactic alignment is basically how we group the sole, agent, and patient of a sentence. There are two main morphosyntactic alignments, Nominative-Accusative and Ergative-Absolutive, which can be shown neatly on these little charts.
About 75% of languages use a nominative-accusative system (English is one of them!). In a nominative-accusative system, the sole and agent are treated the same (nominative case), while the patient is treated differently (accusative case). Marking can take the form of an affix, word order, or a completely different word!
In English, marking is done by word order and occasionally different words for pronouns. For example:
She (NOMINATIVE) talked to him (ACCUSATIVE).
“She” is marked as nominative by coming at the beginning of the sentence and using the nominative form (the accusative form being “her”), and “him” is marked as accusative by coming at the end of the sentence and being in the accusative form (the nominative form being “he”).
Ergative-Absolutive alignment, however, treats the sole and patient the same (absolutive case) and the agent differently (ergative case). About 25% of languages use this alignment, including Basque and, in some constructions, Hindi. If English were an ergative-absolutive language, we would get sentences like:
He sleeps.
Her (ERGATIVE) woke he (ABSOLUTIVE) up.
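If the two alignments still feel slippery, it can help to see them as nothing more than two small lookup tables. The Python sketch below labels the same two sentences under each system. The table, the NOM/ACC/ERG/ABS tags, and the function names are my own shorthand, and the sentences are made up.

```python
# Which case each role receives under each alignment.
# Roles follow the terms used above: "sole", "agent", "patient".
ALIGNMENTS = {
    "nominative-accusative": {"sole": "NOM", "agent": "NOM", "patient": "ACC"},
    "ergative-absolutive": {"sole": "ABS", "agent": "ERG", "patient": "ABS"},
}

def mark(word, role, alignment):
    """Tag a noun or pronoun with the case its role gets under the alignment."""
    return f"{word}-{ALIGNMENTS[alignment][role]}"

def gloss(verb, alignment, sole=None, agent=None, patient=None):
    """Gloss an intransitive (sole) or transitive (agent + patient) sentence."""
    if sole is not None:
        return f"{mark(sole, 'sole', alignment)} {verb}"
    return f"{mark(agent, 'agent', alignment)} {verb} {mark(patient, 'patient', alignment)}"

for alignment in ALIGNMENTS:
    print(gloss("slept", alignment, sole="dog"))
    print(gloss("heard", alignment, agent="dog", patient="cat"))
# dog-NOM slept
# dog-NOM heard cat-ACC
# dog-ABS slept
# dog-ERG heard cat-ABS
```

Notice which tag the sleeping dog shares: in the first system it matches the agent, and in the second it matches the patient. That single difference is the whole distinction.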
Finally, there are two more questions you should ask yourself when it comes to syntax:
- Do adjectives come before or after the nouns they describe? English has the former, but many languages such as Spanish have the latter, and to many, it seems logical to introduce what you are describing before you describe it. So consider if you want to say “the nice woman” or “the woman nice”.
- Do you want to have prepositions or postpositions, or neither (more on this in part 2)? An example of this could be something like “On top of the box (preposition)” vs. saying something along the lines of “The box on top of (postposition)”. Of course, postpositions sound really strange in English because we aren’t used to them, but other languages like Japanese and Korean use them, so it’s really up to you.
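Both of these questions boil down to simple on/off switches, which makes them easy to play with in code. Here is a rough Python sketch using the examples above. The function names and the True/False settings are my own invention, and real languages are messier than two switches.

```python
def noun_phrase(article, adjective, noun, adjective_first=True):
    """Place the adjective before or after the noun it describes."""
    core = [adjective, noun] if adjective_first else [noun, adjective]
    return " ".join([article] + core)

def adposition_phrase(adposition, phrase, preposition=True):
    """Put the adposition before the phrase (preposition) or after it (postposition)."""
    return f"{adposition} {phrase}" if preposition else f"{phrase} {adposition}"

print(noun_phrase("the", "nice", "woman"))                           # prints the nice woman
print(noun_phrase("the", "nice", "woman", adjective_first=False))    # prints the woman nice
print(adposition_phrase("on top of", "the box"))                     # prints on top of the box
print(adposition_phrase("on top of", "the box", preposition=False))  # prints the box on top of
```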
Conclusion
If you are reading this, congratulations on making it this far! Now you have what you need to start constructing your first speakable language: an inventory of sounds, words, and rules that will shape the unique language of your own creation. If you are ready to take your conlang to the next level and refine it into something even more beautiful, tune in next year for Part 2 of the conlang guide. Now take a break. You deserve it!