Grammar in Foreign Languages / Useful Notes

Unlike what you would see in many works of fiction, languages of the real world can work in wildly different ways, enough to make them sound like Starfish Language to a non-native. A common reaction among people studying linguistics is to scratch their heads and say, "Go home, language, you are drunk." In fact, for nearly every property that has ever been proposed as a "universal" characteristic of human language, there is at least one known non-artificial human language that doesn't have it, or has its exact opposite.

Western audiences and authors generally find the Indo-European language family the most familiar in terms of grammar and vocabulary. This family includes most (but not all) of the languages spoken in modern Europe (already quite diverse; compare Russian to English to Italian) but also roughly half of the many languages spoken in India and what used to be called the "Near East" (Turkey, Persia, etc). And Indo-European is only one of dozens of such families. Wikipedia has more details.

Real human languages very often differ from what Benjamin Lee Whorf called "Standard Average European" in that they can:

Lack articles like a, an, or the, as in Russian and Latin (IE) and Japanese and Chinese (non-IE).
- Have definite articles but no indefinite articles, such as Irish and Icelandic (both IE), Esperanto (a Conlang based mostly on IE languages), and (all forms of) Arabic (non-IE)
- Have indefinite articles, but express definite forms with a suffix (Scandinavian languages and Romanian, IE)
- Have finicky rules about when things can be definite or indefinite (Literary Arabic: not "a leader of the community," but rather "one leader among the leaders of the community"). Even closely related languages such as English and German sometimes follow opposite rules, for example with abstract nouns.
- Somewhat related is being finicky about whether a definite article or a possessive should accompany a noun. In Spanish, when a verb phrase is built so that what English treats as the object is actually the subject and vice versa (e.g. "me duele el corazón" ("My heart hurts") or "se me perdió la bolsa" ("I lost my purse")), you would never hear anyone worth their salt use a possessive, because it is already implied by the indirect object pronoun. There are many common constructions like these, such as "se me olvidó" ("I forgot"), "se me cayó" ("It fell", in an unexpected way), "se me derramó" ("It spilled"), and so on.
- Have many more articles than English. German articles change according to the gender, number, and case of the noun, resulting in 16 possible combinations for the definite article (although those are expressed through only 6 forms: der, die, das, des, dem, den).
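As a rough illustration of the German point above, the sketch below lays out the definite article across four cases and four gender/number columns (the plural does not distinguish gender) and counts how many distinct forms fill the table. The names `DEFINITE_ARTICLE` and `article` are invented for this example; it is a minimal sketch, not a full model of German declension.

```python
from itertools import product

CASES = ("nominative", "accusative", "dative", "genitive")
# Three genders in the singular plus one shared plural column.
COLUMNS = ("masculine", "feminine", "neuter", "plural")

# Each row follows the order of COLUMNS.
DEFINITE_ARTICLE = {
    "nominative": ("der", "die", "das", "die"),
    "accusative": ("den", "die", "das", "die"),
    "dative":     ("dem", "der", "dem", "den"),
    "genitive":   ("des", "der", "des", "der"),
}


def article(case, column):
    """Look up the definite article for one case/column cell."""
    return DEFINITE_ARTICLE[case][COLUMNS.index(column)]


# Visit every cell of the 4 x 4 table, then collapse duplicates.
cells = [article(case, column) for case, column in product(CASES, COLUMNS)]
distinct_forms = sorted(set(cells))

print(len(cells))       # number of case/gender/number combinations
print(distinct_forms)   # the forms that actually appear
```

The same form turning up in several cells (der is masculine nominative, feminine dative, feminine genitive and genitive plural) is why learners have to read the article together with the noun ending and the sentence context.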
Have no direct or single equivalent of verbs like 'to be', 'to have', or 'to do', which are kind of a defining feature of IE languages. And it's not just non-Indo-European languages that do this. Irish, the Ibero-Romance languages (Spanish, Portuguese, Galician, etc.) as well as Catalan (Gallo-Romance) have two copulas ('be') (one of the Romance ones usually deriving from the Latin word for "to stand"). Irish and Russian have no auxiliary verb "have" (that is, "have" as in "Have you seen my new boots?", not as in "I have a new pair of boots").
- Arabic, meanwhile, has both "to be" and "to have" (in the possessive sense), but uses them far less frequently than English does. "To be" is almost always omitted in the present tense; you would say "I Egyptian" rather than "I am Egyptian". The equivalent of "to have" is almost never used for normal possession, because it implies not just possession, but sovereignty. You would say "to/at me there is an umbrella," not "I have an umbrella."
- The same is mostly true of Russian, where the usual wording is "at me is an umbrella" and the verb "have", imet', is pretty much only used in formal speech (in colloquial speech it can be used as a euphemism for "fuck", leading to many puns).
- Spanish, in particular, also has an auxiliary verb in haber, which is sort of a mixture of "to be", "to have", and "to exist". It's used in almost all 'perfect' verb forms (indicating an action completed before another point in time) by conjugating it in whatever tense is needed and placing the past participle of the action afterwards, taking the 'to have' meaning ("Ella había comido antes de ir al cine." ("She had eaten before going to the movies.")). However, it's not usable as "to be" in a sentence like "I am from Texas", only for asserting existence ("Hay una granja en la colina." ("There is a farm on the hill.")). Basically, it's a weirdo verb.
- In Polish, "to have" ("mieć") is never used as an auxiliary verb: Polish only has one past tense (with two aspects, if you want to be technical) and the vestigial plusquamperfectum uses "to be".
Do not mark nouns for number (Japanese, Chinese), or, alternatively, have more number markers than simply singular and plural. Many languages have separate dual or even trial ('three') numbers. There is even at least one language that has marks for zero (I have no cookies), fractional (I have half of a cookie), singular (I have one cookie), dual (I have two cookies), paucal (I have a few cookies), and large-scale plural (I have lots of cookies)! Most Indo-European languages have lost their duals; Sanskrit, Ancient Greek, and Old Church Slavonic had them, and there are still traces of them in some of the Balto-Slavic languages (usually in a unique declension for the number two, and different noun forms used with certain numbers). English has some leftovers from a former dual/plural distinction in the dual-only words both, either, neither, and between, which correspond to all, any, none, and among when referring to more than two objects. Latin also had one, which survived in the irregular declension of the word "duo", while Slovene still makes full use of it. Old English possessed the vestiges of a dual, but only in the pronouns. Come Middle English, this dual number was gone.
Have a more limited set of cardinal numbers — the so-called "one-two-many" phenomenon, although some languages may hit "many" at a point other than three. Note that this does not necessarily prevent accurate counting above "many"; it may just change the nomenclature. The Trolls of the Discworld, for instance, have a cardinality based on powers of 4: "one" (1), "two" (2), "three" (3), "many" (4) and "lots" (16), which can then be combined to express other quantities (like English does for concepts like "twenty-one" and "one hundred fifty-two"). Then again, a culture that is truly innumerate may not be able to distinguish between different quantities of "many".
- Conversely, linguistic evidence suggests that many languages started out with "one-two-many" cardinals before gaining more terms for numbers above two; one of the telling pieces of such evidence is that the first two ordinal numbers in most languages ("first" and "second", in English) are not related to their corresponding cardinals ("one" and "two"), whereas ordinals for three and above ("third", "fourth", etc.) are clearly constructed from their cardinals. An alien language might well go further into the ordinals before one encounters the first ordinal derived from a cardinal, suggesting a larger range of early numeracy than humanity generally demonstrated.
- You may think a race with an inherent grasp of mathematical concepts might never derive ordinals from cardinals, but you can't just bust out a new word whenever you need a high enough number; at some point you're gonna have to start building your numbers on earlier numbers (say, twenty-one; that way you can also round 3104393 to three million). That said, an alien language might follow a completely different repeating pattern in ordinals and cardinals.
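The troll-style counting described above is essentially a base-4 system with very few words. The text does not spell out how the words combine, so the sketch below assumes one simple rule: repeat the largest word that still fits, then move on to the next smaller one. The function name `troll_number` and the hyphenated output format are made up for this example.

```python
# Words for the digits 1-3; "many" stands for 4 and "lots" for 16.
UNITS = {1: "one", 2: "two", 3: "three"}


def troll_number(n):
    """Spell n using only 'one', 'two', 'three', 'many' (4) and 'lots' (16).

    Assumption: a quantity is built by repeating the largest word that
    still fits and then continuing with the next smaller word.
    """
    if n < 1:
        raise ValueError("this sketch only handles positive integers")
    lots, rest = divmod(n, 16)      # how many 16s fit
    many, units = divmod(rest, 4)   # how many 4s fit in the remainder
    words = ["lots"] * lots + ["many"] * many
    if units:
        words.append(UNITS[units])
    return "-".join(words)


for n in (3, 4, 7, 8, 16, 21):
    print(n, troll_number(n))
```

Under this assumption 21 comes out as "lots-many-one", which is the same kind of composition English uses for "twenty-one", just with a smaller base and fewer building blocks.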
Have nouns with grammatical gender. French has two (masculine and feminine), German has three (masculine, feminine, neuter), and some languages assign "gender" according to whether the thing referred to is visible, known to be near, or far away. Some languages have a simple animate vs. inanimate distinction. Some confusingly combine these (e.g. Arabic, which arbitrarily divides non-human objects into masculine and feminine, and proceeds to ignore that division by making all inanimate plurals "singular feminine"; questionable implications aside, it's really confusing, confusing enough that many colloquial varieties have shifted to giving inanimate plurals plural agreement (verbs in most colloquial varieties have lost the unique feminine conjugation in the plural)). Other languages differentiate gender by properties of the noun, at which point linguists generally stop calling it "gender" and instead use the term "noun class"; Swahili has a different "gender" (noun class) for people, animals, tools, liquids and so on. (This is a characteristic of the Bantu language family more generally: Swahili's distant relatives like isiZulu in South Africa and Lingala in the Congo Basin also have it.) Or alternatively, are more gender-neutral than English, like the Uralic languages. Imagine having "he" and "she" be the same word, as well as "him" and "her." It's also possible for languages not to distinguish gender or animacy in their pronouns. Basically, everything is "it", whether it's a man, a woman, a dog or a bit of navel lint.
- On the subject of animacy, Spanish distinguishes sentient vs non-sentient direct objects by putting the word a before the object: golpeo la mesa (I hit the table) vs golpeo a la persona (I hit the person). Even English tends to prefer the possessive marker (-'s) for animate nouns: The man's legs sounds better to a native English speaker than the legs of the man, although both make sense.
There can also be grammatical gender for numbers. In Hebrew, there is a masculine and a feminine form (the latter is the one commonly used for plain numbers, probably because the masculine form is often a syllable longer). Sometimes it's worse, when there are further divisions due to the object type. There is a story about a Nivkh child who had trouble subtracting five buttons from thirty and adding six trees to seven, because the shape of the buttons and the size of the trees weren't specified.
- Portuguese, Spanish and other Romance languages have a variation on this: they can mark some numbers for both gender and number, but not all of them and not always. For Portuguese, the rule is that you can mark one, two and numbers ending in them (such as one hundred and two) for gender, but not eleven or twelve, nor their derivatives, and only when denoting quantities of specific things; otherwise the masculine is standard. For number, you can mark any numeral that doesn't end with "S" or "Z", but only when talking about quantities of the numerals themselves.
Have numbers force a specific inflection for the nouns they modify depending on what the number's final digit was, leading to a system like "21 system, 22 system's, 25 systems'".
- For example, in Russian:
  - If a number's final digit is 1 (such as 1, 21, or 101), the noun is in the nominative singular form.
  - If a number's final digit is 2, 3, or 4, the noun is in the genitive singular form.
  - If a number's final digit is 5, 6, 7, 8, 9, or 0, the noun is in the genitive plural form.
  - If the last two digits of the number are 11, 12, 13, or 14, the noun turns into the genitive plural anyway, overriding both the "ends in 1" and the "ends in 2, 3, or 4" rules.
- In Polish, it's jeden śliczny kotek (one cute kitty, nominative singular) and dwadzieścia jeden ślicznych kotków (twenty one of the same). Numbers ending with two, three or four follow the pattern dwa/trzy/cztery kotki (two/three/four kitties, with the noun in nominative plural) and numbers with any other ending follow the other pattern (dwadzieścia pięć ślicznych kotków - the noun in genitive plural). Except, of course, for twelve, thirteen and fourteen, which follow the genitive plural pattern.
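The Russian rules listed above translate almost directly into code. The sketch below is a minimal version that only covers a counted noun in a plain nominative context; the function names and the choice of кот ("cat") as the sample noun are assumptions made for this example, and adjectives, fractions and other sentence cases are ignored.

```python
def russian_noun_form(n):
    """Return which form a noun takes after the whole number n."""
    n = abs(n)
    # The "teens" override is checked first: 11-14 always take genitive plural.
    if 11 <= n % 100 <= 14:
        return "genitive plural"
    last_digit = n % 10
    if last_digit == 1:
        return "nominative singular"
    if 2 <= last_digit <= 4:
        return "genitive singular"
    # Final digit 5, 6, 7, 8, 9 or 0.
    return "genitive plural"


# Sample noun for illustration: the three relevant forms of "cat".
KOT = {
    "nominative singular": "кот",
    "genitive singular": "кота",
    "genitive plural": "котов",
}


def count_cats(n):
    """Combine a number with the matching form of the sample noun."""
    return f"{n} {KOT[russian_noun_form(n)]}"


for n in (1, 2, 5, 11, 21, 22, 100, 112):
    print(count_cats(n))
```

Checking the 11-14 override before the final-digit rules keeps the function short: every other number is decided by its last digit alone.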
Mark verbs for categories that English either doesn't have or marks periphrastically, such as voice, aspect, mood, and so on. Or don't mark verbs for categories that English does; Mandarin Chinese has no tense, and conveys temporal information through aspect, instead.
Differentiate between the inclusive and exclusive 'we'. Compare the English "We are in disagreement" to "We do not like you." The inclusive includes the person being addressed, while the exclusive does not.
Have a different concept of "word" than what you expect. There is no agreement among linguists on what constitutes a "word", or even on whether there is a universal concept of "word" that can be applied to all languages. Again, Japanese provides an example — are the particles (wa, ga, o, etc) part of the word or separate words themselves? Most linguists say they're separate, but there's no shortage of transliterations that don't have a space there. (Japanese itself avoids the issue by not having spaces between words at all.)
Are ergative-absolutive instead of nominative-accusative. Take two similar sentences that differ in verb transitivity (such as 'He slept.' and 'She ate them.'). A nominative-accusative language (like English) case-marks the subjects 'he' and 'she' the same in both sentences (that is, as 'he'/'she', the nominative case, instead of as 'him'/'her', the accusative case) and case-marks the object 'them' (perhaps some apples?) in the accusative (as opposed to in the nominative 'they'). In an ergative-absolutive language, the subject of the intransitive sentence 'he' would be case-marked the same as the object of the transitive sentence 'them': in the absolutive case. The ergative case only shows up marking the subject of the transitive sentence 'she'. Total ergativity is extremely uncommon, with Basque, a language isolate spoken in Spain and France, being one of the few languages to be almost completely ergative. Most languages considered ergative have split-ergativity instead, which means they only behave like an ergative-absolutive language in some contexts, and use another alignment (usually nominative-accusative, as in English) in others. Several Indo-Iranian languages such as Kurdish and Hindi are split-ergative. This is sometimes attributed to contact with neighbouring languages, though the origin of the feature is debated.
- There are a lot of different kinds of morphosyntactic alignment, besides nominative-accusative and ergative-absolutive. Some languages are transitive, marking both the subject and object of a transitive sentence the same, but the subject of an intransitive sentence differently. Some are tripartite (marking the subject of a transitive sentence, the subject of an intransitive sentence, and the object of a transitive sentence all differently). Some are various kinds of active-stative (marking subject case based on whether or not the subject actively does something, so case marking is dependent on the meaning of the verb rather than grammar), and then there's "Austronesian alignment", which is, well, very confusing.
- Then there is the fun case of finished versus unfinished action, for example the distinction between passé composé and imparfait in French. Another such case is the object cases in Finnic languages. An example from Finnish: "Söin kalaa" (I ate some fish) vs. "söin kalan" (I ate a whole fish). The idea is similar to the French one, but it applies specifically to transitive sentences and is marked on the object rather than the verb.
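The alignment types described above differ only in which of three sentence roles share a case marking, and that can be sketched as a small lookup table. The role labels S, A and O follow common linguistic shorthand; the dictionary layout, the function names and the case label "intransitive" for the tripartite row are assumptions made for this example.

```python
# S = sole argument of an intransitive verb ("he" in "He slept.")
# A = subject of a transitive verb          ("she" in "She ate them.")
# O = object of a transitive verb           ("them" in "She ate them.")
ALIGNMENTS = {
    "nominative-accusative": {"S": "nominative", "A": "nominative", "O": "accusative"},
    "ergative-absolutive": {"S": "absolutive", "A": "ergative", "O": "absolutive"},
    "tripartite": {"S": "intransitive", "A": "ergative", "O": "accusative"},
}


def mark(alignment, sentence):
    """Attach a case label to each (word, role) pair in the sentence."""
    cases = ALIGNMENTS[alignment]
    return [(word, cases[role]) for word, role in sentence]


def roles_sharing_a_case(alignment):
    """Group the roles S, A and O by the case they receive."""
    by_case = {}
    for role, case in ALIGNMENTS[alignment].items():
        by_case.setdefault(case, []).append(role)
    return sorted("".join(sorted(roles)) for roles in by_case.values())


he_slept = [("he", "S")]
she_ate_them = [("she", "A"), ("them", "O")]

for name in ALIGNMENTS:
    print(name, mark(name, he_slept), mark(name, she_ate_them))
    print("  grouped roles:", roles_sharing_a_case(name))
```

In the nominative-accusative row S groups with A, in the ergative-absolutive row S groups with O, and in the tripartite row all three stand apart, which is the whole difference between the systems.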
Have wildly different syntax (word order). English generally places the subject of a sentence first, the verb second, and the object last, a very common word order. However, in just as many languages, the subject is placed first, the object second, and the verb last. A minority of languages even do things like place the verb or the object first, the subject last, or any other possible combination. Some languages, usually those that are highly inflected, don't even have a hard and fast word order at all. Latin, for instance, generally prefers SOV outside of poetry, but is so inflected that the word order can be changed without changing the meaning of the sentence. The old forms of Semitic languages (like Classical Arabic and Biblical Hebrew) historically preferred VSO, but left SVO as an option because of their inflection—the latter of which became dominant in the contemporary colloquial forms. German puts the verb in the second position of declarative statements, at the beginning of questions (just like English), and at the end of subordinate clauses. And Japanese... Japanese word order has its own PAGE on The Other Wiki.
Then there's the question of whether to put adjectives before or after the words that they modify, where to put determiners, what types of clauses or sentences change word order, how to construct relative clauses, etc.
Are not nearly-isolating languages like English, where word use is determined by position, and there are lots of particles — small words with purely grammatical functions (like English prepositions). Some languages, like Japanese and Turkish, are agglutinative, where word use and other such markers are affixes that combine in a string. Some languages, like Latin and its descendants, are fusional, where word use and other morphemes are marked by affixes that are all mutually exclusive (so there's one affix in Latin where Turkish might have a string of three or four, but you need a completely different affix in Latin for a small change in meaning, while Turkish can just switch out one of its affixes). Agglutinative languages are rather famous for their ability to cram very large amounts of information onto single words. For example, in Hungarian, the common toast "Egészségünkre!" is literally "To our health!"; a phrase which takes three words to say in English, but in Hungarian, one word does the job. Some languages really take the ball and run with it — in Inuit, "he said he wouldn't be able to arrive first" is "tikitqaagminaitnigaa," while in Yaghan, "the look shared by two people too shy to do anything about it" is "mamihlapinatapai." It gets even worse when you get to polysynthetic languages, where several distinct words get mashed together: archaic Ainu "usaopuspe aejajkotujmasiramsujpa" means "I keep swaying my heart afar and toward myself over various rumors."
Or perhaps they're more isolating than English is. Plurals and past tense forms may be expressed using distinct words that in some cases can be used alone: "did walk" instead of "walked", with "did" alone as a possible answer to a question. Chinese, for instance, has one morpheme per syllable and close to one morpheme per word.
Have adjectives that act like verbs instead of or along with acting like nouns (kind of). For example, some Japanese adjectives can be conjugated just like verbs — shirokunakatta ie = the house that was not white (white-NEG.PAST house). Sometimes this situation is described as "the language has no adjectives," which confuses the uninitiated — what is meant is not that the language doesn't have words like "red" or "large," but rather that words like that follow the same rules as verbs.
- The Wolof language of Senegal conjugates pronouns. Maa ngi dem means "I am going" or "I go." Dinaa dem means "I will go [soon]." In this case, dem is the verb (go), and cannot be changed. Maa ngi and dinaa are both pronouns.
- The Conlang Lojban, which is built on logic, only has three main parts of speech: particles, pronouns and verbs. No nouns, adjectives or adverbs. A noun is built with a construction equivalent to "someone/something that [verb]s" (like the English suffix "-er"), and adjectives/adverbs with a construction like "do [main verb] in a [secondary verb]-like manner". (Of course, many verbs do correspond exactly to English nouns or adjectives: "is a house", "is large"...) Also, although in many cases they're optional, Lojban has so-called "vocalized parentheses": particles that mark where a clause/phrase/something starts and ends, thus preventing most kinds of Ambiguous Syntax.
- Hungarian in the present tense does not use an existential verb when expressing that <subject> is <adjective> (but only in the third person singular or plural; the first and second person use the proper conjugation of the existential verb and drop the subject instead). The adjective is not conjugated like in Japanese, though; it only gets a plural marker if the subject is plural. E.g.: "The ball is red" becomes "A labda piros", but "The balls are red" will be "A labdák pirosak".
Require the use of classifiers when counting nouns. A common characteristic of East and South East Asian languages. There are classifiers for animate and inanimate nouns, for roundish, stick-like or sheet-like objects, for people, for things that go in pairs and for everything else under the sky.
Have prepositions that can be used independently as verbs, or rather, have verbal grammar such that subordinate verb phrases are used when English would use prepositional phrases. In such a language, one word may serve as the verb "go" and the preposition "toward".
Use noun cases to convey the same meaning as English prepositions. In Finnish, for instance, there are fifteen distinct noun cases (kind of makes the three in English look simple, doesn't it?) to express various different meanings, but the use of prepositions is severely limited. For example, "talo" means "house," but "talossa" means "in the house," "talolla" means "at the house," "taloksi" means "(transform) into a house," etc. Some languages have even more. (Hungarian has at least eighteen cases, and that's without counting the rarely used ones. A fellow Uralic language, Komi, has over twenty as well.)
- Languages with noun cases also avoid Ambiguous Syntax of the "flying purple people eater" sort. The main noun in a group like this will be in nominative case, along with its adjectives, while all the other nouns (and their adjectives) will be in other cases, clearing the syntax up. For example, in Polish, a creature that eats flying purple people would be pożeracz fruwających fioletowych ludzi, while a purple flying creature that eats people would be fioletowy fruwający pożeracz ludzi. And a purple creature that eats flying people will be fioletowy pożeracz fruwających ludzi.
Differentiate between alienable and inalienable possession: "my wrist" is "wrist of me", but "my watch" is "watch on me".
Have something other than two degrees of demonstratives — English has just this and that (but it used to have yon[der] as a third, and the other is commonly used as a third but decidedly less standard), Japanese has three (kore, sore, are), some languages have one, some have as many as five. Alaskan Yup'ik has thirty. They are sorted by five layers of location, three layers of visibility and two layers of accessibility. So for example one demonstrative means "partially visible 'that,' near and accessible to the listener but not necessarily to the speaker." Another demonstrative means "completely visible 'that' which is above the speaker and inaccessible to him/her."
- German, by contrast, has only one used in common speech, dies-. Technically there is a second, jen-, cognate with English yon—and used just about as frequently.
Mark the relationship between speaker and audience (register), and occasionally also between speaker and subject, whether through pronouns or verb forms or sentence markers. Most Indo-European languages have this, actually; for example, in French there's 'tu' (informal) and 'vous' (formal). English is one of the few IE languages that doesn't do this, although it used to and a few dialects still do. Some languages get very elaborate; Javanese marks for formal/informal, plain/polite, and humble/honorific, in any combination of the three (though formal/informal are pretty similar). Korean has about seven degrees of politeness and formality, each of which also has a humble and an honorific form—though a few of them aren't used much anymore.
- Or just have a different world view on pronouns altogether. Vietnamese is often described as "having no universal pronoun". (Which is untrue, as it actually does have some.) In practice, this means that in most conversations, the language requires its speaker to choose a kinship word (let's call it Kinship Term A) to refer to themselves where English would say "I", and pick another one (Kinship Term B) for the listener, where English says "you". Here's where it gets interesting: when it is the other person's turn to speak, the kinship words stick to the respective parties they represent, so now Term A becomes "you" and Term B becomes "I". What the address terms actually do is convey the expected social relation between you and the other person. You don't stop being your mom's child just because it's your turn to speak. Confused yet? That is how an Anguished Declaration of Love by a man to a woman in Vietnamese could translate to "older brother love younger sister a lot", and then the woman would reply "younger sister love older brother a lot too." Working out the I's and you's in Vietnamese can require (and reveal) a ridiculous amount of contextual info: the other person's sex, age, your own sex and age, relationship between you and them if any, their attitude towards you, your attitude towards them... And that's just for one-on-one convos. People's first names can take on the role of pronouns; in fact, any noun can, under the right circumstances. Needless to say, this makes for a sociological minefield even for native speakers.
Have words that don't directly and perfectly translate into English. Sure, there can be some of the whole "showing culture through vocabulary" thing, but also more mundane instances. For example, English divides temperature into cold, cool, warm and hot, but other languages may have only two or three of those, or maybe more. On the other hand, German and Hebrew, among others, have a word for "the day after tomorrow", which English lacks.
- Similarly, many non-English languages divide up colors differently from the Western standard "ROY G. BIV", with some having as few as just two basic colors (black and white). Quite a few make no distinction at all between blue and green. On the other hand, some Asian languages have dozens if not hundreds of distinct color names. An author writing a race with a different visual range from humans (such as demihumans from Dungeons & Dragons, who frequently possess vision in the infrared range) may forget to create terms for colors humans can't see at all, not even "squant" or "octarine".
- Other languages may also have fundamentally different conceptual metaphors. For example, while in most languages the past is "behind" us and the future lies "in front" of us, in Chinese, Quechua, and Aymara it is the other way round. (Rather like the Discworld Trolls idea.) Rather than likening the passage of time to the ego's journey from the past toward the future, these languages liken it to a movement of events in a queue: the events of the future are lined up behind the events that have already occurred (this metaphor is also present in English and other languages with words like "before" and "after", but it is only used to relate events to other events, when the ego is not involved).
A language might not have a general term for a group of objects or actions that English takes for granted. For example, speakers of some Australian Aboriginal languages reportedly cannot say "twenty birds" when referring to a group of ten sparrows and ten ostriches; for them it would be like adding rocks and dogs together. In Russian, there are no single general-purpose words meaning "bring" and "put": you can only say that you carried or rolled something in, or that you laid or stood something in front of a person.
- The latter is actually a very important object of study in linguistics: verb-framing versus satellite-framing. Spanish (like all Romance languages) is a heavily verb-framing language: this means that the path of motion, but not the manner is usually expressed by the verb. You don't "run in" or "run out" in Spanish, but "enter" or "exit": if needed, you can specify the manner: "enter running" (entrar corriendo). English, by comparison, is typically satellite-framing (like all Germanic languages). Russian and other Slavic languages are even more satellite-framing than English, hence the lack of the direct counterpart to "bring". Notice how this differs wildly between related languages (Romance, Germanic and Slavic are all Indo-European, and the ancestor to Romance, Latin, was actually satellite-framing).
Lack relative constructions ("the one that does X" etc.), and have to substitute adjective phrases ("the X-doing one"), or have correlatives: "This is the man who my wife has been sleeping with him!" Or on the other hand, lack adjectival phrases and have to use relative constructions instead. English has way more adjectival phrases than the Romance languages, as many of them can only be translated with relative constructions.
Treat relative clauses like adjectives. For example, in Mandarin Chinese, using the attributive particle de, one can just as easily say "red de car" （红色的车/紅色的車） as "drives down the street de car," (路上开着的车/路上開著的車). The former would simply be "red car," but the latter would have to be translated as "the car driving down the street."
Are topic-prominent instead of subject-prominent (Japanese). In English, the subject is understood to be the topic of the sentence (which the passive voice helps to facilitate). In Japanese, topic and subject do not have to be the same.
Have no element in a sentence that corresponds straightforwardly to what Europeans would call the "subject." The topic-prominent Japanese -wa is a good example, as are dozens of academic papers in Linguistics debating whether sentences in Tagalog (the most common language of the Philippines) can be properly said to have subjects or not. (Short version: the properties that a subject has in English can often be split up between two noun phrases, the "topic" and the "agent", in other languages.)
Are written using logograms (Chinese; each symbol stands for a word or a morpheme, as in mean-ing-ful), abjads (Arabic, Hebrew; vowels are not written), syllabaries (Inuktitut; each symbol represents a syllable), abugidas (the languages of India and Ethiopia; vowels are written as attachments to consonants), or a hodgepodge of everything (ancient Egyptian and modern Japanese), instead of an alphabetic writing system. And not all writing systems include the concepts of upper and lower case (most don't), cursive writing (for instance, all Arabic writing is cursive, while in Hebrew the "cursive" script is non-connecting) and/or punctuation, and if they have them, they may not use them the same way. (For example, German capitalizes all Nouns, proper or not, and a few other languages only capitalize the first letter in a work's title, e.g. A wrinkle in time instead of A Wrinkle in Time.)
- Some languages (such as Serbian, which uses Latin and Cyrillic) have two or more writing systems that are all considered official but are not mixed within the same text (unlike Japanese, which uses hiragana, katakana, and kanji side by side), so native speakers choose their preferred writing system.
- Korean Hangul is a very fascinating one: each written block stands for one syllable, but the block is a combination of the letters for the sounds it contains, and each letter's shape is itself a "code" describing how that sound is pronounced, so it behaves as both a syllabary and an alphabet. It sounds complicated, but it's very logical in use.
- But wait, there's more! Even good old Latin script, in the process of adjusting it to all the sounds Latin doesn't have (see below), has acquired diacritical signs that modify the letters. And yes, they are important. If, for example, you receive an SMS from a Polish friend containing the word "maz", it may take you a while to work out whether she meant "maź" (goo), "maż" (imperative, doodle! or smear!), or perhaps "mąż" (husband).
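The way Hangul syllable blocks combine individual letters, mentioned above, is regular enough that Unicode lays the blocks out arithmetically: 19 initial consonants, 21 vowels and 28 final slots (including "no final") starting at code point U+AC00. The sketch below uses that layout to build and take apart a block; the function names `compose` and `decompose` are invented for this example, and it only handles modern precomposed syllables.

```python
# Letters in the order Unicode uses for its precomposed syllable blocks.
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 initial consonants
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 vowels
TAILS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # no final + 27 finals
BASE = 0xAC00  # code point of the first syllable block


def compose(lead, vowel, tail=""):
    """Build one syllable block from its letters."""
    index = (LEADS.index(lead) * len(VOWELS) + VOWELS.index(vowel)) * len(TAILS)
    return chr(BASE + index + TAILS.index(tail))


def decompose(syllable):
    """Split one syllable block back into (lead, vowel, tail)."""
    index = ord(syllable) - BASE
    if not 0 <= index < len(LEADS) * len(VOWELS) * len(TAILS):
        raise ValueError("not a precomposed Hangul syllable")
    lead, rest = divmod(index, len(VOWELS) * len(TAILS))
    vowel, tail = divmod(rest, len(TAILS))
    return LEADS[lead], VOWELS[vowel], TAILS[tail]


print(compose("ㅎ", "ㅏ", "ㄴ"))   # h + a + n
print(decompose("글"))            # the block for "geul"
```

Because every block is just three indices packed into one number, about eleven thousand possible syllables come from only a few dozen letters, which is the "both a syllabary and an alphabet" effect in miniature.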
Use methods other than spaces for dividing words. Many, such as Japanese and Chinese, have no divisions at all. Other options include interpuncts (Classical Latin), special characters at the endings of words (Hebrew), or even elevating the first character in each new word (Persian). German is also famous for not having spaces in its noun compounds, though in reality these compounds are grammatically more or less the same as English phrases like magical girl anime fan; the main difference is orthography (where you put spaces in writing), not grammar proper.
Possess writing directions other than the most common left-to-right and top-to-bottom, such as right-to-left and top-to-bottom (Arabic, Hebrew), vertical columns read top to bottom and ordered left to right (traditional Mongolian, Old Uyghur), or vertical columns ordered right to left (traditional Chinese and Japanese). Beyond that is boustrophedon, changing direction with each line (Ancient Greek, Archaic Latin), which, while common in antiquity, is used by no modern natural language. Then there are languages that can be written in multiple directions, or that are shifting towards left-to-right and top-to-bottom as a result of Western influence.
Follow a different syllabic stress pattern than English. A case in point: when faced with an unfamiliar word of more than two syllables, English speakers tend to stress the next-to-last syllable, with a secondary stress two syllables before that if the word is long enough. Other languages may prefer other stress patterns. Word stress patterns are particularly ingrained habits, and it is sometimes quite difficult to adapt to a different language's "defaults"; writers creating a language will rarely choose stress patterns they find difficult or "unnatural".
Use pitch and changes thereof as elements of meaning in words. While Mandarin Chinese is the most famous example, numerous African languages also possess this property, where changing the pitch at which you pronounce a set of phonemes can completely change the meaning of those phonemes.
Form compound nouns differently.
- Most languages put the base noun at the back, but there are languages which put it at the front. As an example, control CENTER would be translated as PUSAT kawalan in Malay.
- Many languages can't form compound nouns at all the way English does (that is, just by stringing nouns together). They have to inflect the modifier nouns to distinguish them from the base noun, turn the modifier nouns into adjectives, or form elaborate phrases to convey the meaning. The same example, "control center", would be rendered into Russian as "центр управления" (literally "center of control"), not "управление центр" or "центр управление".
- And in Polish, the position of the adjective depends on whether the phrase is ad hoc or fixed in the language as a proper name: forest elephant (as in, a species) would be "słoń leśny", with the adjective after the noun, but that green elephant over there (as in the specific animal we're seeing right now) would be "ten zielony słoń, o tam", with the adjective before it.
Have idioms and allusions that make no sense to a non-native speaker. Even languages that are closely related to English have turns of phrase that are completely incomprehensible without a native to explain their use, such as the French avoir les dents longues ("to have long teeth", meaning "to be ambitious") or the German Ich werde dir die Daumen drücken ("I'll squeeze my thumbs for you", meaning "I wish you luck"). Languages of vastly different derivation, evolving in a wildly foreign cultural matrix, can (and do!) have idioms that make even less sense to the outsider — and nonhuman/alien idioms may be utterly impenetrable even with native help.
Similarly, have a different concept of what constitutes "blasphemous", "obscene" or "offensive" language. Different body parts, functions or gestures (or none at all) may be offensive to native speakers; other obscenities will be culturally based, derived from the religious, social and/or political matrix in which the language evolved. This can be seen even between English-speaking cultures: it was noted once that Catholics tended toward religion-based oaths, while Protestants swore by bodily functions. And Americans generally have no idea why some Brits consider "bloody" such an offensive adjective that in the Victorian era it was frequently replaced with "ruddy", and its use still gets reprimands in some quarters today. Further, a dialect may encode a language's obscenities into unrecognizability; see the "Cockney Rhyming Slang" section of the British English page. And some obscenities may well be fossils: words or usages which carry offense only because "everybody knows they're dirty", despite the reason for this common knowledge being long forgotten. In more extreme cases, entire tenses, moods or categories may be offensive, perhaps under complex rules governing time, place and speaker.
- While most languages have words that are considered obscene in any and every situation (for example, it is impossible to use the f-word "politely" in English), swearing in other languages is a much more context-dependent matter. In Japanese, for example, registers of politeness are encoded directly into the grammar, and failure to employ the polite verb conjugation when speaking to a social superior is occasion for great offence; however, using exactly the same sentence when speaking to a social inferior could be construed as tactless, but not technically rude.
Have rare sounds and unusual phonotactics, which can make them sound like The Unpronounceable. Many of the world's languages do not like big clusters of either consonants or vowels: a maximum of about three consonants per vowel, and no more than three vowels in a row, is usual. Russian can be really dickish with odd-sounding consonant clusters, especially with prepositions. Can you say kvrachu or vsmolensk or vtorom or vpragu or sdrugimi or vchera?^note And even Russians shake their heads at Armenians.^note
And above all, do not have all of, and only, the sounds that are found in English. The pronunciation of even closely related languages, like Dutch and German, can only be approximated by English sounds, let alone that of more distant languages, and vice versa: this is of course where foreign accents come from. Even a lot of conlangs still use English's horribly complicated tense/lax vowel system (yet many claim to have five vowels, while English generally has 12 or more), and some of the worse-done relexes and such employ English orthographic conventions as well, writing reed or rede when the speaker says /r\i:d/. And few if any conlangs employ consonants beyond those English possesses, even though such consonants do exist: Xhosa and related African languages, for instance, have three entire groups of click-based consonants which have no counterparts in Indo-European tongues, and the glottal stop, which while present in English is generally not even noticed as a separate "sound", is a common element in many others.
Many languages have sounds which English can closely approximate but does not replicate, and some words containing them form minimal pairs, differing in only that one sound. The most infamous of these are the German nackt (naked) and Nacht (night). Also somewhat infamous are the Russian syn and syr (son and cheese), mat and mat' (checkmate/curse and mother), bit' and byt' (beat and be), semya and sem'ya (seed and family), brat and brat' (brother and take), pil, pyl' and pyl (drank, dust and fervor), mil and myl (dear and washed), ten' and den' (shade and day).
Conversely, many languages have fewer sounds than English. English has a fairly average number of consonants, but like most Germanic languages, it also has an unusually large number of vowels; most languages have significantly fewer. Many non-native speakers cannot differentiate between vowel pairs like cold and called, worm and warm, bold and bald, say and see, ball and bowl, mint and meant. (Even native speakers don't always differentiate some of these.) Similarly, even though English has an average number of consonants, that doesn't mean that its particular consonants are commonly found in other languages. For example, languages with the English "th" sounds /θ ð/ are very uncommon, and the English "R" sound /ɹ/ is also rather rare. (Most other languages have a tap or a trill.)
