Useful Notes: Chinese Language

Good God, China. All about symbols. Couldn't even make the alphabet.

This article will focus on Mandarin Chinese, the most widely spoken language in China. There is another article on the other languages/dialects, including Cantonese.

Mandarin is generally considered a 'difficult' language for Westerners to learn. There are a number of factors behind this, but one of the primary reasons is probably the writing system, which has a large number of characters in comparison with most other languages. These are known as 汉字/漢字 hànzì, literally meaning "Han characters", or "characters of the Han people", the Han being the largest ethnic group within China (and, indeed, the world); these same characters remain at the core of the Japanese Writing System (where they are known as kanji), and once formed the principal writing systems for the Korean and Vietnamese languages (which pronounced the same word as hanja and hán tự). There are two main variants of the character set: the "simplified" set used in Mainland China, Malaysia and Singapore, and the "traditional" set used in Taiwan and Hong Kong. (These differences are covered in more detail later in the page; suffice it to say, choosing the appropriate set to use can become Serious Business.) On this page, characters which differ between the two are presented with the simplified version first, for example: 发/發 fā. (If there's only one character given, it can be assumed that it's the same in both cases.)

Further difficulties stem in part from the fact that Mandarin includes a number of phonemes (sounds, basically) not found in, say, English. For example, Standard Mandarin has two distinct sh sounds where English has only one. This can work the other way as well, creating that 'flied rice' accent. While some people may think that r and l are allophones, as in Japanese, it isn't quite as simple as that. Standard Mandarin does have two distinct sounds corresponding to the English r and l. The Mandarin r, though, can be quite different from the English one: it has a 'buzzy' quality, sounding like something between a French r and the s in measure, depending on the dialect; in Standard Mandarin, it is usually the voiced retroflex sibilant, which speakers of Polish should recognize as the consonant "rz". Most Mandarin speakers should be able to perceive l and r as distinct sounds, but they may have difficulty pronouncing them in consonant clusters, which are common in English but don't occur in Mandarin at all. So it's quite possible that a Mandarin speaker would struggle with saying words such as flight and fright distinctly, but not with lice vs rice. The consequences of this for romanizing Chinese are discussed in another article. (But to cut a long story short, most of the tonal languages in the world, and there are a couple of thousand of them, use the Latin alphabet without difficulty; Vietnamese is a good example. The same applies to monosyllabic languages, although they are fewer and many of the best-known examples use their own script, and not necessarily a pictorial one: Burmese, Thai, Khmer and many others have an alphabet of their own or at least a syllabic script. On the other hand, most tonal languages don't have a native script that evolved along with the language and is structured around its needs.)

Further, Chinese is a tonal language. Mandarin Chinese uses four tones and a neutral tone. A student not accustomed to tonal speech can easily mishear what is intended or form strange malapropisms just by not paying attention to the tone. As an example, the words for 'mother' (妈/媽 mā), 'to scold' (骂/罵 mà), 'hemp' (麻 má), and 'horse' (马/馬 mǎ) are distinguished only by a change in tone (now there's an international incident just waiting to happen).
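
In pinyin, the four tones are conventionally written either with a diacritic over a vowel or with a trailing digit (ma1, ma2, ma3, ma4). The following is a minimal Python sketch, not a complete pinyin library, that converts a single numbered syllable into its accented form. The function name `numbered_to_marked` is made up for illustration, and the placement rule it uses (a or e takes the mark, then the o of ou, otherwise the last vowel) is the usual textbook simplification rather than an exhaustive treatment.

```python
import unicodedata

# Combining diacritics for tones 1-4; any other digit is treated as the neutral tone.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030C", "4": "\u0300"}
VOWELS = "aeiouü"


def numbered_to_marked(syllable: str) -> str:
    """Convert one numbered pinyin syllable, e.g. 'ma3', to its accented form."""
    if not syllable or not syllable[-1].isdigit():
        return syllable  # no tone digit: nothing to do
    base, tone = syllable[:-1], syllable[-1]
    mark = TONE_MARKS.get(tone)
    if mark is None:
        return base  # neutral tone: just drop the digit
    lowered = base.lower()
    if "a" in lowered:
        pos = lowered.index("a")
    elif "e" in lowered:
        pos = lowered.index("e")
    elif "ou" in lowered:
        pos = lowered.index("ou")  # the mark goes on the o
    else:
        vowel_positions = [i for i, ch in enumerate(lowered) if ch in VOWELS]
        if not vowel_positions:
            return base  # no vowel to carry the mark
        pos = vowel_positions[-1]
    marked = base[: pos + 1] + mark + base[pos + 1 :]
    # Fold the base letter and combining mark into one precomposed character.
    return unicodedata.normalize("NFC", marked)


# The four words from the text, keyed by numbered pinyin.
words = {"ma1": "mother", "ma2": "hemp", "ma3": "horse", "ma4": "to scold"}
for numbered, meaning in words.items():
    print(numbered_to_marked(numbered), "=", meaning)
```

Keeping the tone as a digit until display time makes it easy to compare syllables with and without tone, which is roughly the distinction a learner has to train their ear for.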

In addition to confusion between tones, Mandarin has true homophones, which actually do sound alike, including tone. In one slightly odd case, the words for "needle" (针/針), "gizzard" (胗), "rare" (珍), and "true" (真) are pronounced alike (zhēn). There are even multi-syllable true homophones; for example, qīngdàn can mean "light in flavor/faint" (清淡), or "H-bomb" (氢弹/氫彈) depending on the characters used to write it. With insufficient context or attention, even native speakers can sometimes mistake one another's meaning, but no more often than in any other language. Having an idea of which words and phrases are most commonly used by Mandarin speakers, and in which contexts, goes a long way toward getting a better understanding of what's being said (although again, this is probably true for other languages as well).

In addition, Chinese is almost completely uninflected. There are no verb or noun endings to reflect tense, number or grammatical case. One exception is the 'word' 们/們 men, which is attached to pronouns to indicate a plural (我 wǒ ('I') becomes 我们/我們 wǒmen ('we')). A verb's tense is indicated by context, usually by stating when it was done or will be done; this can be construed as Chinese having only three tenses (past, present, and future) with various offshoots. Aside from this idiosyncrasy, word order is usually similar to English's subject-verb-object order. In fact, a sentence written in English and translated word-for-word into Mandarin might look a bit odd to a native speaker, but would probably be perfectly understandable. To give an idea of this, the sentence 我跟朋友走去公园/我跟朋友走去公園 wǒ gēn péngyǒu zǒu qù gōngyuán would translate word-for-word into the odd-sounding "I with friend walk go park" ("I'm walking to the park with my friend").

The third-person pronoun has separate written forms for male, female, and neuter, all of which sound exactly alike when spoken. Before modern times there was only one written form of the third-person pronoun, 他 tā; the separate written forms were introduced under the influence of Western languages. This can lead to Pronoun Trouble for native speakers of Chinese who learn new languages with gendered pronouns.

The 'classifier' or 'measure word' is yet another feature likely to give trouble to students of Chinese. These are a class of words which can have very general meanings and, in fact, can in many cases simply be omitted when translating Chinese. Nevertheless, they are still an essential part of Mandarin grammar. Simply put, a measure word indicates the class of objects to which a number refers. English does it on occasion; you say you have "four loaves of bread" instead of just "four breads." Well, in Chinese, you have to do that with everything, which is simultaneously more nitpicky and more precise. Four trees would be 四棵树/四棵樹 sì kē shù while four cars would be 四辆车/四輛車 sì liàng chē. 四 sì means four, 树/樹 shù means tree, and 车/車 chē means car; 棵 kē and 辆/輛 liàng are the measure words. Using the wrong measure word for something can be a bit embarrassing ("I have four terabytes of bread"), especially if one uses one of the words for animals on people instead ("I have four flocks of priests"). Thankfully, the measure word 个/個 gè can stand in for nearly any other measure word in a pinch, functioning in many ways as a generic measure word (e.g. 我有四个朋友/我有四個朋友 - "I have four [units of] friends").
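
The "specific measure word if you know it, generic one otherwise" behaviour described above can be sketched in a few lines of Python. The table contains only the pairings mentioned in this section, and the names `MEASURE_WORDS` and `count_phrase` are hypothetical, chosen just for illustration; a realistic table would be far larger and would have to deal with nouns that accept several measure words.

```python
# Noun -> measure word, limited to the examples given in the text (simplified characters).
MEASURE_WORDS = {
    "树": "棵",  # trees
    "车": "辆",  # cars
}
GENERIC_MEASURE_WORD = "个"  # the catch-all measure word

NUMERALS = {4: "四"}  # only the numeral used in the text


def count_phrase(number: int, noun: str) -> str:
    """Build 'numeral + measure word + noun', falling back to 个 for unlisted nouns."""
    numeral = NUMERALS[number]
    measure = MEASURE_WORDS.get(noun, GENERIC_MEASURE_WORD)
    return numeral + measure + noun


print(count_phrase(4, "树"))    # 四棵树
print(count_phrase(4, "车"))    # 四辆车
print(count_phrase(4, "朋友"))  # 四个朋友 (no specific entry, so the generic 个 is used)
```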

Interestingly enough, exclamation points and question marks can be included as words in the sentence. These are known as particles, and are typically added to the end of a sentence. The a (啊) (pronounced 'ah!') sound that Chinese people supposedly make expresses surprise, doubt, agreement, or affirmation depending on the tone used. ma (吗/嗎) (yes, another word to mix up with 'mother' and 'horse') is used to express a question. There are many other useful particles, including 吧 ba, which is used to imply politeness when making suggestions. There is also 呢 ne, which roughly means "How about...", and which is commonly used in response to "你好吗/你好嗎? (nǐ hǎo ma)" ("How are you/are you well?") - "好,你呢? (hǎo, nǐ ne)" ("I'm fine, how about you?")

The MediaGlyphs Project is a cross-cultural language that uses Chinese grammar and easily recognized pictures to aid in translation from one natural language to another.

Chinese Writing

Chinese uses a logographic writing system with thousands of characters, each of which represents a syllable, although often two characters are pronounced the same but have different meanings. Only the most basic characters, such as the ones for 'sheep' or 'door', bear even a vague resemblance to physical things.

The writing system is ancient: dating back at least five thousand years, it is the oldest writing system known to be still in use. As one might expect, many, many styles of writing the characters have been developed over the years; the split between Traditional and Simplified Chinese is the most recent. When written in vertical columns, Chinese is generally read from top to bottom, right to left. When written horizontally, as on a shop sign, it is usually read from left to right, though older signs (on temples, for example) may read from right to left.

Despite the daunting number of characters, there is a certain twisted logic to their construction. Many of the more complex characters are composed of simpler characters and these often give a clue as to the meaning or pronunciation of the whole. Indeed, when describing a character that has several homophones, such as a surname, one often lists the component pieces if precision is necessary or the character is uncommon. Just to add to the confusion, some characters have multiple readings and the pronunciation 'hints' may not be valid in modern speech. Unlike with Japanese, Alternate Character Readings are rarely drawn on for puns.

Generally, upon meeting an unknown character, one looks at the uppermost or leftmost element for some clue to its meaning. There are a number of standard 'heads' and sides ("radicals") that indicate general categories like the 'Grass' radical for plants (other than trees) or the 'Walking' radical for movement. The words for 'to flee' and 'peach' are homophones. Both characters are composed of two components: 'Peach' (桃 táo) has a 'wood' radical on the left, while 'flee' (逃 táo) has the walking radical; both share a common element on the right (兆 zhào, giving some indication of the pronunciation).
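
The meaning-hint-plus-sound-hint structure just described can be modelled as a small data structure. This is only a sketch in Python: the `Character` class and its field names are invented for illustration, and writing the radicals in their component forms (木 for 'wood', 辶 for 'walking') is an assumption about how one might label them, not something the text specifies.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Character:
    glyph: str
    meaning: str
    radical: str   # component hinting at the meaning
    phonetic: str  # component hinting at the pronunciation


# The two example characters from the text.
CHARACTERS = [
    Character("桃", "peach", radical="木", phonetic="兆"),
    Character("逃", "to flee", radical="辶", phonetic="兆"),
]

# Group characters by their shared sound-hinting component.
by_phonetic = defaultdict(list)
for char in CHARACTERS:
    by_phonetic[char.phonetic].append(char.glyph)

print(dict(by_phonetic))
```

Grouping by the phonetic component puts 桃 and 逃 in the same bucket, while the radical field is what tells them apart.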

In writing hanzi, the strokes that make up each character have a set order. In general, a character is built up from left to right, top to bottom, though certain radicals are always written after the rest of the character is complete. This is most important in calligraphy and has been somewhat superseded by the introduction of pinyin-based typing.

Each character generally corresponds to a single sound or 'syllable' in spoken Chinese, which means that even a relatively short line of dialogue can span the entire screen when closed-captioned. While each character can have an intrinsic meaning, many 'words' are short phrases consisting of multiple characters, and similar phrases can have widely different meanings. For example, 火车/火車 huǒchē (lit., "fire vehicle") means train while 救火车/救火車 jiùhuǒchē (lit., "help fire vehicle") means fire truck. One particularly cute compound is the word for panda: 熊猫/熊貓 xióngmāo, which has the literal translation of "bear cat".

As the language lacks an alphabet, Chinese dictionaries tend to be organized by stroke number and the radicals mentioned earlier (as well as other commonly used elements), though the index may also use the sound, often in pinyin or bopomofo. Ironically, this can make simple characters harder to look up. ("Pinyin" is the pronunciation-guide-using-Latinate-alphabet stuff you've seen scattered throughout the article; the Japanese equivalent is Rōmaji. "Bopomofo" is a transliteration system, with each character described by a mark for a consonant and a mark for a vowel.) However, one could argue that smartphone-based Chinese dictionaries such as Pleco, which let the user hand-draw the character they want to look up, are bound to render traditional paper dictionaries obsolete before long. It should also be noted that the common view of Chinese as a language that is next to impossible to master has always been based not so much on the difficulty of understanding the meaning of so many characters per se as on the time it takes to look them up; we can thus say that mastering Chinese in the 2010s is a lot easier than at any time before.

Further complicating the issue is that several different romanization systems have been created for Chinese over the years. This is discussed in detail at another article, but suffice it to say that "kung-fu" and "gōngfu" are different romanizations of the same two-character word (功夫).

The history of Chinese is a long one: 4,000-year-old pictograms carved into tortoise shells have been discovered throughout China. Chinese is generally divided into the Classical and the Modern period, with Classical being everything before the fall of the Qing Dynasty in 1912 and Modern being everything after. As has been noted before, the needs of administering such a large empire largely prevented the official Classical or Literary Chinese from mutating. Bureaucrats and poets self-consciously modeled their writing on the grammar and style of the Spring and Autumn period (771-476 BCE) and on the writings of Confucius in particular. New words and grammar were continuously introduced to the Literary language from the spoken dialects, but at a glacial pace (kind of like how modern Church Latin is different from the Latin of Caesar, but still looks Latin). But in the waning years of the Qing, reformists criticized the increasingly impenetrable Literary Chinese for creating a gulf between the largely illiterate masses and their literate overlords, and held it up as one of the reasons why China had failed so miserably to modernize in the face of colonialist encroachment. Reformist authors such as Lu Xun advocated the use of spoken Chinese as the basis of the written language and popularized the use of Baihua (literally "plain speech", based on the Beijing dialect) through their novels and essays; many of these authors would go on to advocate for the simplification of the Chinese script and the use of Pinyin. In the modern era, the use of Baihua has been encouraged first by the Nationalist and then by the Communist administrations, which have brought the written language even closer to the spoken variant.

This language provides examples of:

  • Alternate Character Reading: Some characters have multiple meanings or sounds. Sort of like the English stock or lead or maybe minute.
    • Hanyu Pinyin has shades of this too - it uses the Latin alphabet to represent the pronunciation of Chinese words. In most cases, this is fairly straightforward: for example ping and ban are pronounced pretty much as you'd expect. However, there are a few letters which are used to represent non-English sounds - sounds like ci or quan aren't pronounced at all like their spelling might suggest.
  • Broken Base: The supporters of Traditional versus Simplified characters. The Simplified supporters like the fact that they can write a paragraph in half the time and that the characters don't turn into illegible inkblots when the font gets too small, while the traditionalists like the hints to meaning and pronunciation that the old-style characters contain and the link to history that they provide. Which set is easier to learn and remember is the subject of much debate, which will not be settled here. Also, as the simplification scheme was promulgated by the mainland Communist government, people in Taiwan and overseas Chinese communities sometimes take offense for political or ideological reasons rather than anything linguistic.
    • Insistent Terminology: The official Chinese term for Traditional characters in Taiwan means "Standard/Orthodox characters", while elsewhere in the Sinophone world they are generally referred to as "Complex characters" (as the character used, 繁, carries the connotation of "frustratingly" complex). Some Simplified supporters insist on using the term "Complex" in English as well, considering the term "Traditional" a misnomer, due to the fact that some characters had been made more elaborate over time and that many Simplified characters are based on traditionally used abbreviated forms.
  • Common Tongue: As indicated by two of the names cited below, Mandarin serves this purpose in a linguistically diverse China.
  • Four Is Death: The Chinese language may be the Trope Maker.
  • Foreigners Write Backwards: As mentioned above, Chinese is generally written vertically, with columns read top-to-bottom from right to left. Seals may even be read in a circle, as shown by the first image at the Other Wiki.
  • I Have Many Names: The most common are Putonghua ("Common Speech") and Guoyu ("National Language"), used on the Mainland and Taiwan, respectively. In parts of the diaspora, Huayu ("Chinese Language", Hua being a name for Chinese culture) is common. Finally, the word "Mandarin" is a rendering of Guanhua, "the speech of officials" from a time when it was the language of government functionaries based around Beijing.
  • Loads and Loads of Characters: The language has over 40,000 characters with a college grad knowing about 5,000. You only need 200 to 500 or so for a basic conversation or skimming a newspaper. As described above, there are rules for deducing the pronunciation of an unknown character, though they are not completely foolproof.
    • In fact, there are characters that have no pronunciation, like 込 (used in Japanese).
  • New Media Are Evil: Many old fogeys have claimed the advent of pinyin-based character inputting has led to a loss of literacy among the younger generation. Whether or not the use of this technology actually has any impact on literacy skills has not been proven.
  • Pronoun Trouble: Primarily in translation since the (spoken) third-person is gender-neutral.
    • In written Chinese "he" and "she" are fairly intuitive, with the left-side radical being "woman" for "she" and "person" for "he" while sharing the same root/base. However, the word for "it" does not resemble the others in any way, as it was originally the Chinese word for "others". (As an aside, in Traditional Chinese even "you" is gender-specific.)
      • The second-person is gender-specific in Taiwan but not in Hong Kong; people in Hong Kong were never taught the female ni.
    • When the overzealous language reformers made up gendered third-person pronouns, they also made up a ta for animals, a ta for inanimate things, and a ta for gods.
    • On the other hand, when translating into Mandarin, you could say Pronoun Trouble is averted for many of the same reasons. If you know 4 syllables: 我 wǒ - I/me, 你 nǐ - you, 他 tā - he/him, and 们/們 men (the plural suffix for making we/us, (all of) you, and they/them), then you know every pronoun you commonly need. And if we throw in the possessive suffix 的 de, we can also express 'mine', 'ours', 'yours', 'his/hers/its' and 'theirs' as well.
  • Pun: Those four tones and the sheer number of true homophones make for loads and loads of these. There's an entire class of jokes called xiehouyu whose punchlines often rely on wordplay.
    • The character for "spring" written upside-down is sometimes seen around the Chinese New Year because this was traditionally considered the start of spring. As it happens, the words for upside-down (倒) and "to arrive" (到) are homophones (dào). So "spring" upside-down = "Spring has arrived."
    • A ridiculous number of Chinese superstitions are based on homophones. There is Four Is Death, as mentioned above; other examples include the rule that pears should never be served at a wedding, because the Mandarin for pear (梨, lí) sounds the same as the word for separation (coincidentally, part of the compound word meaning "divorce"). And fish is usually eaten for New Year's Eve dinner, as the word for fish (鱼, yú) and the word for surplus, i.e. you ended the year with more than you started, are homophones.
    • These are also exploited to avoid censorship: homophones and near-homophones are used to get around government filtering on certain character combinations. The government sometimes cottons on to particularly widespread workarounds, but all in all it's a game of cat and mouse in which the Chinese Internet is always two to three steps ahead of the censors.
    • 鸡: It means chicken (in the sense of the fowl). Change it slightly to make 鸡巴, cock (yes, Gag Penis). And there's also a meaning of "hooker".
  • They Changed It, Now It Sucks:
    • Some supporters of the traditional characters consider Simplified Chinese to be an example, despite the fact that traditional characters are perfectly legible to those who have learned simplified (with a bit of practice) and vice versa.
    • And on the mainland, the second round of simplified characters was eventually withdrawn, after causing 9 years of widespread confusion and disagreement. (More info here.)
  • What Could Have Been: Throughout the 20th century (especially the first half of it) many political and intellectual figures of China would bring up the idea that characters be abolished in favor of a syllabary, akin to the Japanese kana scripts, or the Western alphabet. While most of them acknowledged that characters formed an important part of Chinese culture, they argued that the fact that it takes forever to learn them would mean that China would never become a fully literate country. Some also claimed that characters were difficult and time-consuming to type on a typewriter (which, by the way, is actually possible) and thus presented an obstacle to the country's development. Mao Zedong himself was of the opinion that pinyin would eventually replace characters as the sole means of writing Chinese, but he did not do much to actually make that happen. These days, however, nobody seems to talk about it anymore: China is almost as literate as the countries of the West (about 90% of the population is able to read and write and illiteracy is basically unheard of among teenagers; it should be noted that almost all of the most illiterate countries in the world actually use the Western alphabet) and characters are easy to input into modern computers.