Languages: ambiguous parsing

There is one reason computers are great at numbers and awful at languages: the latter are difficult to parse. While complex mathematical operations can be carried out in a well-known order, parsing text can be excruciatingly difficult even for humans.

This is especially true for languages — such as English — that allow long sequences of words to be joined together without prepositions, and that use the same word both as a noun and as a verb.
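To get a feel for how quickly the noun/verb overlap multiplies the possible readings, here is a minimal sketch in Python. The lexicon is a hypothetical toy made up for this illustration, not a real dictionary, and the function only enumerates candidate part-of-speech assignments; it makes no attempt to decide which of them are grammatical.

```python
from itertools import product

# Hypothetical toy lexicon (an assumption for illustration only):
# each word maps to the parts of speech it could take.
LEXICON = {
    "time": ["noun", "verb"],
    "flies": ["noun", "verb"],
    "like": ["verb", "preposition"],
    "an": ["determiner"],
    "arrow": ["noun"],
}

def tag_sequences(sentence):
    """Return every combination of part-of-speech tags for the sentence."""
    words = sentence.lower().split()
    options = [LEXICON[word] for word in words]
    # product() yields one tuple of tags per candidate reading.
    return [list(zip(words, tags)) for tags in product(*options)]

readings = tag_sequences("Time flies like an arrow")
print(len(readings))  # 2 * 2 * 2 * 1 * 1 = 8 candidate taggings
for reading in readings:
    print(reading)
```

Five words already yield eight candidate taggings, and a parser has to rule out most of them from context. An arithmetic expression of the same length has exactly one reading.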

Languages: the strange case of Pirahã and Aymara

In my last post, I wrote about the connections between language and thought, i.e., linguistic relativity or determinism.

In today’s highly globalized world, languages mix and evolve at a much faster pace than ever before. English, for instance, is no longer divided only into British, American and Australian English; we could say that there is a variety of English for almost every other natural language: Spanglish, Chinglish and so on. When French was the de facto lingua franca of diplomacy (and, by extension, of Western Europe), it was not substantially modified by other local languages. When English replaced it, after World War I and especially after World War II, it started changing immediately.

English, especially its American variety, was not originally used only for international diplomacy; rather, as the United States rose as a superpower in many fields (technology, business, etc.), one could argue that its language spread from the bottom up. The average Joe in most other Western countries was exposed to American words: they wore blue jeans, they put coins into juke-boxes, they went to a bar. As English words became naturalized over time, this ultimately led to the creation of what could easily be considered a series of creoles that are for the most part mutually intelligible.

Languages: linguistic relativity, words vs. thought

One of the most intriguing concepts in linguistics is the so-called Sapir-Whorf hypothesis, or linguistic relativity principle. Simply put, it states that the language we speak can influence the way we think. Another common name for this theory is linguistic determinism. There are some subtleties in the usage of these different names (no pun intended), but in order to avoid confusing them and giving wrong information, I’ll refrain from attempting to spell them out. There are many resources online about the details of this topic for those who wish to delve deeper. For the sake of this post, I will freely use the terms interchangeably.

Anybody who has studied a foreign language, even without reaching fluency, has most likely had an experience with the linguistic relativity principle. The more the language in question differs from one’s native language, the more obvious the phenomenon becomes.

Analysis of a misspelling

Some time ago, Lamebook showed a picture that captured my attention. Here it is:

It seems to me that the author of the message is not even a native English speaker. The syntax of the phrase is unusual; nobody fluent in the language would say “I do apologise,” unless someone complained about not getting an apology in the first place. Moreover, while “inconvenence,” “mechines” and “workin” might be a direct spelling of the local parlance, there is no way that “apologise” would be written “applogies.” Misspellings are usually homophones or quasi-homophones of the correct attested variants, but “applogies” has an entirely different pronunciation than “apologise.”

What is interesting to note is that the author might, however, be familiar with the British usage of the ending -ise. The caption of the picture does indeed mention KFC Byker, and Byker is a ward of Newcastle upon Tyne in England. On the other hand, the -s ending in “applogies” might stem from confusion with the plural ending; even in that case, though, the unlikely singular “applogy” was pluralized correctly, rather than turning into “applogys.”

Also note that the author has no problems writing shorter words such as “about,” “thank,” “but” and the never-mistreated-enough “are,” which oftentimes magically turns into “our.” It is indeed a fact that shorter words are more easily remembered, if only because they tend to be more common. Even so, I am entirely unable to guess where the author of the sign might be from.

More than the misspellings, though, what I find annoying is the comment of the person who posted (and presumably took) the picture: “The intelligence levels at kfc byker are sooo high! Lmfaooo.” The person who wrote the sign is ignorant, in that he or she doesn’t know English well enough, but talking about lack of intelligence is a bold and inappropriate claim, to say the least. It might make sense (from the point of view of logic) only if someone kept making the same spelling mistakes over and over, even after being instructed properly.

The line between completely different concepts should not be crossed. Intelligence and ignorance are not the same. Saying so — or implying so — is not only Orwellian, but also plain wrong. At least the person who misspelled the sign is likely a foreigner and can be excused!