How can a computer understand human language?

Human language appears in two different forms: spoken and written. While this article mainly focuses on written language, the techniques discussed also apply to spoken language. Besides form, there are two different modes in which human language can be used to communicate: dialogue and text. Both will be covered in this article.

How can a computer understand human language? Before we can answer this question, we first have to discuss the different levels of language understanding. There are three main levels:

- Level 1: Syntactic Understanding
- Level 2: Semantic Understanding
- Level 3: Pragmatic Understanding

For each level we will give an explanation and examples.

Syntactic Understanding

For each message we want to convey in human language, we have to follow a "communications protocol", which is defined by the grammar of the language. This grammar contains rules that state how words should be combined to construct valid sentences. For example, in English, an adjective should occur before the noun it modifies, as in the phrase "big dog". Thus, a phrase like "dog big" is incorrect according to this rule.

A sentence is a string of words that have been combined according to the rules of the grammar. Thus, behind every sentence there is an implicit structure implied by the grammar. We call this structure the syntactic structure of the sentence. Consider for example the following sentence:

"John gives Mary a big book."

Each word in this sentence has a certain role:

| Word | Role |
| --- | --- |
| John | PROPER NOUN |
| gives | VERB |
| Mary | PROPER NOUN |
| a | ARTICLE |
| big | ADJECTIVE |
| book | NOUN |
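
The word roles listed above can be looked up mechanically. The following is a minimal sketch, assuming a tiny hand-written lexicon; `LEXICON`, `tokenize` and `candidate_roles` are hypothetical names chosen for this illustration and do not come from any existing library. A plain lookup can only list the roles a word may take, so an ambiguous word such as "book" keeps both candidates until the sentence structure is analysed.

```python
# Hypothetical toy lexicon: each word maps to the roles it can take.
LEXICON = {
    "john": ["PROPER NOUN"],
    "mary": ["PROPER NOUN"],
    "gives": ["VERB"],
    "a": ["ARTICLE"],
    "big": ["ADJECTIVE"],
    "book": ["NOUN", "VERB"],  # ambiguous without syntactic context
}


def tokenize(sentence):
    """Split a sentence into words, dropping a trailing period."""
    return sentence.rstrip(".").split()


def candidate_roles(sentence):
    """Return (word, possible roles) pairs using plain lexicon lookup."""
    return [(word, LEXICON.get(word.lower(), ["UNKNOWN"]))
            for word in tokenize(sentence)]


for word, roles in candidate_roles("John gives Mary a big book."):
    print(f"{word:6} {' or '.join(roles)}")
```

The lookup is deliberately naive: it does not decide between the candidate roles, which is the job of the syntactic analysis.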
Furthermore, according to the grammar of the language, the words combine into phrases, which also have certain roles:

| Words | Phrase | Role |
| --- | --- | --- |
| John | NOUN PHRASE | SUBJECT |
| Mary | NOUN PHRASE | OBJECT |
| a big book | NOUN PHRASE | INDIRECT OBJECT |
| gives Mary a big book | VERB PHRASE | |
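
To make the idea of combining words into phrases concrete, here is a minimal sketch of a parser for a toy grammar that covers only sentences shaped like the example. The grammar (a sentence is a noun phrase followed by a verb phrase; a verb phrase is a verb followed by two noun phrases; a noun phrase is either a proper noun or an article, optional adjectives and a noun), the lexicon and all function names are assumptions made for this illustration, not a description of any particular parsing technology.

```python
# Hypothetical toy lexicon; "book" is listed as both NOUN and VERB.
LEXICON = {
    "john": ["PROPER NOUN"], "mary": ["PROPER NOUN"],
    "gives": ["VERB"], "a": ["ARTICLE"],
    "big": ["ADJECTIVE"], "book": ["NOUN", "VERB"],
}


def roles(word):
    """Roles a word may take according to the lexicon."""
    return LEXICON.get(word.lower(), [])


def parse_np(words, i):
    """Read a noun phrase at position i; return (tree, next position) or None."""
    if i < len(words) and "PROPER NOUN" in roles(words[i]):
        return ("NOUN PHRASE", [("PROPER NOUN", words[i])]), i + 1
    if i < len(words) and "ARTICLE" in roles(words[i]):
        children = [("ARTICLE", words[i])]
        i += 1
        while i < len(words) and "ADJECTIVE" in roles(words[i]):
            children.append(("ADJECTIVE", words[i]))
            i += 1
        if i < len(words) and "NOUN" in roles(words[i]):
            children.append(("NOUN", words[i]))
            return ("NOUN PHRASE", children), i + 1
    return None


def parse_vp(words, i):
    """Read a verb followed by two noun phrases; return (tree, next position) or None."""
    if i >= len(words) or "VERB" not in roles(words[i]):
        return None
    children = [("VERB", words[i])]
    i += 1
    for _ in range(2):
        result = parse_np(words, i)
        if result is None:
            return None
        phrase, i = result
        children.append(phrase)
    return ("VERB PHRASE", children), i


def parse_sentence(sentence):
    """Return the syntactic structure, or None if the grammar rejects the sentence."""
    words = sentence.rstrip(".").split()
    subject = parse_np(words, 0)
    if subject is None:
        return None
    predicate = parse_vp(words, subject[1])
    if predicate is None or predicate[1] != len(words):
        return None
    return ("SENTENCE", [subject[0], predicate[0]])


def show(tree, depth=0):
    """Print the tree with one level of indentation per level of hierarchy."""
    label, content = tree
    if isinstance(content, str):
        print("  " * depth + f"{label}: {content}")
    else:
        print("  " * depth + label)
        for child in content:
            show(child, depth + 1)


tree = parse_sentence("John gives Mary a big book.")
show(tree)
```

Because the grammar only allows a noun in the final position of a noun phrase, the parser settles on the noun reading of "book" here, and a word order such as "a book big" is rejected.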
The analysis we have performed here is called a syntactic analysis, and its result can be visualised as a hierarchical structure:

[Figure: hierarchical syntactic structure of the sentence "John gives Mary a big book."]

This structure is the syntactic structure of the sentence, and being able to construct such a structure implies a syntactic understanding of the sentence. Note that the word "book" can act as both a verb and a noun. In this particular sentence "book" acts as a noun, but this can only be determined by a syntactic analysis. A syntactic analysis can be performed by a computer using a technology called Natural Language Parsing.

Semantic Understanding

Understanding the syntactic structure of a sentence does not imply an understanding of its meaning (also called the semantics of the sentence). The next step is to determine the meaning of the phrases identified by the syntactic analysis and how they relate to each other. The following picture shows this:

[Figure: semantic structure of the sentence "John gives Mary a big book."]

The meaning of this picture can be described as follows: there are three different objects: "John", "Mary" and "book". These objects participate in a "give" relation in which "John" is the AGENT, "Mary" the RECIPIENT and "book" the INSTRUMENT. Furthermore, the "book" object is modified by the attribute "big", which indicates that its size is large.

The analysis we have just performed is called a semantic analysis, and the picture a semantic structure. A semantic analysis can be performed by a computer using a technology called Semantic Analysis. Using other knowledge resources, the semantic structure can be decorated with more knowledge about the objects, deepening the level of semantic understanding:

[Figure: the semantic structure decorated with additional knowledge about the objects]
The added information in the semantic structure can be described as follows: "John" is a male and "Mary" is a female, and both are persons. The "give" relation is a transaction. Finally, a "book" has a cover and pages, which consist of paper.

A semantic analysis can be performed on a single sentence, but also on a text. Suppose our example sentence is the first sentence of a text and that the next sentence is:

"She tears a page from it."

Our existing semantic structure can be augmented with the semantic analysis of this sentence:

[Figure: the semantic structure augmented with the analysis of "She tears a page from it."]
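
One possible way to hold such a semantic structure in a program is as a set of (subject, relation, object) triples. The sketch below is an illustration under several assumptions: the triple format, the relation names `IS_A`, `HAS`, `CONSISTS_OF` and `ATTRIBUTE`, the roles `THEME` and `SOURCE` for the second sentence, and the very simple pronoun resolution are all choices made here, not taken from the analysis technology described in this article. All triples are written by hand rather than produced by an actual analysis.

```python
# A semantic structure as a set of (subject, relation, object) triples.
structure = set()


def add(subject, relation, obj):
    structure.add((subject, relation, obj))


# Sentence 1: "John gives Mary a big book."
add("give", "AGENT", "John")
add("give", "RECIPIENT", "Mary")
add("give", "INSTRUMENT", "book")
add("book", "ATTRIBUTE", "big")

# Decoration with background knowledge (hand-written for this sketch).
add("John", "IS_A", "male")
add("Mary", "IS_A", "female")
add("male", "IS_A", "person")
add("female", "IS_A", "person")
add("give", "IS_A", "transaction")
add("book", "HAS", "cover")
add("book", "HAS", "page")
add("page", "CONSISTS_OF", "paper")

MENTIONED = ["John", "Mary", "book"]  # objects introduced by sentence 1


def is_a(entity, category):
    """Follow IS_A links transitively, e.g. Mary -> female -> person."""
    frontier, seen = [entity], set()
    while frontier:
        current = frontier.pop()
        if current == category:
            return True
        if current in seen:
            continue
        seen.add(current)
        frontier.extend(o for s, r, o in structure
                        if s == current and r == "IS_A")
    return False


def resolve(pronoun):
    """Map a pronoun to a mentioned object if exactly one candidate fits."""
    if pronoun == "she":
        matches = [e for e in MENTIONED if is_a(e, "female")]
    elif pronoun == "it":
        matches = [e for e in MENTIONED if not is_a(e, "person")]
    else:
        matches = []
    return matches[0] if len(matches) == 1 else None


# Sentence 2: "She tears a page from it." augments the same structure.
add("tear", "AGENT", resolve("she"))
add("tear", "THEME", "page")          # assumed role name
add("tear", "SOURCE", resolve("it"))  # assumed role name

for triple in sorted(structure):
    print(triple)
```

Because both sentences write into the same set, the "tear" relation ends up connected to the "Mary" and "book" objects that the first sentence introduced.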
One can imagine that the semantic structure of an entire text can be constructed by repeatedly augmenting the semantic structure in this way. The resulting "web" of objects and relations represents the meaning of the text, and being able to construct it implies a semantic understanding of the text. Purely as an illustration, the following picture shows a computer-generated semantic structure:

[Figure: a computer-generated semantic structure]
A very large and complicated web would be almost impossible for a human to untangle, but a computer can easily cope with it. Furthermore, a computer can perform all kinds of (mathematical) operations on the semantic structure and, in doing so, almost literally play with the meaning of a text!

Pragmatic Understanding

For the analysis of a text, a semantic analysis is usually sufficient. In dialogues, however, people often do not literally say what they mean. Consider for example the sentence:

"Could you turn the light on?"

Obviously, the speaker means something like "press the light switch", but the semantics of the sentence is something like "are you capable of increasing the amount of light?". The former is called the pragmatic meaning, and the latter the semantics of the sentence. Sometimes the gap between the pragmatic meaning and the semantics is even wider, as in the sentence:

"It is dark in here."

Possibly, the speaker means exactly the same as in our first example: "press the light switch". To cope with such utterances, a computer has to make a mapping between the semantic analysis of a sentence and its pragmatic meaning:

[Figure: mapping between the semantic analysis of an utterance and its pragmatic meaning]
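
A mapping from semantics to pragmatic meaning could, in its simplest form, be a handful of rules over a small knowledge table. The sketch below assumes that a semantic analysis has already turned each utterance into a small dictionary ("frame"); the frame layout, the two knowledge tables and the function name are hypothetical and were written by hand for the light-switch example.

```python
# Hand-written knowledge for the "lighting" domain (assumed for this sketch).
# Which action achieves a goal:
ACTION_FOR_GOAL = {("increase", "light"): "press the light switch"}
# Which observed state suggests which goal:
GOAL_FOR_STATE = {("light", "low"): ("increase", "light")}

# Hand-written semantic frames for the two example utterances.
COULD_YOU = {  # "Could you turn the light on?"
    "type": "question",
    "about": "ability",
    "action": ("increase", "light"),
}
IT_IS_DARK = {  # "It is dark in here."
    "type": "statement",
    "state": ("light", "low"),
}


def pragmatic_meaning(frame):
    """Map a semantic frame to the action the speaker probably wants."""
    if frame.get("type") == "question" and frame.get("about") == "ability":
        # A question about ability is read as a request for the action.
        goal = frame.get("action")
    elif frame.get("type") == "statement":
        # An observation is read as a request if the state suggests a goal.
        goal = GOAL_FOR_STATE.get(frame.get("state"))
    else:
        goal = None
    return ACTION_FOR_GOAL.get(goal)


print(pragmatic_meaning(COULD_YOU))
print(pragmatic_meaning(IT_IS_DARK))
```

Both utterances end up at the same action even though their semantic frames differ, and a frame the knowledge tables do not cover yields `None`, which reflects how strongly the mapping depends on the knowledge that was supplied for the domain.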
For a computer to be able to make this mapping, it has to understand that a light switch is a device for turning on the light, and that an observation about the low light level could be a request to increase it. Being able to construct such mappings implies a pragmatic understanding of the utterance. The technology to construct these mappings is called Pragmatic Analysis.

The knowledge that is required to construct the mappings is called domain-specific knowledge, or world knowledge. At the current level of technology, it is not yet possible to construct domain-specific knowledge automatically. The technology to achieve this is called Machine Learning and is still in its infancy. Therefore, domain-specific knowledge has to be constructed manually for each specific domain. Consequently, it is not yet possible for a computer to converse about every conceivable topic. However, computers that converse about topics in a specific domain (for which the domain-specific knowledge has been constructed) are definitely within our grasp!

Conclusion

Now let us return to the question: "How can a computer understand human language?" The answer is now apparent: a computer can understand human language by applying a Syntactic Analysis, a Semantic Analysis, and optionally a Pragmatic Analysis.

Glossary

| Term | Description |
| --- | --- |
| proper noun | name |
| article | "the" or "a" |
| adjective | a word that modifies a noun by specifying an attribute. Example: "he hit the red ball" |
| noun phrase | part of a sentence that describes a concept or object. Example: "he hit the red ball" |
| verb phrase | part of a sentence that describes an action. Example: "he hit the red ball" |
| subject | subject of the main verb. Example: "John gives Mary a book" |
| object | first argument of the main verb. Example: "John gives Mary a book" |
| indirect object | second argument of the main verb. Example: "John gives Mary a book" |
| agent | the entity that performs the action. Example: "the red ball was hit by him" |
| recipient | the entity that is the recipient of an action. Example: "he hit the red ball" |
| instrument | the entity that is used as an instrument in an action. Example: "he hit the red ball with his hand" |