Let us expand on the idea of text analysis derived from rule-based translation. Above is an example of a classic word-based search, in this case for the French word ‘été’. This word is ambiguous because it can be either a common noun (‘summer’) or a past participle (‘been’). Below is an example of a search for ‘été’ restricted to the grammatical type ‘common noun’ (‘summer’).
Finally, below is an example of a search for ‘été’ restricted to the grammatical type ‘past participle’ (‘been’).
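The difference between a word-based search and a search restricted by grammatical type can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sentence is tagged by hand, and the `Token` structure, the `search` function and the type labels are hypothetical names, not those of the software discussed here. In a real system the types would come from the disambiguation step.

```python
from typing import List, NamedTuple, Optional


class Token(NamedTuple):
    position: int  # index of the word in the sentence
    word: str      # surface form as it appears in the text
    gtype: str     # grammatical type (hypothetical label set)


# Hand-tagged sentence "L'été a été chaud" ("The summer has been hot").
# The tags are assumed here; a real system would compute them.
SENTENCE: List[Token] = [
    Token(0, "L'", "article"),
    Token(1, "été", "common noun"),
    Token(2, "a", "auxiliary verb"),
    Token(3, "été", "past participle"),
    Token(4, "chaud", "adjective"),
]


def search(tokens: List[Token], word: str,
           gtype: Optional[str] = None) -> List[Token]:
    """Return the tokens matching `word`, optionally restricted to one type.

    With gtype=None this is a classic word-based search; with a type it
    keeps only the occurrences carrying that grammatical type.
    """
    return [
        t for t in tokens
        if t.word.lower() == word.lower()
        and (gtype is None or t.gtype == gtype)
    ]


if __name__ == "__main__":
    print(search(SENTENCE, "été"))                     # both occurrences
    print(search(SENTENCE, "été", "common noun"))      # the 'summer' reading
    print(search(SENTENCE, "été", "past participle"))  # the 'been' reading
```

The word-based search returns both occurrences of ‘été’, while each typed search returns only one of them.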
Rule-based translation is difficult to implement. The main difficulty is handling groups of words well enough to be on a par with statistics-based translation. The main problems in this regard are (i) polymorphic disambiguation; and (ii) building a fair typology of grammatical types. But once these steps begin to be mastered, there are many advantages. The essential one is that the same piece of software can carry out both machine translation and text analysis. Among the modules that are easy to implement are the following:
type extractor: a module that allows you to extract words from a text according to their grammatical category
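A type extractor of this kind can be sketched as follows, assuming the text has already been disambiguated into (word, grammatical type) pairs. The function names `extract_by_type` and `group_by_type`, the sample data and the type labels are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# A tagged word is a (word, grammatical type) pair.
TaggedWord = Tuple[str, str]

# Hand-tagged sample "L'été a été chaud" ("The summer has been hot").
# The tags are assumed; they stand in for the output of disambiguation.
TAGGED_TEXT: List[TaggedWord] = [
    ("L'", "article"),
    ("été", "common noun"),
    ("a", "auxiliary verb"),
    ("été", "past participle"),
    ("chaud", "adjective"),
]


def extract_by_type(tagged: List[TaggedWord], gtype: str) -> List[str]:
    """Return, in text order, the words whose grammatical type is `gtype`."""
    return [word for word, word_type in tagged if word_type == gtype]


def group_by_type(tagged: List[TaggedWord]) -> Dict[str, List[str]]:
    """Group every word of the text under its grammatical type."""
    groups: Dict[str, List[str]] = {}
    for word, word_type in tagged:
        groups.setdefault(word_type, []).append(word)
    return groups


if __name__ == "__main__":
    print(extract_by_type(TAGGED_TEXT, "common noun"))
    print(group_by_type(TAGGED_TEXT))
```

Once the tagging exists, the extractor itself is a simple filter, which is why such a module is easy to add on top of the translation engine.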
The implementation of rule-based translation provides the machine with some inherent understanding of the text, in the same way that a human being has. To put it in a nutshell, it is better artificial intelligence.
Finally, other, more advanced modules seem possible (to be confirmed).