Category Archives: Semantics: blog

About the typology of machine translation systems

The distinction between rule-based and statistically-based translation may well be artificial and obscure what is really the interesting distinction in machine translation modules. The latter may well lie in the fact that some methods capture (at least partially) the semantics of a text, and are for example able to enumerate lemmas in the text, change […]


What are interjections (Hello! Good evening! Merry Christmas! Happy Birthday!…) in the present framework? They are words preceded by a punctuation mark (period, comma, exclamation mark, question mark, etc.) and followed by a punctuation mark.
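The definition above can be sketched in a few lines of Python. This is only an illustration, not the translator's actual code: the function name `interjection_candidates`, the set of boundary marks, the choice to treat the start of the text as a boundary, and the `max_words` cap are all assumptions introduced here.

```python
import re

# Punctuation marks treated as boundaries; the exact set is an assumption.
BOUNDARY = r"[.,!?;:…]"


def interjection_candidates(text, max_words=2):
    """Return short spans that are both preceded and followed by punctuation."""
    # Prepend a period so the start of the text counts as a boundary (assumption).
    padded = "." + text
    # re.split removes the marks and leaves the spans that sit between them.
    spans = re.split(BOUNDARY, padded)
    # The final span is not followed by a punctuation mark, so it is discarded.
    spans = spans[:-1]
    candidates = []
    for span in spans:
        words = span.split()
        # The length cap is a hypothetical filter to skip ordinary clauses.
        if 0 < len(words) <= max_words:
            candidates.append(" ".join(words))
    return candidates


sample = "Hello! I think it will rain today. Merry Christmas!"
print(interjection_candidates(sample))  # ['Hello', 'Merry Christmas']
```

A purely positional test like this one also flags short vocatives and other brief spans between punctuation marks, so in practice it would probably need to be combined with a lexicon.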

An analysis of the French word ‘très’

According to our analysis, the word ‘très’ is likely to occur in the following grammatical types. Adjective modifier: here, ‘très’ modifies the meaning of an adjective: ‘très beau’ (very beautiful, biddisimu), ‘très content’ (very happy, cuntentissimu). Adverb modifier: here, ‘très’ modifies the meaning of an adverb: ‘très rarement’ = very rarely, raramenti; ‘très souvent’ = […]

Leaving ambiguity unresolved

Disambiguation is an essential process in machine translation. Sometimes, however, it seems more rational and logical to leave an ambiguity in the translation. This is the case when (i) there is an ambiguous word in the sentence to be translated; and (ii) the context does not provide an objective reason to choose one of the […]

Dictionary = Corpus?

As far as machine translation is concerned, the best option seems to be to combine the strengths of the two approaches, rule-based and statistics-based. If the two approaches could be made to converge, the benefit could be great. Let us try to define what could allow such a convergence, based on […]

Grammatical taxonomy again: the case of prepositions

Let’s look at the translation of the French word ‘dont’ (whose). Depending on the case, ‘dont’ can be a relative pronoun: ‘la difficulté dont je t’ai parlé’ (the difficulty I told you about), ‘voilà le professeur dont j’apprécie beaucoup les cours’ (this is the teacher whose classes I really enjoy) or, more rarely, a preposition: ‘il y […]

Creating new grammatical types

Italian has ‘prepositions followed by articles’ (preposizioni articolate). This is a specific grammatical type, which refers to a word (e.g. della) that replaces a preposition (di) followed by an article (la):
(article): il lo l’ la i gli le
di: del dello dell’ della dei degli delle
a: al allo all’ alla ai agli alle
da […]
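The contraction described above lends itself to a simple lookup table. The sketch below is illustrative only and uses just the forms listed for ‘di’ and ‘a’; the names `ARTICLES`, `FORMS`, `contract` and `expand` are hypothetical, and straight apostrophes replace the typographic ones for ease of typing.

```python
# Articles in the order given in the excerpt (straight apostrophes are an
# assumption made here for convenience).
ARTICLES = ["il", "lo", "l'", "la", "i", "gli", "le"]

# Articulated forms for the two prepositions fully listed in the excerpt.
FORMS = {
    "di": ["del", "dello", "dell'", "della", "dei", "degli", "delle"],
    "a": ["al", "allo", "all'", "alla", "ai", "agli", "alle"],
}

# (preposition, article) -> articulated preposition
CONTRACT = {
    (prep, art): form
    for prep, forms in FORMS.items()
    for art, form in zip(ARTICLES, forms)
}

# articulated preposition -> (preposition, article)
EXPAND = {form: pair for pair, form in CONTRACT.items()}


def contract(prep, art):
    """Return the articulated preposition, or None if the pair is unknown."""
    return CONTRACT.get((prep, art))


def expand(word):
    """Return the (preposition, article) pair, or None if the word is unknown."""
    return EXPAND.get(word)


print(contract("di", "la"))  # della
print(expand("agli"))        # ('a', 'gli')
```

Keeping both directions in plain dictionaries makes it easy to add a row for another preposition without touching the functions.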

Evaluation of the performance after changes

Just performed a series of open tests, using the (pseudo-random) article of the day from Wikipedia in French. The results are the following, concerning the Taravese version of the Corsican language: 95.76, 95.76, 94.34, 95.76, 99.25, 95.04, 95.48; that is to say an average of about 95%, taking into account that the ‘cismuntinca’ version generally obtains a slightly lower result, because of the masculine […]
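As a quick check of the arithmetic, the figures above can be read as seven percentages with decimal commas (95.76, 95.76, 94.34, 95.76, 99.25, 95.04, 95.48) and averaged. The variable names below are illustrative, and the reading of the scores is an assumption about how the run of digits should be split.

```python
from statistics import mean

# Scores of the seven open tests, in percent (decimal commas converted to points).
scores = [95.76, 95.76, 94.34, 95.76, 99.25, 95.04, 95.48]

average = mean(scores)
print(f"{average:.2f}")  # 95.91
print(min(scores), max(scores))  # 94.34 99.25
```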

Grammatical word-disambiguation again and again

The main difficulty here seems to lie in the adaptation of the grammatical disambiguation module. Indeed, for the French language, such a module performs disambiguation with respect to about 100 categories. The number of pairs (or 3-tuples, 4-tuples, etc.) of disambiguation, for French, is about 250. The question is: when we change languages, how many […]

First feasibility test: dictionary morphing

The first test carried out to transform the dictionary (in the extended sense) based on the French-Corsican pair into a dictionary for the Italian-Gallurese pair shows that it is feasible. The result, of acceptable but improvable quality, is obtained in 21 minutes (with 16 GB of RAM and an Intel Core i7-8550U CPU). […]

Translation from Italian to Gallurese

Our new project will be to try to implement the translation from Italian into Gallurese. This is an essential pair for the Gallurese language, and therefore a priority. The major difficulty in doing this is: on the one hand, to (automatically) transform the dictionary (in the extended sense) based on the French-Corsican pair, into […]

Adjective modifiers again

We will consider again a category of words such as ‘very’, when they precede an adjective. Traditionally, this category is termed ‘adverbs’ or ‘adverbs of degree’, but we prefer ‘adjective modifier’, because (i) analytically, they change the meaning of an adjective and (ii) synthetically, an adjective modifier followed by an adjective is still an adjective. […]

On ‘reflexive pronouns’

Pursuing the reflection on grammatical categories, we will now examine “reflexive pronouns”. These are: me, te, se, nous, vous, se (French); mi, ti, si, ci, vi, si (Corsican); myself, yourself, himself/herself/itself, ourselves, yourselves, themselves (English). Let us take an example: je me promène, tu te promènes, il se promène, nous nous promenons, vous vous promenez, ils […]

Grammatical word-disambiguation again

The challenge is especially that of generalizing the grammatical word-disambiguation to several languages. Creating a module of grammatical word-disambiguation for each language appears to be a long and arduous task. This seems to be the main difficulty. But if a module specific to a given language can be generalized to several other languages, this could […]

First steps in the Gallurese language

The translator is taking its first steps in translating from French into the Gallurese language. The first tests show a score of 75-80%, with many errors in grammar, spelling and vocabulary. It will be necessary to reach a score of 90% before the result can be published. The ideal would have been the Italian-Gallurese translation, but […]

Hinting at the Control problem

The question of choosing the best system to solve the problems posed by word disambiguation in the field of translation seems to be linked to the AGI control problem (how to prevent an AGI from ultimately turning out to be harmful to its creators). It seems that when we have the choice between several methods […]

On the implementation of grammatical disambiguation

Grammatical disambiguation – i.e. whether ‘maintenant’ is an adverb (now) or the present participle (maintaining) of the verb ‘maintenir’ – seems to be the crucial issue in choosing between the rule-based model and the statistical model for machine translation. This problem is widespread and seems to concern all languages. For the French language, this problem of […]

The 90% rule

The translation from French to Gallurese is currently under development. An Android application is planned first. It will be called ‘traducidori gaddhuresu’. The French-Gallurese translator is currently undergoing testing. It will only be published if its performance (evaluated by an open test) is above 90%. This is a rule that we […]

A “traducidori gaddhuresu” in preparation

After the Corsican language, the second endangered language for which we would like to develop a translator is the Gallurese language (“traducidori gaddhuresu”). As far as the ‘traducidori gaddhuresu’ is concerned, we are considering an Android application and a Windows version. The priority pair for Gallurese is Italian-Gallurese. However, it will not be possible to […]