
There are sometimes false positives. Some words should remain untranslated, notably proper names. Interestingly, it is due to the fact that the English word ‘transport’ is the same in French: transport (fr) = transport (en) = trasportu (co).
Testing #machine translation, now facing a new elision problem:
Riventosa (fr) = A Riventosa (co)
proper noun (fr) = definite article + proper noun (co)
It should read: in u paese di A Riventosa (without elision)
Elision rules are not trivial:
le village d’Arbellara (fr) = the village of Arbellara, should be translated as:
u paese d’Arbellara (with elision)
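The two cases above can be sketched as a small rule, under stated assumptions: the lookup table `PLACES_WITH_ARTICLE` and the function `village_of` are hypothetical names introduced for illustration, and the vowel test is a simplification rather than a full account of Corsican elision. Names missing from the table are assumed to be the same in French and Corsican.

```python
# Hypothetical lookup: place names whose Corsican form carries an article.
# Only 'Riventosa' comes from the text; a real table would be much larger.
PLACES_WITH_ARTICLE = {"Riventosa": "A Riventosa"}

VOWELS = set("aeiouàèìòù")


def village_of(place: str) -> str:
    """Build 'u paese di/d’ <place>', with or without elision."""
    with_article = PLACES_WITH_ARTICLE.get(place)
    if with_article is not None:
        # The article belongs to the name, so 'di' is kept in full.
        return f"u paese di {with_article}"
    if place[0].lower() in VOWELS:
        # Bare name starting with a vowel: 'di' elides to 'd’'.
        return f"u paese d’{place}"
    return f"u paese di {place}"


print(village_of("Riventosa"))
print(village_of("Arbellara"))
```

Checking the article table before the vowel test is what keeps ‘A Riventosa’ from being elided even though it starts with a vowel.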
Rule-based translation: adjective agreement, interesting stuff:
sur les réseaux japonais et américain (fr) = annantu à e rete sgiappunesa è americana (co) = on the Japanese and American networks (en)
The noun (networks) is plural but the adjectives (Japanese and American) are singular.
The excerpt refers to the (single) Japanese network and the (single) American network.
I guess (to be confirmed?) that the following phrase is ambiguous in English:
‘the Japanese and American networks’: is there one Japanese network or several?
Is there one American network or several?
Now handling gender reversal:
– mer (FR, feminine) = sea (EN) = mare (CO, masculine)
– saveur (FR, feminine) = flavor (EN) = sapore (CO, masculine)
– liqueur (FR, feminine) = liquor (EN) = licore (CO, masculine)
‘c’est une bonne liqueur’ (FR) = ‘it is a good liquor’ (EN) = hè un bonu licore (CO) requires gender reversal of the indefinite article ‘un’ (instead of ‘una’) and of the adjective bonu (instead of ‘bona’).
‘la mer est belle’ (FR) = ‘the sea is beautiful’ (EN) = u mare hè bellu (CO) requires gender reversal of the definite article ‘u’ (instead of ‘a’) and of the adjective bellu (instead of ‘bella’).
u mare hè bellu also requires an initial capital and should be written: U mare hè bellu.
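As a minimal sketch of gender reversal, the article and the adjective can be chosen from the gender of the Corsican noun rather than the French one. The dictionaries and function names below are hypothetical; the word forms are the ones quoted above, and article elision before a vowel is not handled.

```python
# Corsican noun and its gender, keyed by the French noun (examples from the text).
NOUNS = {
    "mer": ("mare", "m"),
    "saveur": ("sapore", "m"),
    "liqueur": ("licore", "m"),
}
DEFINITE = {"m": "u", "f": "a"}
INDEFINITE = {"m": "un", "f": "una"}
# Corsican adjective forms, keyed by the French lemma.
ADJECTIVES = {
    "bon": {"m": "bonu", "f": "bona"},
    "beau": {"m": "bellu", "f": "bella"},
}


def the_noun_is(fr_noun: str, fr_adj: str) -> str:
    """'la mer est belle' pattern, agreeing with the Corsican gender."""
    noun, gender = NOUNS[fr_noun]
    sentence = f"{DEFINITE[gender]} {noun} hè {ADJECTIVES[fr_adj][gender]}"
    # Sentence-initial capital.
    return sentence[0].upper() + sentence[1:]


def it_is_a(fr_adj: str, fr_noun: str) -> str:
    """'c’est une bonne liqueur' pattern, agreeing with the Corsican gender."""
    noun, gender = NOUNS[fr_noun]
    return f"hè {INDEFINITE[gender]} {ADJECTIVES[fr_adj][gender]} {noun}"


print(the_noun_is("mer", "beau"))
print(it_is_a("bon", "liqueur"))
```

The point of the design is that the French gender is never consulted: agreement is computed entirely on the target side.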
Introducing new feature for #MachineTranslation:
some verbal locutions:
prendre d’assaut = assaltà
mettre à sac = sacchighjà
prendre au collet = incappià
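One possible way to handle such locutions is a phrase table matched before word-by-word translation, longest match first. This is only a sketch: `replace_locutions` is a hypothetical helper, it works on whitespace-separated tokens, and it only matches the infinitive forms listed above (conjugated forms would need lemmatization first).

```python
# Multi-word French locutions mapped to a single Corsican verb (from the text).
LOCUTIONS = {
    ("prendre", "d’assaut"): "assaltà",
    ("mettre", "à", "sac"): "sacchighjà",
    ("prendre", "au", "collet"): "incappià",
}
MAX_LEN = max(len(key) for key in LOCUTIONS)


def replace_locutions(tokens: list[str]) -> list[str]:
    """Replace known locutions, trying the longest candidate first."""
    out = []
    i = 0
    while i < len(tokens):
        for n in range(MAX_LEN, 1, -1):
            key = tuple(tokens[i:i + n])
            if key in LOCUTIONS:
                out.append(LOCUTIONS[key])
                i += len(key)
                break
        else:
            # No locution starts here: keep the token for later stages.
            out.append(tokens[i])
            i += 1
    return out


print(replace_locutions("mettre à sac la ville".split()))
```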
Now considering the issue of semantic disambiguation.
Some instances for French to Corsican are:
– ‘défense’ = sanna/difesa = tusk/defense
– ‘vol’ = bulu/furtu = flight/theft
– ‘comprend’ = capisce/cumprende = understands/comprises
– ‘palais’ = palatu/palazzu = palate/palace
– ‘expérience’ = sperienza/sperimentu = experience/experiment
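A very simple way to approach pairs like these is to score each candidate by the cue words found in the same sentence. The sketch below assumes this cue-word approach, which is not necessarily how the translator described here works; the cue lists are invented for illustration and `disambiguate` is a hypothetical name.

```python
import re

# Candidate Corsican translations with hypothetical French cue words.
SENSES = {
    "défense": {"sanna": {"éléphant", "ivoire"}, "difesa": {"armée", "avocat"}},
    "vol": {"bulu": {"avion", "oiseau"}, "furtu": {"police", "voleur"}},
    "palais": {"palatu": {"bouche", "langue"}, "palazzu": {"calife", "roi"}},
}


def disambiguate(word: str, sentence: str):
    """Return the candidate with the most cue words in the sentence, or None."""
    # \w+ splits on apostrophes, so "d’éléphant" yields "d" and "éléphant".
    context = set(re.findall(r"\w+", sentence.lower()))
    scores = {
        candidate: len(cues & context)
        for candidate, cues in SENSES[word].items()
    }
    best = max(scores, key=scores.get)
    # With no cue at all, report that the context does not decide.
    return best if scores[best] > 0 else None


print(disambiguate("défense", "une défense d’éléphant"))
print(disambiguate("palais", "Le palais du calife est en feu."))
```

Returning `None` when no cue is found leaves room for a fallback, such as the most frequent sense.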
Let us mention the issue of threefold ambiguity: French ‘nouvelle’ can translate into:
‘nutizia’ (‘piece of news’)
or ‘nuvella’ (‘short story’)
or ‘nova’ (‘new’)
The disambiguation between ‘nutizia’ (‘piece of news’) and ‘nuvella’ (‘short story’) is semantic (hard),
while the disambiguation between ‘nutizia’/‘nuvella’ (noun) and ‘nova’ (adjective) is grammatical (medium difficulty).
Some further reflections on the definition of ‘above human level’ translation:
– the answer may not be based solely on the quantitative side, being of the type: ‘above 96%’, ‘above 97%’, ‘above 98%’, etc.
– it seems the answer should also incorporate insights from the qualitative side, i.e. not containing gross translation errors.
Semantic disambiguation errors would most often be termed ‘gross errors’: for example, translating ‘défense d’éléphant’ into ‘elephant’s defense’ instead of ‘elephant’s tusk’
– to fix ideas, it could be proposed: ‘above human level’ =
above 98% AND without gross errors
Defining an instance of the Feigenbaum test (from Wikipedia: generally defined as a variant of the Turing test where a computer program attempts to imitate a human expert in a given field): translating French into Corsican.
We expect 98% accuracy and lack of gross errors in order to pass this Feigenbaum test.
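The proposed criterion combines a quantitative threshold with a qualitative condition, which can be written down directly. In this sketch the function names are hypothetical, accuracy is taken as 1 minus the error rate, and ‘above 98%’ is read as strictly greater than 0.98.

```python
def accuracy(errors: int, total: int) -> float:
    """Share of correctly translated items: 1 - errors / total."""
    return 1 - errors / total


def passes_feigenbaum(errors: int, total: int, gross_errors: int,
                      threshold: float = 0.98) -> bool:
    """Above the accuracy threshold AND without any gross error."""
    return accuracy(errors, total) > threshold and gross_errors == 0


# Illustrative figures only: 2 errors out of 200 items, none of them gross.
print(passes_feigenbaum(errors=2, total=200, gross_errors=0))
# Same accuracy, but one gross error is enough to fail.
print(passes_feigenbaum(errors=2, total=200, gross_errors=1))
```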
Scoring 1 - 10/159 = 93.71%. Partitive article successfully handled:
‘participe à de nombreuses batailles’ = participeghja à numerose battaglie
‘fournit des renseignements’ = furnisce i rinsignamenti
In the style of ‘I saw wood with a saw’, from French to Corsican:
French ‘vis’ is multi-ambiguous:
– ‘vis’ (noun singular) = vita = screw
– ‘vis’ (noun plural) = vite = screws
– ‘vis’ (verb, present, 1st person) = campu = I stay, I live
– ‘vis’ (verb, simple past, 1st person) = visse = I saw
‘Je vis à Londres’ should translate: ‘Campu in Londra‘.
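Much of this ambiguity is grammatical, so the neighbouring words already help. The sketch below is a rough heuristic under loud assumptions: the word lists are hypothetical, a determiner before ‘vis’ is taken to signal the noun, and after ‘je’ a following preposition is taken to signal ‘live’ rather than ‘saw’. A sentence such as ‘je la vis’ would defeat it.

```python
# Hypothetical word lists used as grammatical cues.
SINGULAR_DETERMINERS = {"la", "une", "cette"}
PLURAL_DETERMINERS = {"les", "des", "ces"}
PREPOSITIONS = {"à", "en", "dans", "avec"}


def translate_vis(tokens: list[str], i: int):
    """Pick a Corsican form for tokens[i] == 'vis' from its neighbours."""
    prev = tokens[i - 1].lower() if i > 0 else ""
    nxt = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
    if prev in SINGULAR_DETERMINERS:
        return "vita"   # noun, singular: screw
    if prev in PLURAL_DETERMINERS:
        return "vite"   # noun, plural: screws
    if prev == "je":
        # Heuristic: 'je vis à/en/dans ...' is read as 'I live'.
        return "campu" if nxt in PREPOSITIONS else "visse"
    return None         # undecided: leave to another stage


print(translate_vis("Je vis à Londres".split(), 1))
```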
Testing the #semanticdisambiguation of ‘palais’
(EN palace/palate)
French ‘palais’ has fourfold ambiguity:
– palazzu (EN palace): noun singular
– palatu (EN palate): noun singular
– palazzi (EN palaces): noun plural
– palati (EN palates): noun plural
Le palais du calife est en feu.
The palace of the caliph is on fire.
L’incendie se déchaîne.
The fire is unleashed.
Il a avalé un piment entier.
He swallowed a whole pepper.