False positives sometimes occur. Some words should remain untranslated, notably proper names. Interestingly, this one arises because the English word ‘transport’ is the same in French: transport (fr) = transport (en) = trasportu (co).
Daily Archives: January 31, 2017
Proper noun elision
Testing #machine translation, now facing a new elision problem:
Riventosa (fr) = A Riventosa (co)
proper noun (fr) = definite article + proper noun (co)
It should read: in u paese di A Riventosa (without elision)
Elision rules are not trivial:
le village d’Arbellara (fr) = the village of Arbellara, should be translated as:
u paese d’Arbellara (with elision)
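The two cases above suggest a rule of thumb: di elides to d’ before a vowel-initial place name, unless the Corsican name carries its own definite article (as in A Riventosa). Here is a minimal Python sketch of that rule. The lexicon ARTICLE_PLACE_NAMES and the function names are hypothetical, and accented initial vowels are not handled.

```python
# Hypothetical lexicon: place names that take a definite article in Corsican.
# Only "A Riventosa" comes from the examples above.
ARTICLE_PLACE_NAMES = {"Riventosa": "A"}

VOWELS = "aeiouAEIOU"


def corsican_place_name(name: str) -> str:
    """Prefix the definite article when the lexicon says the name needs one."""
    article = ARTICLE_PLACE_NAMES.get(name)
    return f"{article} {name}" if article else name


def of_place(name: str) -> str:
    """Render 'of <place>', eliding 'di' only before a bare vowel-initial name."""
    full_name = corsican_place_name(name)
    has_article = name in ARTICLE_PLACE_NAMES
    if not has_article and full_name[0] in VOWELS:
        return f"d’{full_name}"
    return f"di {full_name}"


print("u paese " + of_place("Arbellara"))
print("u paese " + of_place("Riventosa"))
```

The article check comes first on purpose: A Riventosa starts with a vowel, so a purely phonetic rule would wrongly elide it.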
Rule-based translation, adjective agreement. Interesting stuff:
sur les réseaux japonais et américain (fr) = annantu à e rete sgiappunesa è americana (co) = on the Japanese and American networks (en)
The noun (networks) is plural but the adjectives (Japanese and American) are singular.
The excerpt refers to the (single) Japanese network and the (single) American network.
I guess (to be confirmed?) that the following phrase is ambiguous in English:
‘the Japanese and American networks’: is there one Japanese network or several?
Is there one American network or several?
Now handling gender reversal:
– mer (FR, feminine) = sea (EN) = mare (CO, masculine)
– saveur (FR, feminine) = flavor (EN) = sapore (CO, masculine)
– liqueur (FR, feminine) = liquor (EN) = licore (CO, masculine)
‘c’est une bonne liqueur’ (FR) = ‘it is a good liquor’ (EN) = hè un bonu licore (CO) requires gender reversal of: the indefinite article ‘un’ (instead of ‘una’) + the adjective bonu (instead of ‘bona’)
‘la mer est belle’ (FR) = ‘the sea is beautiful’ (EN) = u mare hè bellu (CO) requires gender reversal of: the definite article ‘u’ (instead of ‘a’) + the adjective bellu (instead of ‘bella’)
u mare hè bellu requires uppercase and should be written: U mare hè bellu.
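One way to picture gender reversal: the article and adjective are chosen from the gender of the Corsican noun, never from the French one. The sketch below is only an illustration under that assumption. The tables and the function agree are hypothetical, and they hold nothing beyond the forms quoted above.

```python
# Hypothetical mini-lexicon built from the examples above:
# French noun -> (Corsican noun, Corsican gender).
NOUNS = {
    "mer": ("mare", "m"),
    "saveur": ("sapore", "m"),
    "liqueur": ("licore", "m"),
}

# Corsican forms indexed by gender; only the forms quoted above are listed.
DEFINITE_ARTICLE = {"m": "u", "f": "a"}
INDEFINITE_ARTICLE = {"m": "un", "f": "una"}
ADJECTIVES = {
    "bon": {"m": "bonu", "f": "bona"},
    "beau": {"m": "bellu", "f": "bella"},
}


def agree(french_noun: str, adjective_lemma: str, definite: bool):
    """Return (article, adjective, noun) agreeing with the Corsican gender."""
    noun, gender = NOUNS[french_noun]
    articles = DEFINITE_ARTICLE if definite else INDEFINITE_ARTICLE
    return articles[gender], ADJECTIVES[adjective_lemma][gender], noun


# 'la mer est belle': article and adjective follow the masculine 'mare'.
article, adjective, noun = agree("mer", "beau", definite=True)
sentence = f"{article} {noun} hè {adjective}"
# Sentence-initial uppercase and final period, as required above.
sentence = sentence[0].upper() + sentence[1:] + "."
print(sentence)
```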
Introducing new feature for #MachineTranslation:
some verbal locutions:
prendre d’assaut = assaltà
mettre à sac = sacchighjà
prendre au collet = incappià
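Such locutions can be handled as a small phrase table applied before word-by-word translation, so that ‘prendre’ is never translated on its own inside a locution. A minimal sketch follows. The function replace_locutions is a hypothetical name, and it only covers the infinitive forms listed above (conjugated forms would need lemmatization first).

```python
# Phrase table taken from the three examples above (infinitive forms only).
LOCUTIONS = {
    "prendre d’assaut": "assaltà",
    "mettre à sac": "sacchighjà",
    "prendre au collet": "incappià",
}


def replace_locutions(text: str) -> str:
    """Replace each known locution, longest first, before word-level translation."""
    for french in sorted(LOCUTIONS, key=len, reverse=True):
        text = text.replace(french, LOCUTIONS[french])
    return text


print(replace_locutions("prendre d’assaut"))
```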
Now considering the issue of semantic disambiguation.
Some instances for French to Corsican are:
– ‘défense’ = sanna/difesa = tusk/defense
– ‘vol’ = bulu/furtu = flight/theft
– ‘comprend’ = capisce/cumprende = understands/comprises
– ‘palais’ = palatu/palazzu = palate/palace
– ‘expérience’ = sperienza/sperimentu = experience/experiment
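A simple way to approach this kind of disambiguation is to look for cue words in the same sentence. The sketch below is an illustration, not the method actually used here. The sense pairs come from the list above, but the cue words, the SENSES table and the function disambiguate are assumptions made for the example.

```python
import re

# Sense pairs come from the list above; the cue words are illustrative guesses.
SENSES = {
    "défense": {"sanna": {"éléphant", "ivoire"}, "difesa": {"armée", "avocat"}},
    "vol": {"bulu": {"avion", "oiseau"}, "furtu": {"police", "bijoux"}},
    "palais": {"palatu": {"bouche", "langue"}, "palazzu": {"roi", "calife"}},
}


def disambiguate(word: str, sentence: str) -> str:
    """Pick the sense whose cue words overlap most with the sentence."""
    tokens = set(re.findall(r"\w+", sentence.lower()))
    senses = SENSES[word]
    # On a tie (including no cue at all), the first listed sense wins.
    return max(senses, key=lambda sense: len(senses[sense] & tokens))


print(disambiguate("défense", "une défense d’éléphant"))
print(disambiguate("vol", "le vol de bijoux"))
```

With no cue in the sentence the function silently falls back to the first sense, which is exactly where gross errors would come from, so a real system would need a better default.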
Threefold ambiguity: French ‘nouvelle’
Let us mention the issue of threefold ambiguity: French ‘nouvelle’ can translate into:
‘nutizia’ (‘piece of news’)
or ‘nuvella’ (‘short story’)
or ‘nova’ (‘new’)
The disambiguation between ‘nutizia’ (‘piece of news’)
and ‘nuvella’ (‘short story’) is semantic (hard),
while the disambiguation between ‘nutizia’/‘nuvella’ (noun)
and ‘nova’ (adjective) is grammatical (medium difficulty).
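The two levels can be sketched as a two-stage decision: a grammatical test first (adjective or noun?), then a semantic test only if ‘nouvelle’ is a noun. This is a rough illustration. The word lists FRENCH_NOUNS and STORY_CUES, the function name and the default to ‘nutizia’ are all assumptions, and a postposed adjective (‘une voiture nouvelle’) is not handled.

```python
import re

# Hypothetical word lists, for illustration only.
FRENCH_NOUNS = {"voiture", "maison", "route"}
STORY_CUES = {"auteur", "recueil", "écrit", "roman"}


def translate_nouvelle(sentence: str) -> str:
    """Translate French 'nouvelle' as 'nova', 'nuvella' or 'nutizia'."""
    tokens = re.findall(r"\w+", sentence.lower())
    i = tokens.index("nouvelle")
    # Grammatical stage: 'nouvelle' directly before a noun is an adjective.
    if i + 1 < len(tokens) and tokens[i + 1] in FRENCH_NOUNS:
        return "nova"
    # Semantic stage: literary cue words suggest the short-story sense.
    if STORY_CUES & set(tokens):
        return "nuvella"
    # Assumed default when no cue is found.
    return "nutizia"


print(translate_nouvelle("une nouvelle voiture"))
print(translate_nouvelle("l’auteur a écrit une nouvelle"))
```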
Further reflections on the definition of ‘above human level’ translation
Some further reflections on the definition of ‘above human level’ translation:
– the answer may not be based solely on the quantitative side, being of the type: ‘above 96%’, ‘above 97%’, ‘above 98%’, etc.
– it seems the answer should also incorporate insights from the qualitative side, i.e. not containing gross translation errors.
Semantic disambiguation errors would most often be termed ‘gross errors’: for example, translating ‘défense d’éléphant’ into ‘elephant’s defense’ instead of ‘elephant’s tusk’
– to fix ideas, it could be proposed: ‘above human level’ =
above 98% AND without gross errors
Defining an instance of Feigenbaum test
Defining an instance of the Feigenbaum test (from Wikipedia: generally defined as a variant of the Turing test where a computer program attempts to imitate a human expert in a given field): translating French into Corsican.
We expect 98% accuracy and lack of gross errors in order to pass this Feigenbaum test.
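The pass criterion combines a quantitative threshold with a qualitative condition, which can be written down in a few lines. This is a sketch under assumptions. The function names are hypothetical, the scoring unit (words, presumably) is not specified here, and ‘above 98%’ is read as strictly greater than 98.

```python
def accuracy(total_units: int, errors: int) -> float:
    """Share of correctly translated units, as a percentage."""
    return 100.0 * (1 - errors / total_units)


def passes_feigenbaum(total_units: int, errors: int, gross_errors: int,
                      threshold: float = 98.0) -> bool:
    """Pass only when accuracy exceeds the threshold and no error is gross."""
    return accuracy(total_units, errors) > threshold and gross_errors == 0


# Illustrative counts: 2 errors out of 200 units, none of them gross.
print(passes_feigenbaum(total_units=200, errors=2, gross_errors=0))
# Same accuracy, but one gross error is enough to fail.
print(passes_feigenbaum(total_units=200, errors=2, gross_errors=1))
```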
Scoring: 1 - 10/159 = 93.71%. Partitive article successfully handled:
‘participe à de nombreuses batailles’ = participeghja à numerose battaglie
‘fournit des renseignements’ = furnisce i rinsignamenti
French ‘vis’ is multi-ambiguous
In the style of ‘I saw wood with a saw’, from French to Corsican:
French ‘vis’ is multi-ambiguous:
– ‘vis’ (noun singular) = vita = screw
– ‘vis’ (noun plural) = vite = screws
– ‘vis’ (verb, present, 1st person) = campu = I stay, I live
– ‘vis’ (verb, simple past, 1st person) = visse = I saw
‘Je vis à Londres’ should translate: ‘Campu in Londra‘.
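Part of this ambiguity is grammatical and can be resolved from the word just before ‘vis’. The sketch below shows the idea under simplifying assumptions. The determiner lists and the function translate_vis are hypothetical, and since the previous word cannot separate present from simple past, the present (‘campu’) is assumed after ‘je’.

```python
import re

# Hypothetical, deliberately short determiner lists.
SINGULAR_DETERMINERS = {"la", "une", "cette"}
PLURAL_DETERMINERS = {"les", "des", "ces"}


def translate_vis(sentence: str):
    """Choose a Corsican form for French 'vis' from the preceding word."""
    tokens = re.findall(r"\w+", sentence.lower())
    i = tokens.index("vis")
    previous = tokens[i - 1] if i > 0 else ""
    if previous == "je":
        # Present vs simple past is not decidable here; present is assumed.
        return "campu"
    if previous in PLURAL_DETERMINERS:
        return "vite"
    if previous in SINGULAR_DETERMINERS:
        return "vita"
    return None  # no rule applies


print(translate_vis("Je vis à Londres"))
```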
Semantic disambiguation of ‘palais’
Testing the #semanticdisambiguation of ‘palais’
French ‘palais’ has fourfold ambiguity:
– palazzu (EN palace): noun singular
– palatu (EN palate): noun singular
– palazzi (EN palaces): noun plural
– palati (EN palates): noun plural
Le palais du calife est en feu.
The palace of the caliph is on fire.
L’incendie se déchaîne.
The fire is unleashed.
Il a avalé un piment entier.
He swallowed a whole pepper.