Introducing Mycroft Translate

Originally published at: Introducing Mycroft Translate - Mycroft

“I just enjoy translating, it's like opening one's mouth and hearing someone else's voice emerge.” ― Dame Iris Murdoch


A few months ago, we outlined our desire to provide stronger language support for Mycroft, noting that localization support is hard. The desire from our Community for better language support, especially for French, German, Dutch and Italian, was echoed in the 2018 Picroft Survey results.

We're now delighted to announce the first release of Mycroft Translate - a platform for enabling the Community to help translate Skills into other languages.

Choosing a platform for translation

Our recent intern, Masters of Computer Science postgraduate student Andrew Wilson, undertook an evaluation of platforms. We were initially attracted to Zanata because of its intuitive user interface and ease of setup, but eventually settled on Pootle. Pootle has a closer alignment with our existing technology stack, being based on Python and Django. We decided to back Pootle with Postgres as the database layer, again reflecting existing deployments. Pootle is highly configurable and extensible. That means we can add support for additional languages in the future, particularly languages that may have only small groups of speakers around the world.

Languages chosen for the initial release

For our initial release of Mycroft Translate, we've chosen a small number of languages that are available for translation, as we are anticipating some 'teething problems' with the new platform, and want to constrain these.

Our decisions here are informed by a number of drivers: existing translations of Skills by the Community; our web analytics, which tell us where in the world our Community is based; and additional factors, such as our desire to cater for more ‘niche’ languages that have limited populations but are vitally important for the communities who speak them, such as Cymraeg (Welsh), íslenska (Icelandic), and ʻŌlelo Hawaiʻi (Hawaiian). At the moment, we’ve opened up two CJK-grouped languages, although we’re not sure exactly how well the platform will handle the non-Latin characters these languages require.

Quality assurance and approval workflow

Within the platform, we've implemented a workflow that ensures each translation is reviewed before being accepted. Over time, we anticipate building out volunteer translation teams for each language, with Community members able to curate provided translations. This both ensures Community ownership of language support within Mycroft and enables strong quality control, as people fluent in a language oversee how it is used with open source voice.

The roles and permissions within the platform are:

Role name: Registered users
Role permissions:
  • Suggest a translation
How ascribed: At login

Role name: Language Team Member
Role permissions:
  • Suggest a translation
  • Review a translation
How ascribed: By a Language Chair

Role name: Language Chair, e.g. Mycroft French Language Chair or Mycroft Portuguese Language Chair
Role permissions:
  • Provide translations without review
  • Review suggested translations
  • Administer all translations for a language
  • Administer all user permissions for a language
How ascribed: By another Language Chair, or by a Mycroft Staff Member
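
As a rough illustration, the roles above can be read as a mapping from role to permission set. The sketch below is only a model of the list above: the names `Permission`, `ROLE_PERMISSIONS` and `can` are hypothetical and are not Pootle's own configuration or API.

```python
from enum import Enum, auto


class Permission(Enum):
    SUGGEST = auto()
    REVIEW = auto()
    TRANSLATE_WITHOUT_REVIEW = auto()
    ADMINISTER_TRANSLATIONS = auto()
    ADMINISTER_USERS = auto()


# Hypothetical mapping that mirrors the roles listed above.
# It is an illustration only, not Pootle's configuration format.
ROLE_PERMISSIONS = {
    "registered_user": {Permission.SUGGEST},
    "language_team_member": {Permission.SUGGEST, Permission.REVIEW},
    "language_chair": {
        Permission.TRANSLATE_WITHOUT_REVIEW,
        Permission.REVIEW,
        Permission.ADMINISTER_TRANSLATIONS,
        Permission.ADMINISTER_USERS,
    },
}


def can(role: str, permission: Permission) -> bool:
    """Return True if the given role holds the given permission."""
    return permission in ROLE_PERMISSIONS.get(role, set())
```

Under this model, `can("registered_user", Permission.REVIEW)` is false, which matches the intent that a new account can only suggest translations.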

How can you participate in Mycroft Translate and help bring YOUR language to Mycroft?

Participating in Mycroft Translate is just one of the ways you can get involved with the Mycroft Community. Providing translations is very straightforward, as outlined below.

Create an account and sign in

Unfortunately we haven't yet integrated the Pootle platform with home.mycroft.ai - something we hope to do in the future. For now, you'll have to create a separate account on https://translate.mycroft.ai.

Mycroft Translate - Create account and sign in

Once you log in, you’ll be able to see the Projects available for translation. All the vocab words for translation from the Mycroft Skills repo on GitHub are in the mycroft-skills project.

Mycroft Translate - Projects available for translation

Choose the language and project you wish to provide translations for

Next, choose the language you'd like to provide translations for from the Language dropdown.

Then, choose the “mycroft-skills” project from the Project dropdown.

Mycroft Translate - Choose a project for translation

Next, select the Skill you wish to provide translations for.

Mycroft Translate - Choose a Skill to provide translations for

Next, choose ‘Continue Translation’.

Mycroft Translate - Continue translations

Then enter your translation, and press the ‘Submit’ button. It’s that simple!

Mycroft Translate - Enter translation

At the moment we’re unable to handle multi-line translations (for instance, where a phrase or sentence is needed to translate a single word from English), but we are following up to see how we can enable this.

What's next on the Languages Roadmap?

Earlier in the year, we published our open and transparent Roadmaps for each of the key streams of the Mycroft ecosystem. For the Languages stream, we anticipate firstly building more automation into the existing platform - essentially being able to import new vocab and dialog from Skills as they become available, and also export translations from Pootle into GitHub via the GitHub API. This will likely take us a few weeks, as we envisage quite a few kinks to work through.

Want more information?

We have a dedicated email address for questions related to Mycroft Translate - you can mail us at translate@mycroft.ai. You can also check out the Languages topics on our Community Forum, or the Languages Channels in Mycroft Chat.

This is GREEEAT :slight_smile: I am Danish and would like to do my part in translating. As soon as Danish is among the chosen languages, I will begin translating.

Just to confirm, do you mean dansk sprog - I just want to be sure?

Yes - Dansk sprog - Danmark - small Scandinavian country. You know: H. C. Andersen, LEGO and, unfortunately, now “owner” of one of the biggest European money laundering scandals, carried out by Danske Bank :frowning:

The one with the Australian Kronprincesse :wink:

Yep :slight_smile: Princess Mary

A couple of things about language before I start contributing:

  • What tone of speech should we use? In Portuguese (and Persian and…) we have several degrees of formality: “you” has several corresponding words, and we use very different words depending on whom we are talking to. What is the default audience for Mycroft’s speech? Coworker, friend, parent, child, old person…
  • Word gender. For example, “thank you” in Portuguese: “obrigado” is male and “obrigada” is female. Depending on Mycroft’s gender, only one of those should be used, and this needs at the very least to be consistent across the text (even if it’s a female voice using male words).

I agree with Jarbas_Ai

I tried some translations, but it’s a headache if you want to do it well.

We don’t know the context. The file name might help on some occasions, but it’s difficult to guess the developer’s intent: why did they choose that word?

My second concern is about Pootle, which is hard to use in some cases. It’s easy when translating on the fly, especially if you only want to put a word in front. It took me some time to find the nuance between activate, trigger and fire, because these words contain ideas about both actions and results. After translating them, the next words were around the term bright, to which I paid the same attention. And then we came back to activate, trigger and fire, but I was unable to retrieve my previous translations, lost my way and gave up.

Perhaps I am wrong and it’s not what you want us to do.

Same “problem” in German for translating “you” - there is the formal “Sie” (used to address persons you don’t know, officials, etc.) and the informal “Du” (when talking to a close person, friends, family, …). As Mycroft will be my personal assistant, I chose the informal translation.

Ran into the same problem. When in doubt, I use the “suggestion” option, flagging the translated phrase for further discussion. Another solution would be looking into the Skill’s source code and trying to understand what the author intended, but that requires knowledge of Python and is time consuming…

Hello, that is “fantastico”. I am doing my small part for the Italian translation. But I have some doubts on how to translate the regular expressions. Let me explain: if I have to translate a plain phrase like “Play this song”, it is very easy: I just write “Suona questa canzone” and it’s done.
Problems come with something like this: ((turn|switch) )?(?P<Action>on|off|toggle) (?P<Entity>.*)

Of course I understand that it relates to switching an entity on and off, but here come the problems: in English you say “Turn on” and “Turn off” and also “Turn this entity on (or off)”.
In Italian we have a single word to say Turn on (Accendi) and Turn off (Spegni), and in general the rules for phrase composition are a bit different. So the problem I am facing is how to reproduce a syntax like the one above, given that the Italian structure is obviously different. Let me share another example: the articles. Italian has not only one article but 8, counting the apostrophes, each one following some sort of rule on when you have to use it.
And here is the question: a bare reproduction of ((turn|switch) )?(?P<Action>on|off|toggle) (?P<Entity>.*) with the words inside translated can be a non-optimal (yet possibly initial) way of operating. How can we compose a new expression? As far as I understand, a correct translation in Italian should result in something like this: (?P<Action>Accendi|Spegni|Commuta) (il|lo|la|l’)(?P<Entity>.*) but how can I be sure of that? Is there someone who is thinking about the “Italian speech engine”?
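
One way to gain some confidence in a translated expression is to try it against sample phrases in Python, the language the Skills are written in. The sketch below is an assumption-laden illustration, not an official Mycroft answer: the Italian pattern makes the article an optional non-capturing group and accepts both apostrophe styles, and the helper `parse` is hypothetical.

```python
import re

# English pattern quoted in the post above.
ENGLISH = re.compile(
    r"((turn|switch) )?(?P<Action>on|off|toggle) (?P<Entity>.*)",
    re.IGNORECASE,
)

# Hypothetical Italian variant: the verb itself carries the action,
# and the article is optional so that it does not end up inside Entity.
ITALIAN = re.compile(
    r"(?P<Action>accendi|spegni|commuta) (?:il |lo |la |l['’])?(?P<Entity>.*)",
    re.IGNORECASE,
)


def parse(pattern, utterance):
    """Return (Action, Entity) if the whole utterance matches, else None."""
    match = pattern.fullmatch(utterance)
    if match is None:
        return None
    return match.group("Action"), match.group("Entity")


print(parse(ENGLISH, "turn on the kitchen light"))
print(parse(ITALIAN, "Accendi la luce"))
print(parse(ITALIAN, "spegni l'interruttore"))
```

Note that the captured Action is now an Italian word rather than "on" or "off". Whether a Skill's handler copes with that is something only the Skill's source code can answer.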

things to think of: speech enabled translation… w00t

@Jarbas_Ai @robb_nl @andlo @Dominik @piretro999

Thank you all so much for your helpful and constructive feedback - very much appreciated. Let me take the comments one by one.

Context

This is by far the most frequent feedback we’ve received about Mycroft Translate - it’s difficult to translate terms without having the context in which the translation occurs. We’re not sure exactly how to tackle this yet - without actually running the Skill, or having a web interface which allows you to hear how the interaction plays, it’s hard to infer context. So this one is “to do” - ideas as always warmly welcomed.

Translating technical terms like (?P<Entity>.*) in phrases

Our online help will address how to handle this as well. In summary, the stuff in brackets ( ) or curly brackets { } should be left as is.
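
One reading of this guidance is that the names inside named groups, such as (?P<Entity>.*), and inside curly brackets must survive translation unchanged, while the ordinary words around them are translated. A minimal sketch of a check along those lines follows; the function names are hypothetical and this is not a feature of Pootle or Mycroft.

```python
import re

# Names inside regex named groups, e.g. (?P<Entity>...)
NAMED_GROUP = re.compile(r"\(\?P<(\w+)>")
# Names inside curly-bracket placeholders, e.g. {temperature}
CURLY_PLACEHOLDER = re.compile(r"\{([A-Za-z_]\w*)\}")


def protected_names(text):
    """Collect named-group and curly-bracket names found in a string."""
    return set(NAMED_GROUP.findall(text)) | set(CURLY_PLACEHOLDER.findall(text))


def keeps_protected_names(source, translation):
    """Return True if the translation uses exactly the names in the source."""
    return protected_names(source) == protected_names(translation)


source = r"((turn|switch) )?(?P<Action>on|off|toggle) (?P<Entity>.*)"
kept = r"(?P<Action>accendi|spegni|commuta) (?P<Entity>.*)"
renamed = r"(?P<Azione>accendi|spegni|commuta) (?P<Entity>.*)"

print(keeps_protected_names(source, kept))
print(keeps_protected_names(source, renamed))
```

The second call reports a problem because the group name Action was translated to Azione, which is the kind of change the guidance asks translators to avoid.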

Guidance on tone, formality and level of politeness

Different languages often have different tones and levels of politeness in different situations. What we’re doing here is writing some inline help for Pootle that provides guidance on the best levels of formality and politeness to use. For now, if you’re unsure of the best translation, please just leave the phrase untranslated.

Additional languages

We’ve had several requests for additional languages, and we’re delighted to see so much additional interest! We first need to work through the process for adding languages in our development environment, but we definitely plan to add more languages.

Hello,

I just registered on Mycroft translate. I hope I will be able to make some contributions soon.

I have several questions first:

  • It seems the roles described here are not enforced. For example, just after registering, I am already able to enter translations without validation (the Language Chair role).

  • Is there a way to validate a suggestion? The Pootle documentation seems to show a green check mark to validate one, but I cannot see anything like this in my interface.

  • If we are all able to provide direct translations without going into “suggestion mode”, what is the correct (polite?) behavior or workflow we should use? Translation or suggestion first? The first makes the work faster; the second is more quality focused. (In my humble opinion, direct translations are a good choice at the beginning, in order to have a fully functional translated project fast. Quality could come later.)

  • What is the process to merge translations into the main (git) project? Is there some automatic tool we should wait for? What is the frequency? Should we make git pull requests with the translation files inside? If so, what is the best way to retrieve data from Pootle and integrate it into the project source code?

  • Related to the last question: could you enable API access to Pootle data (https://pootle.readthedocs.io/en/stable-2.5.1/api/index.html)? This would be useful for us to retrieve data automatically, and maybe to build some tooling to put translation files back into Mycroft.

Many thanks

On the sentence

tell me (the|our) (current )?(?P<Entity>.*) (status|state|value|sensor)

I need to put a word between (current) and (?P<Entity>.*). Would it end up like this?

(actual) de ?(?P<Entity>.*)

@forslund is the regex King :crown:, so he will have a better idea. However, I would combine actual and de:

(actual de) ?(?P<Entity>.*)

Best, K.

Did you put a blank space intentionally after the first question mark? I supposed ?(?P<Entity>.*) was an indivisible block.

Apologies, edited to have the correct syntax.

I’m in no way an expert but I think in the original:

(the|our) (current )?(?P<Entity>.*) (status|state|value|sensor)

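the sequence "current " is optional because of the ? that follows the group (there it means “zero or one”; ? only acts as the non-greedy modifier when it follows another quantifier such as * or +)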

So something like this, maybe:
(german) (optional-german )?word (?P<Entity>.*) (other german)

Not sure if all these German fields are sane, but that’s the gist of it.
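
A quick way to see the optional group at work is to run the English sentence from the question through Python's re module. This is only a sketch: the helper `entity` is hypothetical, and it uses fullmatch as an assumption about how strictly a phrase should match.

```python
import re

# English pattern from the question above; "(current )?" is an optional group.
PATTERN = re.compile(
    r"tell me (the|our) (current )?(?P<Entity>.*) (status|state|value|sensor)"
)


def entity(utterance):
    """Return the captured Entity, or None if the utterance does not match."""
    match = PATTERN.fullmatch(utterance)
    return match.group("Entity") if match else None


print(entity("tell me the current kitchen light status"))
print(entity("tell me the kitchen light status"))
print(entity("tell me the kitchen light"))
```

Both of the first two phrases yield the same Entity, with or without "current ", while the third does not match because the closing keyword is missing. A translated pattern can be tested the same way before it is submitted.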

I didn’t know about optional (it seems ? is always greedy) (my regex skills are very limited, just enough to grep some logs :stuck_out_tongue: ). It seems pretty useful, but it raises a question regarding the translations.

What do we need to do to write a real question mark in Pootle? Perhaps use an escape character, like \?

In a .rx file it would be escaped as you suggest, with \?; the other file types don’t use regexes…
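
The difference between an escaped and an unescaped question mark can be checked in Python. The phrases below are made-up examples, not taken from any Skill.

```python
import re

# Escaped: the pattern ends with a literal question mark.
LITERAL = re.compile(r"what time is it\?")

# Unescaped: the ? makes the preceding "t" optional instead.
UNESCAPED = re.compile(r"what time is it?")

print(bool(LITERAL.fullmatch("what time is it?")))
print(bool(UNESCAPED.fullmatch("what time is it?")))
print(bool(UNESCAPED.fullmatch("what time is i")))

# re.escape produces the escaped form of a single character.
print(re.escape("?"))
```

The unescaped pattern rejects the phrase that actually ends in a question mark and accepts the truncated one, which is why the backslash matters in regex files.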

Well, the Spotify Skill uses regexes in dialog files, just to be confusing (and to have a way of translating non-Adapt regexes).

My regex skills are also very limited. There’s so much you can do with them.