Myzilla! Mycroft at the Mozilla All Hands

Originally published at:

For the last year, Mycroft and Mozilla have been building a relationship based on our shared interests. I was invited to join them at their company get-together last week and want to share some information about the event and what we’ve been up to this year.

Mozilla All Hands

Every six months the entire global Mozilla organization gets together in one physical location to share knowledge via presentations and organized sessions, to plan the next six months, and to let collaborators meet face-to-face. As someone who has worked in remote teams for the past 20 years, I can attest to the value of being in the same space to build camaraderie and rapidly share ideas.

As part of these All Hands, Mozilla also invites a number of “volunteers” from outside the company. These are individuals who aren’t paid by Mozilla, but who work closely with the organization. This year, I was included in this group.

The Spring 2018 All Hands was June 11 - 15 in San Francisco. In all, approximately 1200 employees and 70 volunteers attended.

Overlapping Interests

There is quite a bit of common ground between what Mycroft and Mozilla are doing. Mycroft has, obviously, been working on voice interaction technology. Beyond that, the level of access we have to hardware makes Mycroft a great platform for Internet of Things (IoT) interaction and control. We can already combine voice with control of equipment like Hue lights, IoT hubs like Wink, or larger ecosystems like Home Assistant. We can also work with hardware directly, reacting to GPIO pins connected to switches and sensors, and we can communicate over the local network with equipment inside the home.

These capabilities overlap with several Mozilla teams…

Machine Learning / DeepSpeech

Our tightest collaboration thus far is with the DeepSpeech team. Kelly Davis and crew have been implementing the DeepSpeech speech-to-text architecture, and Mycroft has been one of the earliest actual consumers of the technology. Mycroft can currently use DeepSpeech running on Mycroft Home cloud or from your own private instance.

We are also both working on implementations of the Tacotron text-to-speech architecture. Mycroft has a Mimic2 implementation and has created a complete 15-hour dataset we are using for a joint voice benchmark.

DeepSpeech is young (currently version 0.2.0-alpha.6) and still rapidly evolving on both the code and the published models. The current model is noticeably weak in noisy environments and with rapid, conversational speech. The Mycroft community is providing access to the kind of data needed to train a model to handle this. I’m very excited about what we can achieve jointly!

[Photo: With Mozilla’s Machine Learning team (missing Reuben, sorry!)]

Common Voice

Mozilla began Common Voice to gather the kind of language data needed for building technologies like DeepSpeech. While their CC0 (aka "public domain") data licensing model is different from Mycroft's OpenVoice dataset, the collaborative ethos is very similar. We are sharing technical and social lessons about working with a community to achieve more together in data gathering and tagging, across a spectrum of languages.

Through this team, I met with another volunteer, Dewi Jones of Bangor University in Wales. He and I had several discussions about what it will take to build a fully-functional Welsh Mycroft as part of their Welsh Language Technology program. Ffantastig!

[Photo: With Mozilla’s Common Voice team]

Project Things

This new IoT framework aims to unify the physical world with web technologies into a Web of Things. Mozilla integrated Mycroft's Adapt intent parser into the platform several months ago to simplify handling all sorts of natural-language commands. They are also working hard on their Things Gateway, built on the Raspberry Pi, where Mycroft/Picroft would obviously offer some powerful possibilities.
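Adapt itself has its own engine and intent-builder API; the snippet below is not that API but a deliberately simplified, pure-Python sketch of the keyword-driven idea behind it: an intent matches when every one of its required vocabularies contributes at least one word to the utterance. The smart-home vocabulary here is hypothetical.

```python
def match_intent(utterance, intents):
    """Return the name of the first intent whose required
    vocabularies all appear in the utterance, else None.

    `intents` maps an intent name to a list of vocabularies,
    each vocabulary being a set of interchangeable keywords.
    """
    words = set(utterance.lower().split())
    for name, vocabularies in intents.items():
        if all(vocab & words for vocab in vocabularies):
            return name
    return None


# Hypothetical smart-home intents, in the spirit of the
# Web of Things commands mentioned above.
INTENTS = {
    "TurnOnLight": [{"turn", "switch"}, {"on"}, {"light", "lights", "lamp"}],
    "TurnOffLight": [{"turn", "switch"}, {"off"}, {"light", "lights", "lamp"}],
}
```

With these definitions, "turn on the kitchen lights" matches `TurnOnLight` while "please switch off the lamp" matches `TurnOffLight`; real Adapt adds entity extraction, confidence scoring, and registration with a running engine on top of this basic keyword matching.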

[Photo: With Mozilla’s Project Things team]


Scout

Mozilla’s Scout announcement caused a bit of a stir in the tech journalism world! I won’t say much, but really Scout is just an early experiment in how consumers think of voice technologies and how they might interact in collaboration with, and independently of, a screen. Mozilla has spent almost all of its existence developing the technologies for today’s web, and these are inherently visual. We are all learning how consumers think of speech differently than written text.

And More...

Both Mycroft and Mozilla are very focused on enabling and representing the user, not vice versa. This is a particularly tricky thing to do in the data-driven machine learning world. Our efforts in collaborative data gathering and tagging, remote and federated learning, data ownership and licensing are leading the way to a powerful AND ethical future.

I believe trust is a new economic benchmark, and one at which both our organizations excel.

I really enjoyed finally meeting old and new colleagues face to face and look forward to what we can do together in the future!


The Welsh Language project is exciting. It’s so cool to think of voice technology not just as a tool for increasing productivity and interaction, but also as a means for preserving languages and encouraging their use.

I imagine my English granddad will get a kick out of this; after a few bourbons, he’ll school us on the Welsh phrases he picked up while evacuated from London during WWII.


Agree so much! My Dad’s family are from Northumbria, and speak with very broad Geordie accents. My Uncle spoke Pitmatic, he was from Ashington (Aeashingtun), and it really was like a whole other language - different vernacular, different intonation, different prosody, different cultural roots and biases. And it’s these languages that are in danger of dying out.

Closer to home in Australia there are over 700 Indigenous languages, and the need to keep those languages alive is even more pressing because many Indigenous peoples don’t have any written traditions - their entire history is oral; passed from generation to generation verbally.

While there are dangers of voice technology, the ability to reinvigorate and re-energise cultures is just one of the many benefits.

A’m aboot to hoy oot doon teh street; ta-ra for noo :wink: