Our project background:
We are working on an exciting education project that aims to create an interactive way of learning: an inspirational edutainment package that introduces students to the technologies that will dominate their future, including immersive technology and gamification. The front end integrates with a back end of conversational AI platforms, with curriculum models, general-information bots (like a wiki bot), and some entertainment "desserts".
Coming from a struggling upbringing, it is important for us to make sure such a service is available to all. So, the plan is to have a cloud/hybrid version that spares schools the complexities of system setup where internet connection is not an issue, but also a fully offline version for remote locations where internet is a luxury or even non-existent. Our front end is developed accordingly, and we want our back end to work the same way.
Diving into Mycroft:
We went far with the development of our platform, but then we came across Mycroft! Our first impression is that it may take our project to a different level. It is obviously a mature solution: well structured, developed by a versatile team of experts, and with a huge community behind it. (We, on the other hand, are two people, only one of whom can code, working part-time, and we are more experienced in gamification and immersive technologies than in what the back end requires… there are levels to this.)
But let's start with the major questions so we know if Mycroft is compatible with the needs of an education platform:
- Will we be able to host the whole platform on our cloud with no strings attached to any third party?
That's important for many reasons, for example:
a) Heavy customizations: We like the structure of Mycroft. Out of the box, it is perfect for what it is meant to be: a personal assistant. For education, as I'll present below, we will have to subject it to heavy customizations, from skills to core to permissions and flow. For example, a user can install a new skill via a simple voice command. Great for a personal assistant! But if we open that up to students at schools without any control, imagine the circus (and the issues of installing skills unsuitable for certain ages); see the configuration sketch after this list. Over time, our back end will keep the Mycroft DNA but will gradually shape up to become a different system.
b) The Edmodo crisis: At some point, we will work with major partners like ministries of education to develop curriculum models, create special pairing policies (for example, a system only accessible to verified schools/students, or maybe no pairing at all), etc. After the Edmodo education platform shut down earlier this year, leaving 90 million users in the air, major education players have a big concern about investing the time and effort to adopt a platform whose destiny they do not fully control. Many education sectors are yet to recover from the huge loss of the retired Edmodo.
c) Training modules will be a more exclusive matter, supervised by people with special permissions. We will even have multiple clones of the system for different scenarios. For example, a simple question like "define chemical reaction" should produce different responses depending on the academic level/field of the user/student asking the question; the answer should perfectly match that student's curriculum (see the skill sketch below).
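On the skill-installation point in (a): from what we have read, mycroft.conf supports a blacklisted_skills list, so a blunt first step might be to blacklist the installer skill itself. A minimal sketch of what we imagine the override would look like (the skill folder name is our guess and may differ by install; we understand overrides go in /etc/mycroft/mycroft.conf or the user-level config):

```json
{
  "skills": {
    "blacklisted_skills": ["mycroft-installer.mycroftai"]
  }
}
```

On the per-curriculum point in (c), here is a rough sketch of how we picture a curriculum-aware skill: the answer is keyed off an academic level carried in the message context. The DEFINITIONS table and the academic_level context key are our own inventions for illustration, not Mycroft conventions:

```python
from mycroft import MycroftSkill, intent_file_handler

# Toy lookup table; in practice this would be a per-curriculum model or service.
DEFINITIONS = {
    "chemical reaction": {
        "primary": "A chemical reaction is when substances change into new substances.",
        "secondary": "A chemical reaction rearranges atoms by breaking and forming bonds.",
    },
}

class CurriculumDefineSkill(MycroftSkill):
    @intent_file_handler("define.intent")  # e.g. an intent file containing "define {term}"
    def handle_define(self, message):
        term = message.data.get("term", "").lower()
        # 'academic_level' is a context key our front end would inject.
        level = message.context.get("academic_level", "primary")
        answer = DEFINITIONS.get(term, {}).get(level)
        self.speak(answer or "I do not have a definition for that yet.")

def create_skill():
    return CurriculumDefineSkill()
```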
- Once we reach a point where we need the hardware device, is it easy to configure your devices (like the Mark II) to work with our cloud instances?
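For context on this question: we noticed that Mycroft devices read their backend endpoint from the "server" section of mycroft.conf, and we understand the Selene backend itself is open source, so we imagine re-pointing a device would look something like the snippet below. The URL is a placeholder, and we may be missing pairing details:

```json
{
  "server": {
    "url": "https://backend.our-cloud.example",
    "version": "v1",
    "update": true,
    "metrics": false
  }
}
```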
- Can a standard VPS run the Mycroft AI platform, or does it require special cloud solutions?
- Integration and communication:
a) Assuming we use a local STT (Vosk, for example) and a local TTS (Mimic 3, for example) packaged within our desktop front-end application, is there some sort of API, or equivalent, that makes it possible for our application to send a text utterance to our cloud Mycroft server (bypassing the cloud Mycroft STT) and get the response back from the cloud as text (bypassing the Mimic 3 TTS that exists on the cloud instance)? Can we also send the targeted skill with the utterance to make sure we get the response from that exact skill/model? For example, a "what is a chemical bond?" utterance would give one answer using the wiki skill, and another using a trained conversational AI model that we prepared for a certain academic level. If so, it would be great to get directions to the Mycroft API that handles that (see the first sketch below).
b) If we create a web application (we can most likely upload it into the same server directory as Mycroft if that makes things easier), will we be able to integrate that web app with the Mycroft system so that once a user opens the front-end web application, Mycroft starts listening, converts speech to text, deals with the utterance, and sends back a WAV file response (instead of speaking it) for the front end to play (see the second sketch below)?
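For (a), the closest thing we have found is the messagebus: injecting a recognizer_loop:utterance message appears to skip STT entirely, and the skill's answer comes back as a "speak" message before any TTS happens. A minimal sketch using the official mycroft-bus-client package (the host name is a placeholder; we are not sure targeting a specific skill is supported this way, so a custom message type handled by our own skill may be the route for that):

```python
from mycroft_bus_client import MessageBusClient, Message

# Connect to the messagebus of the cloud Mycroft instance (default port 8181).
bus = MessageBusClient(host="our-cloud-host.example", port=8181)
bus.run_in_thread()

# Inject a text utterance directly, bypassing STT, and wait for the skill's
# textual answer (the "speak" message), so no TTS is needed on our side.
reply = bus.wait_for_response(
    Message(
        "recognizer_loop:utterance",
        data={"utterances": ["what is a chemical bond"]},
        context={"academic_level": "secondary"},  # our own custom context key
    ),
    reply_type="speak",
    timeout=10,
)
if reply is not None:
    print(reply.data["utterance"])  # plain-text answer from whichever skill matched
```

One thing that worries us here: the messagebus is unauthenticated by default, so exposing it from a cloud host would presumably need a gateway in front of it.

And for (b), here is how we currently picture the web-app flow, as a small relay on the same server: the browser sends text (from our local STT), the relay forwards it to the messagebus, and the "speak" reply is rendered to a WAV file via Mimic 3's HTTP server (its /api/tts endpoint, default port 59125) and returned for the front end to play. Hosts and endpoints are placeholders, and we assume mimic3-server runs alongside Mycroft:

```python
from io import BytesIO

import requests
from flask import Flask, request, send_file
from mycroft_bus_client import MessageBusClient, Message

app = Flask(__name__)
bus = MessageBusClient(host="localhost", port=8181)  # same box as mycroft-core
bus.run_in_thread()

@app.post("/ask")
def ask():
    text = request.json["utterance"]
    # Forward the utterance to Mycroft and wait for its textual answer.
    reply = bus.wait_for_response(
        Message("recognizer_loop:utterance", data={"utterances": [text]}),
        reply_type="speak",
        timeout=10,
    )
    if reply is None:
        return {"error": "no response from Mycroft"}, 504
    # Render the answer to audio with Mimic 3's HTTP API and return the WAV.
    wav = requests.get(
        "http://localhost:59125/api/tts",
        params={"text": reply.data["utterance"]},
        timeout=30,
    )
    return send_file(BytesIO(wav.content), mimetype="audio/wav")
```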
This is shamefully long already, so I will stop here. We are just careful not to move away from what we have done already to something that looks far more amazing in Mycroft, only to find ourselves in a rabbit hole before discovering that this amazing system is hard to tailor to our requirements. Answers to the questions above will help us a lot in deciding.