Hey guys, we released some of the secret sauce today - the Adapt Intent Parser!
You can find it here: https://adapt.mycroft.ai
Here is the accompanying video:
Great, will peruse it tonight. Had a quick look for now though - just wondering, is there one single *intent_parser.py file, or does it work with a bunch of them? Where will they live?
I assume we can add our own, as that seems to be a lot of what this project is all about, but will there be a core of immutable ones or is everything open for editing? Will you be providing a platform to showcase and share useful "intents", or is that what GitHub is for?
Good work, by the way
Had a quick look into the examples provided with Adapt. Great work!!!
I have similar questions to what @Autonomouse already asked, plus a few more.
Will writing an intent parser be a manual process, or can it be automated using data from a database/Excel sheet?
Who will decide which data is "required" and which is "optional" for an intent parser?
Will the Mycroft Intent Handler be designed first and then the Intent Parser?
Will a second intent parser be used in the Mycroft Intent Handler if the first intent parser provides too little data?
For example: "Play Song" (Artist is optional here)
Any plan to handle ambiguous results? (Otherwise the user needs to keep track of the required keywords used by intent parsers.)
See the example below: I want to listen to a song, but instead I am getting Tokyo weather.
"play tokyo weather masturi song"
{
    "intent_type": "WeatherIntent",
    "WeatherKeyword": "weather",
    "Location": "tokyo",
    "confidence": 0.3440860215053763,
    "target": null
}
How will the same intent parser handle phrases like "Set First Floor Office Temperature to 20",
"Switch On Ground Floor Kitchen Table Light", and "Switch Off Second Floor Bathroom Heater"?
So many required keywords !!
[Set/On/Off, Ground/First/Second Floor, Kitchen/Office/Bathroom, Heater/Cooler/Ceiling Light/Table Light]
Above all, the best results from the intent parser depend on the WER of the STT engine.
Great work btw, I am now going to start using it !!!
Hey Guys, thanks for the excitement!
To answer, I think, both of your questions: Mycroft proper has a concept of skills, or "things I can do", and the invocation of those skills is an intent. The intent parser will be defined by the skill developer. There will be some skills/intents that are part of Mycroft proper, and there will be some platform-level intents (telling Mycroft to go to sleep, for example). The rest of the intents will be defined by developers, and Mycroft will make a best effort to determine the appropriate intent for a given query and route it back to the skill that registered the intent parser.
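For a concrete picture, here is a minimal sketch of what a skill developer might register with Adapt, loosely adapted from the samples in the repo (the WeatherKeyword/Location names are only illustrative):

from adapt.intent import IntentBuilder
from adapt.engine import IntentDeterminationEngine

engine = IntentDeterminationEngine()

# vocabulary this skill knows about
engine.register_entity("weather", "WeatherKeyword")
engine.register_entity("tokyo", "Location")

# the intent parser the skill registers; matching queries get routed back to it
weather_intent = IntentBuilder("WeatherIntent")\
    .require("WeatherKeyword")\
    .optionally("Location")\
    .build()
engine.register_intent_parser(weather_intent)

for intent in engine.determine_intent("what is the weather like in tokyo"):
    if intent.get("confidence") > 0:
        print(intent)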
As for public-domain or out-of-the-box intents, I've been thinking about that a bit today. Right now, the samples are all we've given to the world, and we expect people to create their own and populate them with data as they see fit. I imagine some datasets will be dynamically created/maintained at runtime (think the list of a user's Pandora stations), and others will be static (like a list of US cities, or the top 100 business names). Some of these I can imagine Mycroft maintaining for the public, though your mileage will vary based on the skill/application you are developing.
If there's interest, I think we could start corralling a set of public intents/vocabulary as part of adapt-intent-* projects, release/manage them independently, and allow end developers to have an experience like the following:
pip install adapt-parser adapt-intents-music
adapt-vocab-downloader music en-US
and then start developing against a populated intent parser.
Thoughts on this?
To answer your question about the multiple intents specifically:
I think you're using the multi_intent_parser example here, but I'm not sure, and I'm having trouble following your question.
The examples are very contrived. In reality, your engine will be populated with a music catalog as well if you're hoping for higher accuracy, which will help Adapt disambiguate between the weather intent and the music intent. I'm not familiar with the song you're referencing, and I would guess that you have not registered it as vocabulary in your engine. If this is a contrived song name, I'd ask you to be more fair, but frankly song titles are extremely ambiguous.
Since Adapt is a known-entity parser (and not a statistical parser), without any indication that "tokyo weather masturi" is related to music, the weather intent would be tied with the music intent, and while Adapt is deterministic, the winning result will be rather arbitrary. Adapt does also support returning multiple parse results, at which point your application could bubble up a response of "Which of these things did you mean?"
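A rough sketch of that, assuming an engine populated as in the samples and the num_results argument to determine_intent:

# `engine` is an IntentDeterminationEngine populated as in the earlier sketch
candidates = [i for i in engine.determine_intent("play tokyo weather masturi song",
                                                 num_results=5)
              if i.get("confidence", 0) > 0]

if len(candidates) > 1:
    # application-level disambiguation: bubble the choice up to the user
    print("Which of these things did you mean?")
    for c in candidates:
        print(" -", c["intent_type"], c["confidence"])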
Sorry for any confusion, but the examples were intended to be instructional, not drop-in parsers for their domains.
Take this example: a Home Automation Intent Parser.
The automation phrases used would be like the ones above.
Automation Intent Parser:
action_keyword = ["Set", "On", "Off"] --> Required
floor_keyword = ["Ground", "First", "Second"] --> Required
room_keyword = ["Kitchen", "Office", "Bathroom"] --> Required
device_keyword = ["Heater", "Cooler", "Ceiling", "Table"] --> Required
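To make the question concrete, here is roughly how that vocabulary might be fed to Adapt (the entity-type names below are my own invention, not anything shipped with the examples):

from adapt.intent import IntentBuilder
from adapt.engine import IntentDeterminationEngine

engine = IntentDeterminationEngine()

vocab = {
    "ActionKeyword": ["set", "on", "off"],
    "FloorKeyword": ["ground", "first", "second"],
    "RoomKeyword": ["kitchen", "office", "bathroom"],
    "DeviceKeyword": ["heater", "cooler", "ceiling", "table"],
}
for entity_type, words in vocab.items():
    for word in words:
        engine.register_entity(word, entity_type)

automation_intent = IntentBuilder("AutomationIntent")\
    .require("ActionKeyword")\
    .require("FloorKeyword")\
    .require("RoomKeyword")\
    .require("DeviceKeyword")\
    .build()
engine.register_intent_parser(automation_intent)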
Q: How will the Intent Parser handle "Temperature at Twenty"?
The other two phrases (2 and 3) look OK, as the entities are known.
Q: How will the Automation Skill get this number, Twenty, in the JSON before setting the office temperature?
"Twenty" is not a known entity. Do we need to register another keyword with the intent parser?
temperature_keyword = ["one", "two" … "fifty five"] --> Required / Optional
If we make temperature_keyword optional and the Automation Intent Parser detects no temperature value in the phrase, then it will definitely pass null to the skill in the JSON for the temperature value.
Suppose the Automation Skill puts a check on the temperature value (which is null) and asks again for the temperature,
and the user replies "set it at Twenty".
In this case do we need another Temperature Intent Parser to parse the temperature value?
Take another example:
A Voice Calculator
"Calculate Five Thousand Divide by Twenty Five"
Calculator Intent Parser
calci_keyword = ["Calculate", "Calci"] --> Required
operation_keyword = ["Multiply", "Divide", "Plus", "Minus"] --> Required
What about "Five Thousand" and "Twenty Five"?
How will the Intent Parser deal with numbers?
Stock Exchange Intent Parser
"BUY Thousand Share of APPLE Limit two hundred point five"
"SELL Twenty Lot of ORACLE Market"
Adapt doesn't yet deal with numbers! At least, not in any helpful way. There's an open task in our JIRA instance about making datetimes a first-order citizen, and numbers are an obvious additional case. Right now (with the exception of the tokenizer), Adapt doesn't require any localization, and converting from phrases to numerals (and vice versa) is something that will vary from language to language. I don't have a good answer for you right now as to what the correct direction for this is.
In the short term:
You can specify a list of reasonable temperatures that are specific to your skill (twenty is ok, one hundred is death)
You can specify a regex entity that extracts numbers:
"?P(<Temperature>\d+) degrees"
Since (at the moment) this is primarily a command-and-control interface, the vocabulary sets for these skills shouldn't be particularly large.
Good work, but what are the differences between Adapt and Api.ai (https://api.ai/)?
Why not use and improve what already exists?
Regards,
Miguel
Aside from the technical differences, which I wouldn't likely be able to shed too much light on, just quickly having a look at the code on GitHub, I'm unable to find where or how they handle parsing of the requests. Do you know if their intent parser or voice request parser is open source? My first impression is that they provide SDKs for various programming languages, but their implementation isn't actually open source.
The parser includes code that prefers wider parses (covering more of the utterance) over smaller parses. So if "tokyo weather matsuri" is known as being a song title, it should recognize it. Unfortunately, it cannot work out that this is a song title just because it appears in front of the word "song" and after the word "play", as best I can make out.
@Raidptn Adapt is open source and not reliant on a 3rd-party service. Even if https://api.ai offers a free service to open source projects, you are still beholden to their service being online and available for the lifetime of any project you base upon it.
This is a correct interpretation of the code. The "Adapt-y" way of implementing this is to have an index of song titles that you'd expect to recognize. A good implementation would be to get song titles from the user's media library. A less good implementation would be to go to Freebase and get a list of all songs ever written. I do not recommend the latter; people have written a lot of songs.
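A sketch of that approach, assuming a hypothetical get_library_titles() helper that enumerates titles from the user's media library:

from adapt.intent import IntentBuilder
from adapt.engine import IntentDeterminationEngine

def get_library_titles():
    # hypothetical helper: pull track titles from the user's media library
    return ["tokyo weather matsuri", "dancing queen", "take on me"]

engine = IntentDeterminationEngine()
engine.register_entity("play", "PlayKeyword")
for title in get_library_titles():
    engine.register_entity(title, "SongTitle")

play_intent = IntentBuilder("PlaySongIntent")\
    .require("PlayKeyword")\
    .require("SongTitle")\
    .build()
engine.register_intent_parser(play_intent)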
I tried that in my own attempt at doing what Mycroft is attempting. The trouble is that the people who provide track data are not very cooperative. One of my tracks has a title tag value of
"Gimme! Gimme! Gimme! (A Man Af"
And my library has over 3,500 tracks in it (not all ABBA songs). So I had to add an additional table relating a special "pronounced title" string to each track, and populate that with algorithmically simplified title strings. I removed anything after a left paren and converted all punctuation to spaces. Then I was faced with the pronunciation dictionary not having "gimme" in it.
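Something like this, roughly (my own sketch of the steps described):

import re
import string

def pronounced_title(raw_title):
    # drop anything after a left paren, then turn punctuation into spaces
    title = raw_title.split("(", 1)[0]
    title = re.sub("[%s]" % re.escape(string.punctuation), " ", title)
    return " ".join(title.split()).lower()

print(pronounced_title("Gimme! Gimme! Gimme! (A Man Af"))  # gimme gimme gimme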
Yup, these are the problems! I've done the same thing on past projects; music is a particularly dirty data set. The lack of "gimme" in your pronunciation dictionary is something that should be resolved with a high-quality dictation speech recognizer. There's then the issue of using, for example, the English recognizer and trying to recognize the names of German songs.
Long story short, this stuff is hard. But hard is fun!
A way around this is not to try to select individual songs by voice command. Instead, define nicely named playlists for various purposes: "Mycroft, play my Christmas album."
Another approach is to implement a "fuzzy match" algorithm that tries to match the text output by the recognizer against known category strings, using something like a Soundex hash. This requires using specified grammars with wildcards like "Play the song [WORD…]." or using "hamming distance" matching. A general "sounds most like which of these" match filter might be generally useful to several behaviors.
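A tiny sketch of that kind of "sounds most like which of these" filter, using plain edit-distance matching from difflib rather than a true Soundex hash:

import difflib

def best_title_match(heard, known_titles, cutoff=0.6):
    # return the known title closest to what the recognizer heard, or None
    matches = difflib.get_close_matches(heard.lower(),
                                        [t.lower() for t in known_titles],
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None

titles = ["gimme gimme gimme", "dancing queen", "waterloo"]
print(best_title_match("gimmie gimmie gimmie", titles))  # gimme gimme gimme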
My library includes songs with titles like "川の流れのように". And Japanese has phonemes that do not exist in English, and vice versa.
Possible solution:
For this you could go through the music library and try to detect which languages are present (language detection only needs to run once, each time the music library is modified), or obtain the list of languages some other way.
Then whenever the user asks to play a song, the spoken song title could be transcribed with all of the language recognizers in the list of languages the music library contains. The transcription with the highest confidence is then chosen, and that is the title that is searched for.
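In rough pseudo-Python, with transcribe() standing in for whatever per-language recognizer is available (a purely hypothetical helper):

def transcribe(audio, lang):
    # hypothetical: run the recognizer for `lang`, return (text, confidence)
    raise NotImplementedError

def best_transcription(audio, library_languages):
    # run each recognizer the library's languages call for,
    # then keep the transcription the recognizer was most confident about
    results = [transcribe(audio, lang) for lang in library_languages]
    return max(results, key=lambda pair: pair[1])[0]

# e.g. best_transcription(audio, ["en-US", "ja-JP", "de-DE"])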
This would also be nice for people whose first language is not English but who have English music in their music library.
also: "99 Luftballons" ftw
What about a JSON skill parser?
I think we could write most skills with just a few JSON files.
JSON files for entities, one for regexes, and some properties.
That way we could easily write skills without touching any Python.
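Something like this, maybe: a skill described entirely in JSON, with a small loader that feeds it into Adapt (the file layout and field names here are invented for illustration):

import json
from adapt.intent import IntentBuilder
from adapt.engine import IntentDeterminationEngine

# hypothetical weather_skill.json:
# {
#   "name": "WeatherIntent",
#   "entities": {"WeatherKeyword": ["weather", "forecast"]},
#   "regexes": ["in (?P<Location>.*)"],
#   "require": ["WeatherKeyword"],
#   "optionally": ["Location"]
# }

def load_skill(engine, path):
    with open(path) as f:
        skill = json.load(f)
    for entity_type, words in skill.get("entities", {}).items():
        for word in words:
            engine.register_entity(word, entity_type)
    for regex in skill.get("regexes", []):
        engine.register_regex_entity(regex)
    builder = IntentBuilder(skill["name"])
    for required in skill.get("require", []):
        builder.require(required)
    for optional in skill.get("optionally", []):
        builder.optionally(optional)
    engine.register_intent_parser(builder.build())

engine = IntentDeterminationEngine()
load_skill(engine, "weather_skill.json")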
Edit: maybe not a good idea…
Hi, having fun with Adapt.
Does it support UTF-8 well? I tried to put an "é" in a keyword and I get a weird issue in the JSON output:
{
    "Delay": "10 minutes",
    "intent_type": "timerIntent",
    "confidence": 0.4827586206896552,
    "timerKeyword": "pr\u00e9viens",  ---- instead of "préviens"
    "target": null
}
Another question: is there a French tokenizer? For words such as "j'ajoute", I would like "ajoute" to be a word (a keyword, actually) but it doesn't work =)
I think a French tokenizer would be pretty similar to the English one except for this apostrophe rule (which has exceptions, such as words like "aujourd'hui").
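A very small sketch of what that apostrophe rule might look like (the exception list is mine, not anything from Adapt):

import re

ELISION_EXCEPTIONS = {"aujourd'hui"}  # apostrophes that are not split points

def tokenize_fr(utterance):
    tokens = []
    for word in utterance.lower().split():
        if word in ELISION_EXCEPTIONS:
            tokens.append(word)
        else:
            # "j'ajoute" -> ["j'", "ajoute"], so "ajoute" can match as a keyword
            tokens.extend(t for t in re.split(r"(\w+')", word, maxsplit=1) if t)
    return tokens

print(tokenize_fr("j'ajoute aujourd'hui"))  # ["j'", 'ajoute', "aujourd'hui"]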
Another question:
I tried registering this regex entity:
engine.register_regex_entity("(?P<NumericValue>10)%")
But it doesn't work as it should, any idea?