Intent parsing - word after the wake word

mike99mac · June 15, 2024, 5:18pm

I’m battling with intent parsing. Until recently, my focus has been on music requests. The fallback has been “question answering” skills backed by Wikipedia, Wolfram, etc., but often that’s not what the user requested, and then “barge in” is needed to stop the answer. Now I’m working on IT requests and am having difficulty getting the new skill called when it’s needed.

Could we train the user on the importance of the word immediately following the wake word? For example:

  <wake word> music <music request>
  <wake word> question <general question>
  <wake word> IT <IT request>

I believe users would be OK with that, and they would get more consistent results. Thoughts?
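A minimal sketch of what that could look like in Python. This is not Minimy or OVOS code; `ROUTES`, `route_utterance`, and the skill names are hypothetical, and it assumes the wake word has already been stripped from the utterance.

```python
# Hypothetical routing table: first word after the wake word -> skill name.
ROUTES = {
    "music": "music_skill",
    "question": "qa_skill",
    "it": "it_skill",
}


def route_utterance(utterance, routes=ROUTES, default="fallback"):
    """Return (skill_name, remainder) based on the first word of the utterance."""
    words = utterance.strip().split(maxsplit=1)
    if not words:
        return default, ""
    # Normalize the candidate keyword: lowercase, drop trailing punctuation.
    first = words[0].lower().strip(",.:;!?")
    if first in routes:
        remainder = words[1] if len(words) > 1 else ""
        return routes[first], remainder
    # No keyword: hand the whole utterance to the default handler.
    return default, utterance.strip()


print(route_utterance("music play track hey jude by artist the beatles"))
print(route_utterance("IT how many of my servers have 4 CPUs"))
print(route_utterance("what time is it"))
```

One thing the sketch makes visible: once the utterance is lowercased, the keyword “IT” is indistinguishable from the pronoun “it”, so an utterance that simply starts with “it” would also be routed to the IT skill. A different keyword might be needed there.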

j1nx · June 15, 2024, 6:48pm

I don’t really understand the problem you described, but I do understand the solution and think it is very useful indeed.

Let the assistant know what is coming and, with that, give priority to certain types of skills or even disable certain skills.

I can imagine asking a question about music. In that case you don’t want OCP to win the intent. You want a spoken answer, not music.

mike99mac · June 15, 2024, 7:20pm

Here’s an example. I ask “Computer … how many of my servers have 4 CPUs”. If there is a CMDB of all servers in the organization, an answer should be forthcoming. My problem is getting the intent parser to call the new IT skill.

Another example is “Computer … play track hey jude by artist the beatles”, but there’s no hit on the music playing skill and the fallback is question answering. The user, who wanted music, is not happy about the history of “Hey Jude”.

I’ve been mainly focusing on Minimy as I patiently wait for the new ovos-media architecture. Minimy does not have the voting algorithm, but it shouldn’t be too hard to prototype it to hinge on the word immediately following the wake word.

Watch this space …

JarbasAl · June 15, 2024, 7:55pm

I ask “Computer … how many of my servers have 4 CPUs”. If there is a CMDB of all servers in the organization, an answer should be forthcoming. My problem is getting the intent parser to call the new IT skill.

if you share your skill we can have a look and help out

Another example is “Computer … play track hey jude by artist the beatles”, but there’s no hit on the music playing skill and the fallback is question answering.

that sounds like a lack of proper music skills to find results, as common_qa happens after the OCP queries. if no skill reports that it can handle your query there is nothing OVOS can do; ovos-core only identifies the query as one that wants to search your installed music skills

are you on latest ovos-core too? “play XXX” should either play something or speak an error, not go to common_qa

The user, who wanted music, is not happy about the history of “Hey Jude”.

but it’s also weird that common_qa is trying to answer that question, as it should ignore any utterances without “what”/“how”/“when”/“who”, and “play XXX” doesn’t have any of those.

if you can share details on the specific utterances and installed skills it would help to figure out if there’s an issue or not

JarbasAl · June 15, 2024, 8:06pm

but it’s also weird that common_qa is trying to answer that question, as it should ignore any utterances without “what”/“how”/“when”/“who”, and “play XXX” doesn’t have any of those.

I actually just gave you wrong info in that last paragraph; here is a PR making what I said above true!
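For illustration only, the kind of check described above could be sketched like this. It is not the code from the PR or from ovos-core; `QUESTION_WORDS` and `looks_like_question` are hypothetical names, and the word list is just the one quoted in this thread.

```python
# Question words taken from the discussion above; a real list would likely be longer.
QUESTION_WORDS = {"what", "how", "when", "who"}


def looks_like_question(utterance):
    """Return True if any word of the utterance is one of the question words."""
    words = (w.strip("?,.!") for w in utterance.lower().split())
    return any(w in QUESTION_WORDS for w in words)


# A question-answering fallback could skip utterances that fail this check.
print(looks_like_question("play track hey jude by artist the beatles"))
print(looks_like_question("how many of my servers have 4 CPUs"))
```

A plain word match like this is crude: a request such as “play the who” would still pass the check, so it only narrows the cases where question answering takes over.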

mike99mac · June 16, 2024, 10:57am

@JarbasAl - thanks for a thorough reply. But this code is Minimy, not OVOS or Neon: GitHub - mike99mac/minimy-mike99mac: simple nlp based voice assistant framework

mike99mac · June 16, 2024, 11:01am

OK, I will try a fresh build (a new RasPi 5 arrived last week :))

NeonDaniel · June 17, 2024, 5:04pm

mike99mac:

Could we train the user on the importance of the word immediately following the wake word? For example:
  <wake word> music <music request>
  <wake word> question <general question>
  <wake word> IT <IT request>
I believe users would be OK with that, and they would get more consistent results. Thoughts?

I think that could be an interesting option in the intent pipeline… Something like a hint for the intent service to know what service or skill the user is trying to ask about. It reminds me of Google Assistant (at least as it was when I used it years ago) where you would ask “Okay Google, ask to ” or “Okay Google, talk to ”…

A skill that implements this would basically need to:

Always respond since there will be no fallback handling
Define keywords it triggers on
Define some priority to handle conflicts, similar to CommonPlay and CommonQuery
Define some standard method for disabling this behavior, either per-skill, globally, or both
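The list above could be sketched roughly as follows. This is only an illustration of the idea, not an existing OVOS or Neon API: `HintedSkill`, `HintRouter`, and the "lower number wins" priority convention are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class HintedSkill:
    name: str
    keywords: set          # words that trigger this skill when spoken first
    priority: int = 50     # assumed convention: lower number wins a conflict
    enabled: bool = True   # per-skill switch


class HintRouter:
    def __init__(self, enabled=True):
        self.enabled = enabled  # global switch for the whole behavior
        self.skills = []

    def register(self, skill):
        self.skills.append(skill)

    def match(self, utterance):
        """Return (skill, remainder), or None if no hint keyword applies."""
        if not self.enabled:
            return None
        parts = utterance.strip().split(maxsplit=1)
        if not parts:
            return None
        hint = parts[0].lower()
        candidates = [s for s in self.skills if s.enabled and hint in s.keywords]
        if not candidates:
            return None
        # Resolve conflicts between skills sharing a keyword by priority.
        best = min(candidates, key=lambda s: s.priority)
        remainder = parts[1] if len(parts) > 1 else ""
        return best, remainder


router = HintRouter()
router.register(HintedSkill("local_music", {"music"}, priority=10))
router.register(HintedSkill("radio", {"music", "radio"}, priority=20))
skill, rest = router.match("music play hey jude")
print(skill.name, "|", rest)
```

The sketch covers keywords, priority, and the two disable switches. The first point, always responding, is not enforced here; the matched skill would have to speak either a result or an error itself, since nothing falls through to a fallback once a hint has matched.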