1. Voice User Interface
- In short: a wake word (trigger) followed by a command that
our app executes to control the video player.
- According to a report, 1 in 4 adults in the US owns a smart
speaker today, and more than half of smart speaker owners in the US
use their devices daily.
- 71% of users prefer searching by voice over typing their queries.
- So we can assume people are comfortable using voice UI.
- But voice UI has some disadvantages, such as privacy concerns
and misinterpretation.
1) Solving privacy issues first
I did some research into how voice commands work technically
and found an offline (no cloud) option: instead of posting
microphone data to servers, we process it on the user's phone
using the Porcupine Wake Word engine and the Rhino
Speech-to-Intent engine.
e.g. - "Vangio, play"
- In this phrase, Porcupine identifies the wake word “Vangio” and
Rhino infers the intent of the command that follows. Rhino uses an
embedded grammar to determine the meaning of the command.
- The user can set the wake word and the commands for play, pause,
and jumping a given number of seconds/minutes before/after.
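
The wake word plus command flow described above can be sketched roughly as follows. This is only an illustration of the two stages: it works on already-transcribed text rather than audio, so it does not use the Porcupine or Rhino APIs. The names `VoiceSettings`, `Intent`, and `parse_command`, the exact command phrasing, and the reading of "before" as rewinding are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceSettings:
    # User-configurable wake word and command words (hypothetical defaults).
    wake_word: str = "vangio"
    play_word: str = "play"
    pause_word: str = "pause"


@dataclass
class Intent:
    action: str              # "play", "pause", or "seek"
    offset_seconds: int = 0  # signed offset, only meaningful for "seek"


# Matches e.g. "10 seconds before" or "2 minutes after".
_SEEK = re.compile(r"^(\d+) (second|minute)s? (before|after)$")


def parse_command(utterance: str, settings: VoiceSettings) -> Optional[Intent]:
    """Return the intent of an utterance, or None if it is not understood."""
    words = utterance.lower().replace(",", " ").split()
    # Stage 1: ignore anything that does not start with the wake word.
    if not words or words[0] != settings.wake_word:
        return None
    command = " ".join(words[1:])
    # Stage 2: infer the intent from the command that follows.
    if command == settings.play_word:
        return Intent("play")
    if command == settings.pause_word:
        return Intent("pause")
    match = _SEEK.match(command)
    if match:
        amount, unit, direction = match.groups()
        seconds = int(amount) * (60 if unit == "minute" else 1)
        # Assumption: "before" means jump backwards, "after" forwards.
        return Intent("seek", -seconds if direction == "before" else seconds)
    return None
```

With the default settings, `parse_command("Vangio, play", VoiceSettings())` yields a play intent, while an utterance without the wake word yields `None` and is ignored.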
2) Secondly, misinterpretation can be tackled in two ways: first,
the app would not allow custom commands that sound the same
(homophones like their and there, ad and add, etc.), and secondly,
if a misinterpretation happens, the app will give error voice
feedback to the user.
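
A minimal sketch of these two safeguards might look like the following. The homophone table here is a tiny hand-written stand-in (a real app would likely need a pronunciation dictionary or phonetic comparison), and the function names and feedback wording are hypothetical.

```python
from typing import Iterable, Optional

# Tiny hand-written homophone groups for illustration only; a real app would
# need a pronunciation dictionary or phonetic comparison instead.
HOMOPHONE_GROUPS = [
    {"their", "there"},
    {"ad", "add"},
]


def sounds_like(a: str, b: str) -> bool:
    """True if the two words are identical or share a homophone group."""
    a, b = a.lower(), b.lower()
    if a == b:
        return True
    return any(a in group and b in group for group in HOMOPHONE_GROUPS)


def validate_custom_command(new_word: str, existing: Iterable[str]) -> Optional[str]:
    """Return an error message if new_word clashes with an existing command."""
    for word in existing:
        if sounds_like(new_word, word):
            return f"'{new_word}' sounds like the existing command '{word}'."
    return None


def spoken_feedback(understood: bool) -> Optional[str]:
    """Text to read back to the user when a command was not understood."""
    return None if understood else "Sorry, I did not understand that command."
```

Here `validate_custom_command("add", ["play", "ad"])` returns an error message, so the settings screen could reject the new command before it is saved, and `spoken_feedback(False)` gives the text the app would speak after a failed recognition.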