
Process Modeling Meets Voice Interaction

Talking to Model Processes

“Alexa, turn on Netflix”.

Nowadays anyone, even without technical skills, can talk to an intelligent voice assistant and feel that technology is part of their daily life. Assistants like Alexa can play music, provide information, deliver news and sports scores, tell you the weather, control your smart home, and even order products online.

There are two main factors behind the success of companies like Amazon in our daily lives. On one hand, coverage: they provide a new interaction paradigm that anyone can use, no matter their age, race, or religion. On the other hand, predictability: the recognition of pre-defined sentences can be anticipated, and the expected reaction planned ahead. These reactions need not be simple; for instance, sentences like “Alexa, play Nothing Else Matters”, “Alexa, ask Uber to request a ride”, or “Alexa, turn on the light” will each trigger a very different reaction.
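To make predictability concrete, here is a minimal sketch in Python of how pre-defined sentences could map to planned reactions. The sentences, reactions, and the `react` helper are illustrative assumptions, not how Alexa actually works:

```python
# A toy reaction table: every sentence is known in advance, so both its
# recognition and its reaction can be prepared ahead of time.
REACTIONS = {
    "play nothing else matters": "Streaming the song on the default speaker.",
    "ask uber to request a ride": "Forwarding the request to the ride service.",
    "turn on the light": "Sending an 'on' signal to the smart bulb.",
}

def react(utterance: str) -> str:
    """Look up a recognized utterance and return its planned reaction."""
    return REACTIONS.get(utterance.lower().strip(), "Sorry, I can't do that yet.")

print(react("Turn on the light"))  # -> Sending an 'on' signal to the smart bulb.
```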

In Stanley Kubrick’s 1968 movie “2001: A Space Odyssey”, Dave interacts fluently with HAL 9000, and the computer is able to reason and react accordingly (sometimes in ways that do not follow Dave’s intentions). I remember the first time I watched Kubrick’s movie, back in the mid-nineties: I was sure that technology in the future would be like this. So far, though, human-machine voice interaction technology has not lived up to Kubrick’s vision.

There is still a long way to go before we have a computer with fully gadgetless interaction capabilities like HAL 9000. Nevertheless, it is time to bring voice interaction beyond leisure applications. At Process Talks, we have developed a process modeling platform that, among other fresh functionalities, lets the modeler build the model just by speaking their intentions to our voice assistant. See this video, where we show how easy it is to use voice interaction in Process Talks:

Speaking to Process Talks is as simple as sending a voice note in chat applications like WhatsApp: just press and hold the microphone button and speak in one of our supported languages, all without leaving the collaborative modeling session. Your voice commands then travel safely encrypted through the network and are handed to our reasoner engine, which decides the best way to fit them into your process model.
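As a rough illustration of that round trip, the sketch below uploads a recorded voice note over HTTPS. The endpoint URL, field names, and response shape are hypothetical placeholders, not our actual API:

```python
import requests

# Hypothetical endpoint and payload layout, for illustration only.
API_URL = "https://api.example.com/voice-commands"

def send_voice_command(audio_path: str, session_id: str, language: str = "en") -> dict:
    """Upload a recorded voice note over HTTPS, so it is encrypted in transit."""
    with open(audio_path, "rb") as audio:
        response = requests.post(
            API_URL,
            files={"audio": audio},
            data={"session": session_id, "language": language},
            timeout=30,
        )
    response.raise_for_status()
    # The reasoner's answer, e.g. the edit it decided to fit into the model.
    return response.json()
```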

Understanding human language via voice brings many new challenges compared to text, and at Process Talks we are still working on improving our speech recognition. One of the biggest challenges is capturing open input, i.e., speech with undefined fragments where users can say anything they think of. In process modeling, the clearest example of open input is activity names: process modeling is such a widespread practice that it is very difficult to anticipate the language people will use when describing their activities. We currently sidestep this problem by assuming that elements like activities have been added previously; voice interaction then refers to them by their identifying number, a predictable voice snippet that our speech recognizer can easily understand.

Notice how, for Alexa, utterances like “Nothing Else Matters” are not really open input: the speech recognition technology has been pre-trained to understand this popular three-word sequence, corresponding to the amazing song by Metallica. Still, Metallica alone has some 151 songs, so how can Alexa understand all of them? The trick is simple: one would expect people to remember only the 5-10 most popular Metallica songs, so speech recognition training is done only for that selection.
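The sketch below shows the identifying-number workaround in miniature: commands refer to activities by number, so every utterance stays within a small, closed grammar. The command verbs and the parser are assumptions for illustration, not our actual command grammar:

```python
import re

# One predictable command shape: activities are referenced by number,
# never by their free-form (open input) names.
COMMAND = re.compile(
    r"(?P<verb>connect|delete|rename) activity (?P<src>\d+)"
    r"(?: (?:to|with) activity (?P<dst>\d+))?",
    re.IGNORECASE,
)

def parse(utterance: str):
    """Return the parsed command fields, or None if nothing matched."""
    match = COMMAND.fullmatch(utterance.strip())
    if match is None:
        return None
    return {k: v for k, v in match.groupdict().items() if v is not None}

print(parse("connect activity 3 to activity 5"))
# -> {'verb': 'connect', 'src': '3', 'dst': '5'}
```

Because the grammar is closed, the recognizer never has to guess an arbitrary activity name; it only needs to tell a few verbs and digit sequences apart.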

We plan to explore how voice interaction can augment modelers working alone, but especially in the collaborative sessions that arise from a discovery workshop. We currently support voice commands in English and Spanish; depending on adoption and demand, we may incorporate other languages as well.

A complete, professional implementation of voice modeling for BPMN 2.0 could bring in unprecedented adopters. For instance, with such a technology, people with visual or motor impairments may, for the first time, be able to create BPMN diagrams smoothly. The opportunities are countless.

Dare to try? Please schedule a demo with us at this link.

Photo by Stephen Harlan on Unsplash
