Alexa (also) whispers
I’m the kind of person who needs some background radio noise (typically a sports radio program) to fall asleep, so I have an alarm clock that I can program to switch off after X minutes, when hopefully I am already asleep. One month ago, however, my alarm clock radio stopped working, and I found myself buying and returning several alarm clock devices of different brands. Why? Simply because the lowest sound level on most of those devices was so high that I wasn’t able to fall asleep at all (nor was my wife, who typically uses ear plugs to block the sound).
So after my last failed attempt, I gave Alexa a try: I took the Echo Dot from our living room and moved it to my bedroom. I programmed the alarm clock to wake me up in the morning, and started using the alarm clock radio at night with my favourite sports program (well, favourite is not the word; simply the one that is so boring that I easily fall asleep). After launching it, I simply need to say “Alexa, in 20 minutes stop the program”.
Since our bedroom is next to our son’s bedroom, the first night I gave Alexa the instruction in a whisper. I was genuinely shocked by Alexa’s reaction: it whispered its answer back to me. It was a bit creepy at the beginning (just try it!), but once I got used to it I actually loved it, and now whispering commands to Alexa has become my before-bed routine.
To me this whispering capability is quite interesting: how is the Automatic Speech Recognition (ASR) adapted for this input? Is it a simple adaptation of the models learnt for normal speech, or does the whole ASR stack have to be retrained on purpose for whispered input? And if such training is necessary, what prompted the product designers at AWS to incorporate this communication modality, which doesn’t seem really crucial? (You could simply lower Alexa’s volume to signal that you want a less noisy interaction.)
But actually, what really amazes me is that Alexa’s technology has hidden capabilities that pop up whenever the user feels like they should be there. What made me actually talk to Alexa in a whisper? I believe it was the trust I have in the technology behind it to be able to understand a whisper. And now that trust has only grown, thanks to the surprise of being answered with a whispered output.
Building technology to be used by people is not an easy task at all; you are moving between the user needs (product market fit) and whatever comes to mind that you believe is a nice feature – which, if you are a team like ours at Process Talks that comes from academia, is a huge risk. Product market fit is a necessary element of any viable business, since otherwise you will be creating software that nobody wants, and therefore, nobody buys.
In parallel to developing technology guided by product market fit, any software company needs to build trust in its technology. I would not say it is a secondary dimension, less important than product market fit. Instead, I believe it is a long-term goal that should be considered from the early stages and cannot be relegated to later. Because when the product is ready, other metrics (like churn) will become very important, and the trust you build in the software–user relationship will definitely contribute positively to them.
From now on, it will be hard for me not to whisper to any of my future alarm clock radios…