What’s the matter with Alexa? When it comes to the age-in-place elderly, nothing that a resetting of expectations can’t fix.
To be clear, we like Alexa. In fact, the next release of our firmware will include Alexa integration. However, it seems like every week we run into someone who says “Isn’t Alexa going to do everything that ONKÖL does?” Ummm…..no. And when we ask that person to be specific about what they think Alexa will do for them, they can’t really answer. They just know that voice-first technologies are all the rage right now, and are a bit of a darling in the elderly segment. From where we sit, when applied to the age-in-place elderly, Alexa currently sits at the “Peak of Inflated Expectations” on the Gartner Hype Cycle.
During our field testing of Alexa integration, we found several issues, such as imperfect speech recognition. But the one issue that particularly applies to the elderly segment is what we call the "Invocation Word" problem. This requires a bit of background, so bear with me.
In its current form, Alexa can operate in one of three modes. First, there is what we refer to as "built-in" mode, whereby after saying “Alexa”, the smart speaker acts like a search engine, performing much the same functions as a Google search. These are functions that work the moment you plug Alexa in and link it to your Amazon account. "Alexa, what time is it?" would be an example. Alexa knows how to respond to that, and accepts a pretty broad range of phrasings for the question. For example, you can say "Alexa, can you tell me what time it is?" and get the same result. Not surprisingly, Alexa is also pretty good at placing orders from Amazon in this mode.
Then there is "API" mode, which is based on functions that Amazon provides for third-party vendors to leverage. The most popular is the Smart Home Skill API, where a vendor can write some custom code (what Amazon calls a "Skill") that allows Alexa to control the vendor's smart home devices. If you are a thermostat manufacturer, you can write a Skill so that when a client says "Alexa, set temperature to 72", it will automatically connect to your control environment and make the change. There is a similar API for Home Entertainment devices like cable boxes and streaming services.
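To make the thermostat example concrete, here is a minimal sketch of what a vendor's Smart Home Skill handler looks like. The directive structure follows the shape of Amazon's Smart Home Skill API; the `handle_directive` function and the `set_thermostat` backend callback are hypothetical stand-ins for a vendor's actual integration, not Amazon's code.

```python
# Hedged sketch: Alexa translates "set temperature to 72" into a JSON
# "directive" and delivers it to the vendor's handler. The vendor only
# sees directives Amazon has defined; anything else never arrives.

def handle_directive(event, set_thermostat):
    """Route an incoming Smart Home directive to the vendor's backend.

    `set_thermostat` is a stand-in for the vendor's own control
    environment (e.g. a call into their cloud or hub).
    """
    header = event["directive"]["header"]
    if (header["namespace"] == "Alexa.ThermostatController"
            and header["name"] == "SetTargetTemperature"):
        setpoint = event["directive"]["payload"]["targetSetpoint"]
        # Hand off to the vendor's own control environment.
        set_thermostat(setpoint["value"], setpoint["scale"])
        return {"status": "ok", "setpoint": setpoint["value"]}
    # Directives outside the namespaces the vendor implements are
    # simply not handled -- mirroring why a gas valve can't be driven
    # through this API today.
    return {"status": "unsupported"}
```

Note that the vendor never parses the spoken phrase itself; Amazon decides which phrases map to which directives, which is exactly the limitation described next.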
While the APIs have the advantage of requiring only the phrase “Alexa” followed by the command, each API is limited in the phrases it understands and the devices or services it controls, both of which are determined by Amazon. For example, if I say “Alexa, set the air to 72”, it responds “Sorry, I don’t know how to do that” because Amazon doesn’t yet recognize “set the air” as meaning “set the temperature.” If I say "Alexa, turn off my gas valve", it will respond similarly, since gas valves are not part of the Smart Home Skill API. During our testing, Alexa seemed to have a higher chance of getting confused or stalling out in this mode. For example, sometimes asking Alexa to change the channel on our test Fios box worked, and sometimes it didn't. That said, Amazon continuously adds new phrases (known as “utterances”) and devices that the APIs accept, so this will get better over time.
In terms of expectations, the real problem with API mode is that folks assume everything attached to Alexa will work that way. Specific to us, we regularly run into clients who say “I can control my Hue lightbulb simply by saying ‘Alexa, turn off the living room light.’ Won’t I be able to do the same for the healthcare functions that ONKÖL currently performs?” The answer is “No”: Alexa literally has no idea what you are talking about. To enable such functionality, you have to look to Alexa’s third mode of operation.
The third option is for a vendor to create a free-standing Custom Skill that is specific to their needs and doesn’t rely on the Amazon-provided APIs. These Skills have the advantage that the utterances Alexa understands can be customized by the vendor. Unfortunately, the only way for Alexa to invoke such a Skill is for the user to say a specific "invocation word" to kick it off. Something like "Alexa, ask ESPN to....." or "Alexa, tell MedApp to..." will run the ESPN or MedApp Skill, respectively. The vendor then has to anticipate all of the utterances or requests the Skill will accept, which by their very nature are initially incomplete.
So instead of the command being “Alexa [flexible command]”, you have “Alexa, [Invocation Word], [specific utterance anticipated by the vendor]”. The combination of the required Invocation Word and the need to match specific utterances can lead to problems, particularly for clients with working memory issues. In addition — and this can be a real issue — there are certain utterances and invocations that Amazon reserves specifically for itself, so they can’t be used in Custom Skills by third-party vendors. We ran into this problem first-hand, where there were certain utterances we wanted to include in our Custom Skill, but couldn't because Amazon had already placed a claim on them for other uses.
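To see why this trips up users, here is an illustrative sketch of a Custom Skill's interaction model, expressed as a Python dict in the shape of the JSON a vendor submits to Amazon. The invocation name ("med app"), the intent name, and the sample utterances are all hypothetical, not our actual Skill; the toy matcher below simply shows that only phrasings the vendor listed in advance will match.

```python
# Hypothetical fragment of a Custom Skill interaction model. Alexa
# only routes to this Skill if the user first says the invocation
# name ("Alexa, ask med app ..."), and then only matches utterances
# the vendor anticipated.

INTERACTION_MODEL = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "med app",   # the required Invocation Word
            "intents": [
                {
                    "name": "CheckMedsIntent",
                    # Every phrasing must be listed up front; anything
                    # the vendor didn't anticipate simply won't match.
                    "samples": [
                        "did i take my pills today",
                        "have i taken my medication",
                    ],
                },
            ],
        }
    }
}

def matches_skill(utterance):
    """Toy matcher: does an utterance hit one of the listed samples?"""
    lm = INTERACTION_MODEL["interactionModel"]["languageModel"]
    samples = lm["intents"][0]["samples"]
    return utterance.lower() in samples
```

A client who says "did I remember my pills" instead of "did I take my pills today" gets nothing, which is precisely the working-memory burden described above. (Amazon's real matching is more forgiving than exact string comparison, but the vendor-anticipated sample list is still the starting point.)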
The net result is that the current Alexa use cases for the age-in-place elderly are narrower than people imagine. This doesn’t mean Alexa isn’t useful. It is. And frankly, we are darn excited about the specific integrations we will be providing on that platform. That said, we think it’s important for folks to set their expectations accordingly and not fall for the hype. Alexa simply isn’t going to be the solution to everything. And it certainly doesn't slice bread...yet.
In a future post, I’ll talk about how Alexa can work for Personal Emergency Response Services (PERS), and why it’s highly unlikely Amazon will launch such a solution in the near-term.