Apple’s Siri team has published a new Machine Learning Journal entry that details some of the process behind making voice-activated ‘Hey Siri’ respond only to its owner’s voice. Apple documented part of the process behind voice-activated Siri in general last fall, and the first Machine Learning Journal entry of this year focuses on the challenge of speaker recognition.

As referenced in the previous entry, Apple says the phrase ‘Hey Siri’ was chosen in part because a number of users were already using it naturally when activating Siri with a hardware button.

The new entry describes three challenges with activating Siri by voice: the main user saying a phrase similar to ‘Hey Siri’, another user saying ‘Hey Siri’, or another user saying a similar phrase.
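The three cases can be laid out as a small table. The sketch below is only an illustration of the reasoning, not Apple's implementation: the labels and function names are hypothetical, and it assumes an idealized speaker check that accepts only the main user. It shows which unwanted activations such a check could reject.

```python
# Hypothetical labels for the three unwanted-activation cases;
# none of these names come from Apple's entry.
CASES = [
    ("main user", "similar phrase"),
    ("other user", "Hey Siri"),
    ("other user", "similar phrase"),
]


def passes_speaker_check(speaker: str) -> bool:
    # Assumption: an idealized speaker check accepts only the main user.
    return speaker == "main user"


def remaining_false_activations(cases):
    # Cases that a speaker check alone cannot reject.
    return [case for case in cases if passes_speaker_check(case[0])]


remaining = remaining_false_activations(CASES)
prevented = [case for case in CASES if case not in remaining]

print(f"{len(prevented)} of {len(CASES)} cases rejected by the speaker check")
print("still possible:", remaining)
```

Under these assumptions, the only case left over is the main user saying something that merely sounds like the trigger phrase, which is a problem for the phrase detector rather than for speaker recognition.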

By limiting activation to the main user’s voice, the design ideally prevents two of those three issues, namely the two that involve another speaker. The entry scratches the surface of how Apple approaches that problem.

As with each Machine Learning Journal entry, the piece then takes a relatively detailed look at Apple’s implementation before touching on the unsolved problems with the feature: using Hey Siri in a noisy environment or large room.

As the piece notes, voice-activated Siri debuted with the iPhone 6, although the original version only worked when the device was charging. Today, Hey Siri works on new iPhones, iPads, and Apple Watches without charging, and it’s the primary controller for HomePod. In the future, the same Hey Siri feature may be how we interact with AirPods as well.

The full entry — which is based on research submitted for the International Conference on Acoustics, Speech, and Signal Processing — offers a rare close look at the amount of thinking behind a feature that hopefully feels natural to the user.
