What does Amazon know about its users thanks to its “smart” voice assistant Alexa, and how does the company handle the many recorded audio files? As the computer magazine c’t reveals in its latest issue, the US company sent voice recordings made in one user’s bathroom and bedroom to a complete stranger. Amazon says it was a mistake.
c’t writes that a user had requested a copy of his personal data from Amazon under the new EU General Data Protection Regulation (GDPR). Two months after the request, made at the beginning of August, he received it in the form of a download link to a 100 MB zip file. The data set contained not only his tracked search history, but also around 1,700 WAV files, i.e. voice recordings, along with a PDF of the transcripts Alexa had apparently derived from them. According to the user, however, he had no Alexa-enabled device at home.
Commands to shower thermostat and alarm clock recorded
The user received no reply to his message to Amazon pointing out that he had apparently been sent another user’s voice recordings. The download link, however, stopped working a short time later. The user, who had saved the data, then contacted the editors at c’t. They managed to locate the Alexa user whose voice recordings Amazon had sent to the unauthorized person.
Among other things, the software had recorded commands to a shower thermostat and an alarm clock as well as public-transport timetable queries, and with them places, first names and, in one case, a surname. The editors then contacted the person concerned. “He swallowed audibly when we told him which of his highly private data Amazon.de had passed on to a stranger,” the magazine writes.
c’t confronted Amazon with the case, but without mentioning that the affected customer had been identified and contacted. The magazine wanted to know whether Amazon had reported the incident to those affected and to the data protection authorities, as it is required to. Amazon did not answer these questions directly; the company attributed the incident to a “human error”, claimed to have resolved the problem with both customers and said it was “in the process of improving the relevant processes”. Three days after c’t’s inquiry, both affected customers received a call from Amazon employees: according to the company, the mix-up happened because both had submitted GDPR requests at about the same time.
Amazon stores data to train its AI
Indeed, Amazon explains in its data protection FAQ that voice recordings are not deleted promptly, but are stored in order to continuously improve Amazon’s artificial intelligence. Customers can review the recordings and delete them individually or all at once. The question, however, is how many users know about and actually use this option at amazon.de/alexaprivacy.
In addition, the question remains whether this data breach was really a one-off and whether the data protection authorities will now impose sanctions on Amazon. Companies are legally obliged to report data breaches to a supervisory authority within 72 hours. If this requirement is violated, the data protection authorities can impose a fine, which under the GDPR can amount to up to four percent of the company’s annual turnover. With annual sales of $177.9 billion in 2017, that would likely run into the billions.
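To put that in perspective, as a rough back-of-the-envelope figure assuming the maximum rate were applied to the full 2017 revenue: four percent of $177.9 billion works out to roughly $7.1 billion.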