Mycroft Community Forum

Deleted data is the most secure

Originally published at: http://mycroft.ai/blog/deleted-data-is-the-most-secure/

Last week, our team became aware that utterances from Mycroft users were being retained for longer than intended. Those utterances have since been deleted, and our team has conducted a review of the circumstances.

While evaluating the voice queries processed by our servers, our team came across a discrepancy between the numbers reported by the database and the number of files stored on the server. We discovered a bug that was preventing the deletion of audio queries and their transcriptions from users who had not opted to share their data.

The team immediately identified the cause and patched the system to ensure no further queries would be retained unless a user has explicitly opted in to share their data with us. All files from users who had not opted in were then securely deleted, and all backups were double-checked to ensure that any copies of these files had been purged.

All of these files have now been permanently deleted and we have maintained no records of their contents. We have also verified that none of these files had been accessed since they were created. That includes access by any Mycroft employee or automated process.

To reduce the chance of an issue like this recurring, the team is working on additional procedures to independently verify that any data being retained has the explicit permission of the user. This will make reviews of our data systems quicker and more robust.
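
As an illustration only, a check of this kind could be sketched in Python as below. This is not Mycroft's actual implementation: the function names, the mapping of file paths to user IDs, and the set of opted-in users are all hypothetical, chosen to show the idea of comparing what is stored against what users have permitted.

```python
from typing import Iterable, List, Mapping


def find_unpermitted_files(
    file_owners: Mapping[str, str],
    opted_in_users: Iterable[str],
) -> List[str]:
    """Return stored files whose owner has not opted in to data sharing.

    file_owners maps a stored file path to the ID of the user it came from
    (a hypothetical data shape, assumed for this sketch).
    """
    permitted = set(opted_in_users)
    return sorted(
        path for path, owner in file_owners.items() if owner not in permitted
    )


def count_discrepancy(database_record_count: int, files_on_disk: int) -> int:
    """Return how many more files exist on disk than the database reports."""
    return files_on_disk - database_record_count


# Hypothetical example data: three stored files, one opted-in user.
example_files = {"a.wav": "user-1", "b.wav": "user-2", "c.wav": "user-3"}
example_opted_in = {"user-1"}

unpermitted = find_unpermitted_files(example_files, example_opted_in)
extra_files = count_discrepancy(database_record_count=1, files_on_disk=len(example_files))
```

A non-empty `unpermitted` list, or a non-zero `extra_files` count, would signal that something is being retained without permission and needs investigating.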

Tens of thousands of people trust us with their data, and we take that very seriously. We will always be open and transparent about any scenario that might diminish this trust. Whilst this data was never accessed, you rightly expect that your data will not be retained for any longer than is necessary to complete the request. This is why our standard practice is to delete any data once it has fulfilled the purpose for which it was collected. In the case of voice queries, this is immediately after the request is completed.

We are committed to the security and privacy of all our users’ data, and the most secure data is the data that you don’t have. No software system is perfect, but we set ourselves a high standard and will continue to seek out and fix any potential concerns.
