Amazon Transcribe can now automatically redact personally identifiable information

Amazon Transcribe, the AWS-based speech-to-text service, launched a small but important new feature this morning that, if implemented correctly, can automatically hide your personally identifiable information from call transcripts.

One of the most popular use cases for Transcribe is to create a record of customer calls. Almost by default, that involves exchanging information like your name, address or a credit card number. In my experience, some call centers stop the recording when you’re about to exchange credit card numbers, for example, but that’s not always the case.

With this new feature, Transcribe can automatically identify information like a Social Security number, credit card number, bank account number, name, email address, phone number and mailing address, and redact it. The tool automatically replaces this information with ‘[PII]’ in the transcript.
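For developers, turning this on should amount to one extra setting on a transcription job. The following is a minimal sketch, assuming the boto3 SDK and its `start_transcription_job` call with a `ContentRedaction` parameter. The job name, the S3 URI and the helper function names are hypothetical, and the `en-US` default is an assumption; check which languages support redaction before relying on it.

```python
from typing import Any, Dict


def build_redacted_job_request(
    job_name: str,
    media_uri: str,
    language_code: str = "en-US",  # assumption: confirm redaction support for your language
    keep_unredacted: bool = False,
) -> Dict[str, Any]:
    """Build keyword arguments for a transcription job with PII redaction enabled."""
    # "redacted" stores only the redacted transcript; "redacted_and_unredacted"
    # also stores the original transcript alongside it.
    output_mode = "redacted_and_unredacted" if keep_unredacted else "redacted"
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
        "ContentRedaction": {
            "RedactionType": "PII",
            "RedactionOutput": output_mode,
        },
    }


def start_redacted_job(job_name: str, media_uri: str) -> Dict[str, Any]:
    """Submit the job to Amazon Transcribe (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the request builder works without the SDK installed

    client = boto3.client("transcribe")
    return client.start_transcription_job(
        **build_redacted_job_request(job_name, media_uri)
    )


if __name__ == "__main__":
    # Hypothetical job name and bucket, shown for illustration only.
    request = build_redacted_job_request(
        "support-call-0001", "s3://example-bucket/calls/call-0001.wav"
    )
    print(request["ContentRedaction"])
```

Keeping the request builder separate from the network call makes it easy to inspect exactly what would be sent before any audio leaves your hands. Leaving `keep_unredacted` off means only the redacted transcript is stored.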

There are, of course, other tools that can remove PII from existing documents. Often, though, these are data loss prevention tools that aim to keep data from leaking out of the company when you share documents with outsiders. With this new Transcribe tool, at least some of this data will never make it into the transcript in the first place, so it can't be shared (unless, of course, you keep a copy of the audio).

In total, Transcribe currently supports 31 languages. Of those, it can transcribe six in real time for captioning and other use cases.