Protecting privacy with data redaction

Improving user data redaction at Lob

User data redaction involves obfuscating sensitive user data. In Lob’s case, this includes, but isn't limited to, first name, last name, and email address.

At Lob, we have a way to programmatically handle redaction for resources like letters and postcards, but our redaction system needed updating to include user data. We had a manual process that redacted user data by querying the production users table, but to scale, it needed to be automated. In this post, we'll walk through how we implemented our solution.

Diving into the problem

While scoping this project, I reviewed our redaction codebase to determine whether rewriting it to include user data made sense. To avoid disrupting existing functionality, I opted to create an endpoint designed for user data redactions and make it work within the current redaction system.

One of my favorite parts of being an engineer is putting on my detective cap and digging into a problem. Working on the redactions project necessitated the use of a magnifying glass.

The solution I landed on was to store user data redactions in a database table. By making POST requests to a new redactions endpoint, Lob can store information about the user, and a worker runs in the background, checking daily for new users to redact. The worker does the following:

Selects new records with the status 'created' from the redactions table
Assigns the record a unique id
Updates the status of the record from 'created' to 'ongoing'
Calls an AWS step function, which retrieves the necessary information and fields
The AWS step function makes a POST request to the {resource}/redact endpoint, which handles the actual redaction
Updates the status of the record to success or failed
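
The worker steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Lob's implementation: the `RedactionRecord` fields, the `run_worker` function, and the `fake_step_function` stand-in are all hypothetical names, the records live in an in-memory list rather than a redactions table, and the AWS step function is replaced by a plain callable that reports success or failure.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RedactionRecord:
    """Hypothetical shape of a row in the redactions table."""
    resource: str                      # e.g. "users", "letters"
    resource_id: str                   # identifier of the thing to redact
    status: str = "created"
    redaction_id: Optional[str] = None


def run_worker(records: List[RedactionRecord],
               start_redaction: Callable[[RedactionRecord], bool]) -> int:
    """Process every record still in 'created' and return how many were handled."""
    handled = 0
    for record in records:
        if record.status != "created":
            continue                               # only pick up new records
        record.redaction_id = str(uuid.uuid4())    # assign a unique id
        record.status = "ongoing"                  # mark as in progress
        try:
            ok = start_redaction(record)           # stand-in for the step function call
        except Exception:
            ok = False                             # any error counts as a failed redaction
        record.status = "success" if ok else "failed"
        handled += 1
    return handled


def fake_step_function(record: RedactionRecord) -> bool:
    """Stand-in for the step function that POSTs to {resource}/redact.

    Assumption for the demo: only 'users' redactions succeed.
    """
    return record.resource == "users"


if __name__ == "__main__":
    queue = [RedactionRecord("users", "user_123"),
             RedactionRecord("letters", "letter_456")]
    run_worker(queue, fake_step_function)
    for rec in queue:
        print(rec.resource, rec.resource_id, rec.status)
```

Passing the step function in as a callable keeps the status transitions testable without any AWS dependency; in a real deployment that callable would be the piece that talks to AWS.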

In addition to the endpoint, I created two migration scripts. The first script updated constraints in the redactions table to allow users to be a redactable resource. The second script added ‘date_redacted’ and ‘redaction_id’ properties to the users table and updated the model and validators so users could be treated as a redactable resource.
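
As a rough illustration of the second migration, the sketch below adds the two properties to a users table. It is an assumption-heavy stand-in: it uses an in-memory SQLite database so it can run anywhere, whereas the post's stack is Postgres; the column types, the starting schema, and the `migrate` and `column_names` helpers are hypothetical.

```python
import sqlite3

# Hypothetical migration: column types are assumptions, not Lob's schema.
MIGRATION = """
ALTER TABLE users ADD COLUMN date_redacted TEXT;
ALTER TABLE users ADD COLUMN redaction_id TEXT;
"""


def migrate(conn: sqlite3.Connection) -> None:
    """Apply the migration that makes users a redactable resource."""
    conn.executescript(MIGRATION)


def column_names(conn: sqlite3.Connection, table: str) -> list:
    """Return the column names of a table (SQLite-specific PRAGMA)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Assumed minimal users table holding the fields the post names as sensitive.
    conn.execute(
        "CREATE TABLE users (id TEXT PRIMARY KEY, first_name TEXT, "
        "last_name TEXT, email TEXT)"
    )
    migrate(conn)
    print(column_names(conn, "users"))
```

Both new columns are nullable here, so existing rows stay valid and a user counts as unredacted until the worker fills them in.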

After my initial investigation, I proceeded to define a user redaction endpoint. Testing the new endpoint locally worked fine, but an issue surfaced when I deployed it to our staging environment. I discovered I needed to enable the step function to access the new redaction endpoint.

Lob uses a node-wrapper to make HTTP requests. I had to publish a new version of this wrapper to include the new user redaction capabilities. After writing many more tests, I deployed it to staging and everything worked as expected.

One of Lob’s core values is Own the Outcome. To help future Lob developers, I updated our internal documentation with implementation details on user data redaction and how to extend Lob’s redaction system to support future use cases.

Takeaways

This project was a success for several reasons:

A solution was implemented to automate user data redaction
The project contributed to making the redactions process easier
The solution has been used to redact user data for customers
Not only did I deliver an important new feature, but I gained experience with Docker, AWS LocalStack, AWS Lambda, AWS Step Functions, and Postgres.

I don’t think an intern can ask for anything more from an internship. Thanks, Lob.

This blog provides general information and discussion about direct mail marketing and related subjects. The content provided in this blog (“Content”) should not be construed as and is not intended to constitute financial, legal or tax advice. You should seek the advice of professionals prior to acting upon any information contained in the Content. All Content is provided strictly “as is” and we make no warranty or representation of any kind regarding the Content.
