TRUNCATE vs DELETE: Efficiently Clearing Data from a Postgres Table

A common pattern in automated testing is to clear the database state before every test runs. This ensures that the tests remain robust and immune to side-effects of adjacent tests. While a clean database state for every test is beneficial in this sense, the overhead of clearing data can be significant. In one of our largest test suites comprised of thousands of tests interacting with Postgres, we reduced the run time of the suite by over 33% by switching our table-clearing SQL to use DELETE rather than TRUNCATE.

The Overhead

Our team’s productivity depends on the performance of this test suite, since they run very frequently and are an integral part of the deployment pipeline. The nearly 3,200 tests included are responsible for the correctness of one of our most critical codebases. The codebase has constant activity, and changes are continuously committed by authors from multiple teams. The entire suite runs before every commit is merged, as well as after each commit lands in master.

We were surprised when we measured the overhead of clearing table data. Before each test runs, no fewer than 32 tables are wiped clean with a TRUNCATE. The TRUNCATEs accounted for 20-200 milliseconds of overhead for each test. In other words, this added 1 to 5 minutes of overhead for the entire suite when run serially. (In practice, we run groups of tests in parallel, so the perceived total overhead was closer to about 1 minute total.)

When we swapped the TRUNCATEs for DELETEs, the overhead of each test dropped to 1-3 milliseconds. This reduced the combined overhead of all the tests to a total of about 10 seconds. In the graph below, you can see that average test suite run time dropped by over 33%—from over 3 minutes to about 2 minutes.

Test suite average daily run time. The vertical red line represents the day on which the change from TRUNCATE to DELETE was merged.

The Difference Between TRUNCATE and DELETE

We were excited about the substantial boost to our test performance, but curious about why the switch from TRUNCATE to DELETE helped. After all, the result of TRUNCATE table_a; or DELETE FROM table_a; is the same: an empty table_a. So why would one be any faster than the other?

In actuality, TRUNCATE and DELETE in Postgres do different things under the hood. From a database user perspective, the results seem the same: data is deleted; however, internally, there are major differences in the underlying data representing the table. With DELETEs, dead rows remain in database pages and their dead pointers are still present in indices. TRUNCATE, on the other hand, is a bit more thorough in wiping the slate clean. The resulting table looks almost identical internally to a newly CREATEed table. There are no dead rows, the indices are empty, and the table statistics are reset. There are quite a few other differences are well, all highlighted by a terrifically detailed post by Craig Ringer.

Also, TRUNCATE is a constant time operation while DELETE complexity depends on the number of rows in the table. TRUNCATE is a fast operation to quickly clear out huge tables, but DELETE is much faster when a table contains just a handful of rows. This is why our tests saw such a significant speed improvement by the change: each test only inserts a handful of rows.

Conclusion

With the large size of this test suite and the heavy traffic from our engineers, a reduction in the run time of the tests has a big impact on developer happiness and productivity. This tweak is perhaps most satisfying because of the combination of significant test speedup and it’s relative simplicity. The investigation also offers interesting insights into Postgres internals and a take-away to use DELETE on very small tables and TRUNCATE on everything else.

If you’re interested in working with Postgres to Make the World Programmable, you’re in luck! We’re hiring!