Dump a sample of data from PostgreSQL

Posted on June 5, 2016 by Dan Keder

Some time ago I needed to get a small sample of data for development and testing from a huge (production) PostgreSQL database – something like “select these 100 users and their data from other tables and dump it into a file”.

The standard pg_dump tool is not good for this. It can only dump the whole database or just some of the tables, but not a subset of rows from particular tables. After looking around the net for a while I didn’t find an easy way to do it.

So as part of my recent efforts to learn Go I created a small tool called pg_dump_sample that does that. You tell it which tables it should dump and how and it creates a dump file containing the dataset. Because the manifest file which specifies what tables should be dumped can be reused it’s a matter of running the tool again to get a fresh dump. The resulting dump file can be loaded into an empty database by psql or a similar tool. I found this approach to be a very convenient way to keep the dev and testing environments up-to-date.

And the last thing - you can find pg_dump_sample on Github, along with a bit of documentation and examples of use.