If you’re thinking about using a key value store for your technology stack, you shouldn’t just consider all the new kids on the block.
Postgres has a some very cool tricks up its sleeve. It has its own integrated key value data type.
And today we’ll go through how to write your own aggregations for postgres hstore.
The use case
At adeven our technology stack is heavily based on postgres. Our latest product adjust.io tracks
downloads, clicks, impressions and many more. To see the distribution of the parameters we track, we need to implement our own aggregations.
Let’s say you have a simple test table which stores creation time along with country:
CREATETABLEtest(countrycharactervarying(12),created_attimestampwithout time zone);INSERTINTOtest(created_at,country)SELECTtime,(Array['de','en','fr','us','it'])[(random()*4+1)::int]FROMgenerate_series('2012-10-20 00:00'::timestamp,'2012-10-20 23:59','1 second')time;CREATEEXTENSIONIFNOTEXISTShstore;
If you want to count the number of times different countries are represented in the dataset, you can do it using the postgres hstore function.
Let’s say you want to store pre-aggregated results in some other table.
CREATETABLEtest_by_hour(countrieshstore,created_attimestampwithout time zone);INSERTINTOtest_by_hour(countries,created_at)SELECTarray_count(array_agg(country)),date_trunc('hour',created_at)AShourFROMtestGROUPBYdate_trunc('hour',created_at);
To get the distribution of the whole table you now need to aggregate (sum) hstore documents.
Luckily you can define your own aggregations in postgres; to do so, you need a function that knows how to add two hstores.