In the last four posts of our series on writing Postgres Extensions, we got the basics covered types and operators, introduced a debugger and completed the test suite.
Now let’s add another type and see how we can organize the code base when it grows.
You can find the last post’s code base on the github branch part_iv Today’s changes can be followed on branch part_v
Versioning
We might be happy with our Extension and use it in production for a while without any issues. Now that our business succeed, the range for integer
might no longer be enough. That means we’ll need another bigint
based type bigbase36
, which can have up to 13 characters.
The problem here is that we can’t simply drop the extension and re-install the new version.
1 2 3 4 |
|
If we DROP ... CASCADE
here, all our data would be lost. Also, dumping and recreating is not an option for a terabyte-sized database. What we want is to ALTER EXTENSION UPDATE TO '0.0.2'
. Luckily, Postgres has Versioning for Extensions built in. Remember in the base36.control
file we defined:
1 2 3 4 |
|
Version ‘0.0.1’ is the default Version used when we execute CREATE EXTENSION base36
, leading to the import of the base36--0.0.1.sql
script file. Let’s create another one:
1
|
|
And default to this one:
1 2 3 4 |
|
And see if it builds:
1
|
|
Getting
1 2 3 4 |
|
Hmmm, it wants to use extension/base36--0.0.2.sql
but can’t find it.
Let’s fix the Makefile and tell Postgres to use all files following the pattern *--*.sql
.
1 2 |
|
In base36--0.0.2.sql
we can now add the bigbase36
type
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 |
|
As you can see, this is mostly a find and replace for base36
to bigbase36
and int4
to int8
.
Lets add the C-Part.
Organizing C-Code
To have the C-Code better organized we’ll put base36.c
under the src
dircetory.
1 2 |
|
Now we can add another file for the bigbase36
input and output function in src
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 |
|
It’s more or less the same code as for base36. In bigbase36_in
, we don’t need the overflow safe typecast to int32
anymore and can return the result directly with PG_RETURN_INT64(result);
. For bigbase36_out
, we expand the buffer to 14 characters as the result could be that long.
To be able to compile the two files into one shared-library object we need to adapt the Makefile as well.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
|
Here (Line 13) we define that all src/*.c files will become object files that should be build int one shared library from these multiple objects (Line 15).
Thus, we have again generalized the Makefile
for future use.
If we now build and test the extension then all should be fine.
However, we should also add tests for the bigbase36
type.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
|
If we take a look at results/bigbase36_io.out
we see again some odd behavior for too-big values.
1 2 3 4 5 6 7 |
|
You’ll notice strtol()
returns LONG_MAX
if the result overflows. If you take a look how converting text to numbers is done in the postgres source code, you can see that there are lots of platform-specific edge and corner cases. For simplicity, let’s assume that we are on a 64 bit environment having 64 bit long results. On 32 bit machines our test suite and thus make installcheck
would fail, telling our users that the Extension would not work as expected.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
|
Here, by including <limits.h>
we can check if the result overflowed. The same can be applied for base36_in
checking result < INT_MIN || result > INT_MAX
and thus getting ride of the DirectFunctionCall1(int84,result)
. The only caveat here is that we can’t cast LONG_MAX
and LONG_MIN
to base36.
Now that we’ve created a bunch of code duplication, let’s improve the readability with a common header file and define the errors in macros.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
|
Also, there is no good reason why we should disallow negative values.
Migrations
Finally our new Version is ready to be released! Let’s add an update test.
1 2 3 4 5 |
|
After we run:
1
|
|
We see:
1 2 3 4 5 6 7 |
|
Although Version 0.0.2 exists we can’t run the Update command. To make that work we’d need an updated script in the form extension--oldversion--newversion.sql
that includes all commands needed to upgrade from one version to the other.
So we need to copy all base36 realted sql into base36--0.0.1--0.0.2.sql
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
|
MODULE_PATHNAME
For each SQL function that uses a C-Function defined AS '$libdir/base36'
, we are telling Postgres which shared library to use. If we renamed the shared library we’d need to rewrite all the SQL functions. We can do better:
1 2 3 4 5 |
|
Here we define the module_pathname
to point to '$libdir/base36'
and thus we can define our SQL Functions like this
1 2 3 4 |
|
Summary
In the last five articles you saw that you can define your own datatypes and completely specify the behavior you want. However, with great power comes great responsibility. Not only can you confuse users with unexpected results, you can also completely break the server and loose data. Luckily you learned how to debug things and how to write proper tests.
Before you start implementing things, you should first take a look on how Postgres does it and try to reuse as much functionality as you can. So not only do you avoid reinventing the wheel, but you also have trusted code from the well-tested PostgreSQL code base. When you’re done, make sure to always think about the edge cases, write down everything into tests to prevent breaking things, and to try out higher workloads and complex statements to avoid finding bugs in production later.
As testing is so important, we at adjust wrote our own testing tool called pg_spec
. We’ll cover this in out next post.