Roleman Part 1: Why We Created Roleman

Recently, I wrote a PL/PGSQL extension which provides some basic functions for creating and altering roles, and managing permissions. The extension was built to improve our tooling for PostgreSQL user creation internally. Since this has large number of external applications, it has been released to the public under the PostgreSQL license.

This blog post is the first in a series on this extension. In it I cover the difficulties which come with creating tooling around utility statements in PostgreSQL as a whole, why centralising this in user defined functions is a good idea, and what kinds of problems we are trying to solve.

In the next article in this series, we will cover the major implementation details. In that post, we will discuss how we prevent SQL injection from occurring within user-defined functions, both in terms of language injection and object injection. We will also areas of development in this area which have, for now, not been included in the extension and the security problems they pose. In the final article in the series, we will discuss the unique testing needs of such a security critical piece of infrastructure, the tooling available, and the difficulties we ran into in trying to ensure that the tests run consistently on various versions of PostgreSQL.

Word of Warning

This blog post includes sample ways of doing things which are wrong but are included in order to communicate problems that happen. Please resist the urge to copy and paste, and instead make sure you understand what you are doing. Things like placeholders may be handled in different ways depending on different database drivers, for example.

Role Management and DDL in PostgreSQL

PostgreSQL has supported the standard database role-based permissions model since PostgreSQL 8.1. In this model we think about granting access to roles, and also granting one role to another. Depending on how roles are defined, they may pass on permissions to child roles automatically or not. Managing the permissions given to various roles is an important part of securing a PostgreSQL database.

All permissions are managed by a part of the SQL language known as DDL or “Data Definition Language.” A typical set of role management statements might look like:

Role Creation DDL
1
2
3
4
5
6
CREATE ROLE read_only WITH INHERIT NOLOGIN;
GRANT CONNECT ON DATABASE mydatabase TO read_only;
GRANT SELECT ON ALL TABLES IN SCHEMA PUBLIC TO read_only;
CREATE ROLE joe_analyst WITH NOINHERIT LOGIN PASSWORD 'somethingsecret';
GRANT read_only TO joe_analyist;
GRANT SELECT, INSERT, UPDATE ON analytics.metrics TO joe_analyst;

Now, on the surface these look like they pose no problems for automated tooling, but how do we ensure that a schema, table, or role name is handled in a correct way and is not a vector for SQL injection or other bad things?

A naive approach might be to try to use placeholders where you want to supply input but this doesn’t work for a couple of reasons. The first is that placeholders are intended for literal values only and typically we want to interpolate identifiers and SQL key words.

If the database driver sends the data separately from the query then the parse tree will be invalid. If the client interpolates client-side, the escaping will be incorrect because of the complexities in rules for escaping portions of the query. SQL identifiers have an escaping syntax that is related to but different from the escaping of string literals, and you cannot escape keywords.

This Doesn’t Work:
1
2
3
4
5
CREATE ROLE ? WITH ? ?;

-- this does not work either:

GRANT ? to ?;

But, you may say, there is one case where you have a string literal (the password). Unfortunately that does not always work but for a different reason: in PostgreSQL, to have a parameterised query you have to put it through the whole planning pipeline. Utility statements, unfortunately have no plan attached.

Whether this works or not depends on your driver
1
CREATE ROLE joe_analyst WITH INHERIT LOGIN PASSWORD ?;

Now if this is interpolated on the client, then things are handled properly, but if interpreted on the server, you will get a syntax error.

Now, trying to parameterise these things is something that comes up periodically, in various forums. People often do want to create roles from the application. Sometimes this is because of a desire to create database roles for application uses to let the database enforce security, and sometimes (as here) it is to try to improve tooling for setting up the database users across a series of servers.

SQL Injection via Tooling

The only way you can run DDL statements in PostgreSQL is via string concatenation. This opens up the issue of SQL injection in the tooling used to create and manage roles. This is particularly true if you have an automatic job that looks for new tables and ensures ownership is correct. For example if a malicious user created a table or function with a problematic name, it might be possible to inject sql into the script. For example:

Attacking the tools
1
2
3
CREATE TABLE "users SET schema public; alter role chris with superuser; --"(
    LIKE users
);

In a case like this, a naive script might run and try to assign ownership to Postgres by using simple string interpolation:

Interpolate the table name, and lo and behold I am superuser!
1
2
3
ALTER TABLE users SET schema public; alter role chris with superuser; -- owner to postgres;

-- this comes from something like "alter table $tablename owner to postgres"

If I then drop the table after becoming superuser, maybe nobody ever notices…

The same trick can be done even if naive escaping of the identifier is done. In other words, it is no different, really, than any other sort of SQL injection except that most of the tools we have to combat the problem are not of any use.

PostgreSQL provides functions for escaping identifiers, but these cannot be safely used inside client-side string interpolation without extra round-trips to the server and they don’t apply to SQL keywords.

The Goal

The goal, simply, is to be able to use our ordinary SQL injection toolkits for role management, and to ensure that the complexity of the queries we are issuing for role creation are kept to an absolute minimum.

In other words, instead of our previous example, it would be better to do this:

This is much easier to generate safely
1
2
3
4
5
6
SELECT roleman.create_role('read_only', array['inherit', 'nologin']);
SELECT roleman.grant_database('read_only', current_database(), array['connect']);
SELECT roleman.create_role('joe_analyst');
SELECT roleman.set_password('joe_analyst', 'somethingsecret', 'infinity');
SELECT roleman.grant_role('joe_analyst', 'read_only');
SELECT roleman.grant_table('joe_analyst', 'analytics.metrics', ARRAY['INSERT', 'SELECT', 'UPDATE']);

Suddenly all the queries are parameterised, and we can use all our normal anti-sql-injection tools. Additionally, rather than dealing with a different syntax for each type of statement, we have a consistent semantic structure, making tool creation much easier.

Finally in doing this we can place the responsibility for safe operation to the functions we call and only require that we pass in something valid. This helps guard against changes in our tools introducing accidental vulnerabilities.

The PostgreSQL server-side anti-SQL-Injection tools

On the server-side, PostgreSQL provides a very rich set of tools for preventing SQL injection. These include parameterised queries where possible (including with dynamic SQL in PL/PGSQL), solid escaping functions for both literals and identifiers, and some handy data-types which eliminate SQL injection in some cases. By using these methods effectively as applicable, we can reduce our original problem to one which is widely supported (where we pass in string literals into parameterised queries, even for role management functions).

As we have already noted, parameterised queries, server-side don’t apply to many of our cases, but there are two cases where they are very Helpful, namely in error handling and passing control between functions in the extension.

More frequently we will use escaping functions like quote_literal() and quote_ident() as well as types which represent strings bound to catalog entries, such as regclass and regprocedure. These types not only ensure proper escaping, but when strings are passed into the user-defined functions, they also ensure that the corresponding database object exists and is found (numbers cast to OID’s however are not so checked).

None of these tools address keywords, however. For keywords, we use a whitelist system. This means we have to add new keywords to the white lists as they are supported by PostgreSQL but it also means we are protected against SQL injection issues through this vector.

Initially supported use cases

In addition to our immediate use case, there were a few other use cases I knew of that I worked on supporting to ensure that the extension could be of use beyond Adjust. These all fit in with the idea that roleman should be safe toolkit for managing roles, and that it should do this job well.

Among others, supported use cases include:

  • Create simple scripts for creating SQL commands for role management to be run via psql. This requires restricting identifiers to the subset that does not require escaping.

  • Safely manage roles via application code. Applications that want to delegate security to the database have a safe framework for doing this.

  • A safe framework for database users to alter their own passwords with appropriate administrator-supplied security policies. Note that we are not doing this at Adjust but I know of others who are.

In the first case, you have the added complication that you cannot rely on the server-side protections against SQL injection to prevent SQL injection (though you can rely on client-side escaping of string literals). This is not our responsibility and roleman already makes the situation better by opening up more tools to be used to assure secure operation in this case.

The third case offers a particular problem, that any function which calls roleman.set_password() must do so in a security definer context. This means further that we must test against database object injection, not merely SQL language injection. In general the security guarantees that supporting scenario require are worth the extra effort in supporting them. Besides, someone will probably use the module in this way at some point so it is better if they don’t run into trouble over it.

SQL object injection is a form of SQL injection that we will discuss in more detail in the next post, as well as how to prevent it. As of Roleman 0.2.1 we specifically test against a wide range of SQL injection techniques including injecting objects that shadow expected objects. Version 0.2.1 is believed to be safe when properly used.

Stay Tuned

I hope you have enjoyed this article.

Next we will discuss the implementation of this extension. We will also discuss the various techniques used to tighten and ensure security against a wide range of attacks.

Comments