Managing PostgreSQL

with Puppet

Chris Everest

Resources

www.covermymeds.com/main/careers
scriptscribe.org
github.com/covermymeds

CoverMyMeds, Inc

  • Service Oriented Architecture (SOA)
  • Migration from SQL Server to PostgreSQL
  • Multiple Application Environments

Environments are elastic and we need total automation.

Production

App Instances50
Prod Databases8

Integration (++)

Environments 3
App Instances(3x50)150
Databases(3x8)24

Testing, Development... (∞)

Testing Apps(8x50)400
Vagrant Apps(50x50)2500
Databases(8x8) + (8x50)<eek>
You get a database

Puppet Labs

https://forge.puppetlabs.com/puppetlabs/postgresql

Create a Server

class { 'postgresql::server': }
            

  • Installs official package repos
  • specifically for your distro
  • Installs postgresql packages
  • Initializes the pgdata directory
  • Starts up the the database server
Create a Database + User

postgresql::server::db { 'covermymeds':
  user     => 'cmm_user',
  password => postgresql_password('cmm_user', 'supersecret  '),
}
            

  • Creates a database
  • Creates a role, if it doesn't exist
  • Creates permissions
  • GRANT ALL ON DATABASE "patients"
Create Database Roles

# Create the Role
postgresql::server::role { 'app_user': }
  password_hash => postgresql_password('app_user', 'secret'),
}

# Apply GRANTS
postgresql::server::database_grant {'app_user':
  privilege => 'ALL',
  db        => 'covermymeds',
  role      => 'app_user',
}
            

  • Creates a new role: app_user
  • Creates permissions on database
  • GRANT ALL ON DATABASE 'covermymeds' TO 'patients'
Now Scale ⇪
  • Add 50 more app roles
  • But... not all roles get 'GRANT ALL'
  • And, some roles only get 'SELECT'

(...sigh)

Dig in!

Puppet makes this all super easy, we just have to look deeper.

- Separate the config steps into workable, repeatable pieces.
- Server Config, Roles, Databases and Permissions
- Hieradata for structured yaml config
- Extreme patience

The secret sauce is Puppet Lab's custom resource type.

The Psql Resource Type


postgresql_psql { "unique_name":
  command    => 'ANY COMMAND THAT PSQL SUPPORTS',
  db         => 'DATABASE TO ACT ON',
  unless     => 'SQL COMMAND TO CHECK IF ACTION WAS ALREADY TAKEN',
  require    => Class['postgresql::server']
}
          

Hieradata

Wait... hiera what?

  • yaml based configuration files
  • hierarchical mapping of serialized data

Hieradata

We set up two data structures for hiera

  1. DB Server to Application Mapping
  2. Application to DB Role Mapping

List of Apps

Define this data structure at the host level, so the app list can be moved around to different database clusters. Define this on our master only.

# DB Cluster to Application Mapping
cmm_pgsql::dblist::cluster1:
  mydatabase:
    apps:
      - myapp1
      - myapp2
      - myapp3

# Equivilant to a hash of arrays
$myhash = hiera('cmm_pgsql::dblist::cluster1')
# yields => {
#  'mydatabase' => {
#    'apps'     => [
#      'myapp1',
#      'myapp2',
#      'myapp3',
#    ],
#  },
#}
          

List of Roles

Define this data structure at the environment or common level. Any other puppet module can benefit from access to database credentials. Use this data structure for both PostgreSQL builds and to provide database configs to applications.

# Application to Role Mapping
app::dbconfig::myapp1:
  mydatabase:
    write_handle:
      default_handle: true
      role: default_write
      host: vip-cluster1
      database: mydatabase
      adapter: postgres
      username: myapp1_user
      password: password
# Equivalent to a hash of arrays
$myhash = hiera('cmm_pgsql::dbconfig::myapp1')
# yields => {
#  'mydatabase'     => {
#    'write_handle' => {
#      'default_handle' => true,
#      'role'           => 'default_write',
#      'host'           => 'vip-cluster1',
#      'database'       => 'mydatabase',
#      'adapter'        => 'postgres',
#      'username'       => 'myapp1_user',
#      'password'       => 'password',
#    },
#  },
#}
          

Introducing:

The meat and potatoes of our database infrastructure

https://github.com/covermymeds/cmm_pgsql
manifests/init.pp

The main class determines master vs. slave and failover status. Masters do everything, slaves do almost nothing (they're read only). Hosts in failover do absolutely nothing.

Statuses are determined by lib/facter/pg_replication.rb

manifests/setup.pp

Base setup will install requirements like SSL keys for client connections, other supporting packages (pg_top, pg_repack, etc), Postgres configuration and Shinken (nagios) Monitors. Masters and slaves alike get base setup configuration.

Hieradata

Again we'll use the simplicity of hieradata configs to define postgresql.conf. This allows us to set a common server config and also set per-server configs (custom tuning FTW).
# Define any valid pg config value
cmm_pgsql::config:
  'hot_standby_feedback':
    value: 'on'
  'ssl':
    value: 'on'
  'hot_standby':
    value: 'on'
  'max_connections':
    value: '100'
  'shared_buffers':
    value: '1024MB'
  'effective_cache_size':
    value: '3072MB'
  'checkpoint_segments':
    value: '32'
  'checkpoint_completion_target':
    value: '0.9'
  'default_statistics_target':
    value: '100'
  'work_mem':
    value: '16MB'
          
manifests/master.pp

This class performs setup of master related tasks.

  • Create administrative users (not postgres) for taking backups and handling deployments
  • Create default permissions on the public schema; REVOKE CREATE by default
  • Determines whether any application roles and databases need to be created

At this point, creation of application credentials is started with a lookup of cmm_pgsql::dblist::<cluster>

manifests/appdb.pp

From manifests/master.pp calls to appdb { 'db1': ['app1', 'app2', 'app3'] } will generate a single database creation. After the database is created, it then makes calls to manifests/dbhandle.pp for each app using that database.

manifests/dbhandle.pp

The dbhandle type is essentially a hackzor (see dirty little secrets) to link the database creation with the application credentials. This takes the name parameter and allows unique creation of the db role credentials by splitting on a pre-concatenated resource name.

manifests/appuser.pp

Finally, the PostgreSQL roles are created and permissions are satisfied with SQL Grant templates. This is very powerful, yet simple to implement.

# Create a unique resource name
$role_grants = "role:${role} ${username}@${database}"

# Only define the role if it doesn't exist
unless defined(Postgresql_psql[$role_grants]) {

  postgresql_psql { $role_grants:
    command    => template("cmm_pgsql/grants/${role}.sql.erb"),
    db         => $database,
    psql_user  => $::postgresql::server::user,
    psql_group => $::postgresql::server::group,
    psql_path  => $::postgresql::server::psql_path,
    unless     => "SELECT nspname, defaclobjtype, defaclacl
                 FROM pg_default_acl a
                 JOIN pg_namespace b ON a.defaclnamespace=b.oid
                 WHERE defaclobjtype = 'r'
                 AND aclcontains(defaclacl, '\"${username}\"=r/postgres')",
    require    => Postgresql::Server::Role[$username],
  }

}
          
SQL Grant Templates

Leverage the postgresql_psql resource once again to apply grants. Passing in a template, the list of Grants will execute against our database.

# Template of grants, with username substituted
GRANT select, insert, update, delete ON ALL TABLES IN SCHEMA public
  TO "<%= @username %>";

GRANT execute ON ALL functions IN SCHEMA public
  TO "<%= @username %>";

GRANT USAGE ON ALL sequences IN SCHEMA public
  TO "<%= @username %>";

ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT select, insert, update, delete
  ON TABLES TO "<%= @username %>";

ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT execute ON functions to "<%= @username %>";

ALTER DEFAULT PRIVILEGES IN SCHEMA public
  GRANT usage ON sequences to "<%= @username %>";
        
Verify Read Permissions

Remember the 'unless' parameter? This allows a simplified method of permission verification

# Determine if role permissions have been applied
SELECT nspname, defaclobjtype, defaclacl
  FROM pg_default_acl a
  JOIN pg_namespace b ON a.defaclnamespace=b.oid
  WHERE defaclobjtype = 'r'
  AND aclcontains(defaclacl, '"username"=r/postgres')

## Output
 nspname | defaclobjtype |                   defaclacl
---------+---------------+---------------------------------------------
 public  | r             | {username=r/postgres,username=arwd/postgres}
(1 row)
          

Data Flow

Dirty Little Secrets

  • A custom prefix_keys() function allows us to uniquify data structures. Credit: competa.com
  • Some Puppet versions can't 'loop' yet. Build small, nested defined types and feed the data required to them.
  • We wanted to use the 'unless' parameter to fully audit all grants and permissions.
  • Our module is custom. We encourage pull requests.

There's More?

  • Stand up your slaves via scripts
  • Manage backups
  • SSL keys for client connetions
  • Coming Soon... "DB Schema Deployer"