Indy Big Data meetup - December 2015

Created: by Pradeep Gowda Updated: May 29, 2020

Talk by Josh Berkus, core member of PostgreSQL.

Big data features

  • sophisticated sql parser/planner
  • works with most DW tools
  • many large DB features (partitioning etc.,)
  • Foreign Data Wrappers (FDW)
  • stability and reliability

Data sizing:

  • comfort zone: 1G to 10TB
  • possible: 10TB to 100TB
  • With extensions: ??

9.5 status:

  • beta 2 released Nov 12, 2015.

9.5 big data features:

  • UPSERT
  • BRIN indexes
  • GROUPING SETS/CUBE
  • Foreing Schema
  • Faster sorts
  • Misc

UPSERT: INSERT ... ON CONFLICT DO { UPDATE | IGNORE }, similar to MERGE only better. Merge syntax was complex and did not support some of the use cases (single row insert or update). Merge is available in other DBs (Oracle?)

Why you want Upsert:

  • works for large sets of rows
  • supports complex logic
  • supports cascading
  • rewriting procedural ETL logic as single SQL statements
  • faster (sometimes)

How it works – It probes the index (which means BTree index), (caveat?), locks the leaf page. If we try concurrent upserts they will fail because of lock, but will not overwrite data (??).

BRIN – Block-Range INdex:

  • new index type
  • also known as “skiplist” index
  • good for naturally ordered data.
  • WHY? - very small indexes for very large data. (BTree:2142MB::BRIN:(64kb for 512, 192kb for 64))
  • Table is divided into blocks. for each block store min and max value.
  • lot faster to load index. (second quandrant tested this using TPC-H Lineitem data; 95:BTree::18:BRIN)

GROUPING, SETS, ROLLUP and CUBE:

ROLLUP – get subtotals and totals with detail in one query.

SELECT country, city, count(*)
FROM lwn_subscribers
GROUP by rollup (country, city);

CUBE – explosiion of all summaries for use with OLAP tools (e.g: Mondrian, Pentaho etc.,). Earlier we had to do this externally (say Java). (CUBE output is not meant to be human-readable).

GROUPING SETS - superset/programmable version of both ROLLUP and CUBE.

SELECT ...
FROM ...
GROUP by grouping sets ((city, level), (level), ());

Abbreviated key sorting

  • sort varlena data by key prefix
  • much faster index build
  • 2x to 20x faster
  • works with TEXT, VARCHAR, ..
  • no extra work.. just upgrade to 9.5.

Foreign Schema

  • Foreign Data wrapper
  • let pg access external data like it was a local table.
  • example: redis (what are all my keys)
CREATE FOREIGN TABLE myredishash
(key text, val text[])
SERVER redis_server
...

INSERT INTO myredishash (key, val)
VALUES ('mytable:r1, '{prop1, val1..}');

FDW to query other PostgreSQL instances,Hadoop, cassandra, kafka, spreadsheet, twitter(REST), multicorn (Python).

Import Foreign Schema (SQL DBs):

IMPORT FOREIGN SCHEMA public
EXCEPT (reports, audit)
FROM server acctng_server into acctng;

currently on “postgresql-like” databases.

Foreign Table inheritence:

  • Data federation using postgresql paritioning
CREATE FOREIGN TABLE users_shard_1 ()
INHERITS (users) server shard_1;

CREATE FOREIGN TABLE users_shard_2 ()
INHERITS (users) server shard_2;

citusdb uses FTI to store in columnstore (cstore) tables.

More features:

SET LOGGED/UNLOGGED:

  • Use unlogged tables for ELT.
  • Google summer of code project
  • You can change logged/unlogged post-facto.
  • ALTER TABLE july_data SET LOGGED

GiST index-only Scan:

Index-only scan for PostGIS, Exclusion

WAL compression:

Reduce PostgreSQL log writes during data loading; less writing => faster load.

TABLESAMPLE:

Get a “quick look” at data in a big table.

SELECT * FROM user_profiles
TABLESAMPLE BERNOULLI (0.001);
  • Sampling algorithsm available now – system, bernoulli.

Beyond 9.5:

Parallel seq scan; finally, parallel query for postgresql.

set max_parallel_degree = 4 ;
select * from pgbench_accounts
where filler like '%a%';

cstore & pg_shard

  • extensions from citusdata
  • cstore: column store for PostgreSQL
  • pg_shard: automated DW sharding – automated sharding of tables to remote shards. Aimed at big data use case.
  • idea behind this: retrofitting to existing database instance that you never thought will get that big.

Big data forks go OSS

  • PipelineDB: August 2015. is a streaming database. “Spark, but relational”
  • Greenplum: November 2015. Still has the most mature “MPP”, since it is the most battle-tested.
  • CitusDB: January 2016. Opensource as an extension. can be retrofitted to existing pg instl.

Questions:

  • “Recursive queries? (for graph databases)” - the perfect GraphDB is impossible. it can be implmented for one kind of graph. PostgreSQL does well on large forest of small bushes. Idea: “use PG for storage. Write a query planner for graph querying”.

Links:

Additional reference: