Wednesday, February 25, 2015

Stopping expensive queries before they start

Today we're releasing a code for a small PostgreSQL module called plan_filter that lets you stop queries from even starting if they meet certain criteria. Currently the module implements one such criterion: the estimated cost of the query.

After you have built and installed it, you add a couple of settings to the postgresql.conf file, like this:
shared_preload_libraries = 'plan_filter'
plan_filter.statement_cost_limit = 100000.0
Then if the planner estimates the cost as higher than the statement_cost_limit it will raise an error rather than allowing the query to run.

This module follows an idea from a discussion on the postgresql-hackers mailing list some time ago. It was developed by PostgreSQL Experts Inc for our client Twitch.TV, who have generously allowed us to make it publicly available.

Raspberry Pi 2 coming to the buildfarm

Yesterday I ordered a Raspberry Pi 2 Model B, and it should be delivered in a few days. I'm intending to set it up as a buildfarm member. The fact that you can purchase a general purpose computer the size of a credit card with  a quad-core processor and 1Gb of memory (I remember when RAM was counted in kilobytes) and all for USD35.00 is amazing, even when you remember Moore's Law.

Saturday, February 14, 2015

Statistics and ordering operations on JSON fields

The JSON type is not provided with any comparison operators built in, unlike the new JSONB type. One reason for this is that it's not really clear how to do the comparison. In particular, comparing the text values seems wrong, because the white space should not matter, and arguably the order of object keys should not matter either. JSONB doesn't have these problems because it dissolves all the white space and stores object keys in a canonical order, so we have come up with sane if slightly counter-intuitive comparison operations.

This limitation on JSON is somewhat irksome, however. It restricts you from doing some operations on JSON such as DISTINCT, GROUP BY, and ORDER BY.

Another issue is that it causes ANALYZE not to create any rows at all in pg_statistic for JSON columns, so even if all you want to know is the average column width, you can't find it. That makes doing things like measuring table bloat just about impossible.

If you have PLV8 available, you can create operators that work fairly sanely on JSON and that let you generate stats, use DISTINCT etc. The steps are outlined here.

But what if you don't want to load PLV8 just for this? Or what of you can't, like say on a managed service that doesn't provide it? All is not lost.  Here is a version which uses text comparison instead of a PLV8 function.  That means you don't need to have PLV8 loaded. Since it uses text comparison, it is subject to the caveats mentioned about about white space and object keys. But it will work, and you will see rows for the column in pg_statistic. Just be careful using ordering operations or creating indexes, as the results, while consistent, might be surprising.

Monday, February 9, 2015

Moving PLV8 to ECMAScript6

Taras Mitran has just published a blog post about using some of the advanced features of ECMAScript version 6 to make programming with SQL in JavaScript nicer.

He notes that Template Strings allow for multiline strings, which is in itself a significant advance, and that you can also have Tagged Template Strings which can transform the Template String automagically in many interesting ways. His example turns a Template String with embedded variables into a preparable query with a list of parameters. It's very nifty, quite clever in fact.

The other feature that I would really like is proper lexically scoped variables. Perl got these donkeys years ago, and their absence this long in JavaScript has been reprehensible. They are provided for in ES 6.

My understanding is that to get these features we need to provide for PLV8 to build against the latest version of V8 (and possibly provide some initialization flags too.) Unfortunately, the V8 API seems to have changed significantly since the 3.14.5.10 that's available on my Fedora 20 workstation, so enabling it to build with, say, V8 version 4.1.0.14, which is what io.js is using, will take quite a bit of work, probably by someone whose C++-fu is greater than mine.

Tuesday, February 3, 2015

New release of PLV8

I have released a new version of PLV8, which now builds on PostgreSQL 9.4, as well as containing a number of bug fixes.

It can be downloaded at http://pgxn.org/dist/plv8

Enjoy