Hello my dear friends. Today we will talk about PostgreSQL and most useful modules for this database.
PostgreSQL is an object-relational database management system (ORDBMS). The main features of PostgreSQL are:
- database support of virtually unlimited size;
- powerful and reliable mechanisms for transactional replication;
- built-in extensible programming languages;
- easy expandability by extensions.
The last feature allows to extend the capabilities of PostgreSQL. Let’s look at some of these extensions.
PostGIS adds support of geographic objects to the PostgreSQL object-relational database. In fact, PostGIS “spatially enables” the PostgreSQL server, allowing it to be used as a backend spatial database for geographic information systems (GIS), much like ESRI’s SDE or Oracle’s Spatial extension. PostGIS follows the OpenGIS “Simple Features Specification for SQL” and has been certified as compliant with the “Types and Functions” profile.
This extension is very useful if you want do location-based queries in database, such as “find the closest N items to this location.”.
PostPic is an extension that enables image processing inside the database, like PostGIS does for spatial data. It adds the new ‘image’ type to the SQL, and several functions to process images (crop, resize, rotate, etc) and to extract their attributes (width, height, date, etc.).
PL/Proxy is database partitioning system implemented as PL language. Main idea is that proxy function will be created with same signature as remote function to be called, so only destination info needs to be specified inside proxy function body.
Texcaller is a convenient interface to the TeX command line tools that handles all kinds of errors without much fuzz. It is written in plain C, is fairly portable, and has no external dependencies besides TeX.
PgMemcache is a set of PostgreSQL user-defined functions that provide an interface to memcached. Installing pgmemcache is easy, but does have a few trivial requirements – libmemcached 0.38 or newer and PostgreSQL 8.4 or newer. With this extension the PostgreSQL can write, read, search, and delete data from memcached.
The prefix project implements text prefix matches operator (prefix @> text) and provide a GiST opclass for indexing support of prefix searches. Typical query like the following is now able to benefit from index lookups, as soon as the prefix column is of type prefix_range: SELECT * FROM prefixes WHERE prefix @> ’0123456789′ ORDER BY length(prefix::text) LIMIT 1
pgSphere contains methods for working with spherical coordinates and objects. It also supports indexing of spherical objects. Used to work with geographic (can be used instead of the PostGIS) or astronomical data types.
As of version 9.1, PostgreSQL can link to other systems to retrieve data via Foreign Data Wrappers (FDWs). These can take the form of any data source, such as a file system, another RDBMS, or a web service. That means that the regular database queries can use these data sources like regular tables, and even join multiple data sources together. Many extensions for FDW is available:
- oracle_fdw – connects to Oracle databases.
- mysql_fdw – connects to MySQL databases.
- tds_fdw – connects to Sybase and Microsoft SQL Server databases.
- odbc_fdw – connects to any ODBC source.
- couchdb_fdw – connects to CouchDB databases.
- redis_fdw – connects to Redis databases.
- twitter_fdw – fetches messages from Twitter.
- ldap_fdw – queries LDAP servers.
- file_fdw – access data files in the server’s file system and query them as tables.
- file_fixed_length_record_fdw – reads flat files with columns of fixed width.
- PGStrom – uses GPU devices to accelerate sequential scan on massive amount of records with complex qualifiers.
- s3_fdw – reads files located in Amazon S3.
- www_fdw – accesses web services as a data source.
Multicorn is an extension that enables Foreign Data Wrappers to be written in Python. There are a number of foreign data wrappers it provides as standard:
- multicorn.sqlalchemyfdw – used to access data stored in any database supported by the sqlalchemy python toolkit, such as MySQL, SQLite, Oracle, Microsoft SQL Server and many others.
- multicorn.fsfdw – used to access data stored in various files, in a filesystem.
- multicorn.csvfdw – used to access data stored in CSV files.
- multicorn.rssfdw – used to access items from an RSS feed.
This module implements a data type hstore for storing sets of (key,value) pairs within a single PostgreSQL data field. This can be useful in various scenarios, such as rows with many attributes that are rarely examined, or semi-structured data. hstore has GiST and GIN index support for the @>, ?, ?& and ?| operators. hstore also supports btree or hash indexes for the = operator. This allows hstore columns to be declared UNIQUE, or to be used in GROUP BY, ORDER BY or DISTINCT expressions. The sort ordering for hstore values is not particularly useful, but these indexes may be useful for equivalence lookups.
The intarray module provides a number of useful functions and operators for manipulating null-free arrays of integers. There is also support for indexed searches using some of the operators.
All of these operations will throw an error if a supplied array contains any NULL elements. I used this extension when developed simple method for comparison of images by PostgreSQL (http://leopard.in.ua/2010/12/09/bystroe-sravnenie-izobrazhenij-s-pomoshhyu-rubyphp-i-postgresql/, russian article).
Adds support for connections to other PostgreSQL databases from within a database session. Also dblink allows to do autonomous transactions (like in Oracle).
That’s all folks!