PostgreSQL core developer Josh Berkus sat down with SearchEnterpriseLinux.com this week to talk about the state of the open source database, the upcoming PostgreSQL version 8.1, and why he believes it's only a matter of time before every database is open source.
What's new with PostgreSQL? What have the challenges been over the past year?
Berkus: The big news for PostgreSQL is the Microsoft Windows port, which is significant because the download figures we now have show that 65% of downloads have been for Windows. A lot of those are new users, because Windows ports weren't available before [for PostgreSQL].
Beyond that, we have added enterprise features. Things like point-in-time recovery are useful, and they are part of a kind of checklist for people thinking of using PostgreSQL in the data center, because they want multiple avenues of backup.
Savepoints and nested transactions are things we have wanted to implement because they enable all kinds of things, like good error handling within the database, and they can be very important to the [Java 2 Enterprise Edition] platform – the largest development platform in terms of population.
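To illustrate the error-handling point, here is a minimal sketch of the savepoint pattern. It is not taken from the interview. It assumes SQLite, through Python's built-in sqlite3 module, as a stand-in so the example runs without a server. The `orders` and `order_lines` tables and the `load_orders` function are hypothetical. The SAVEPOINT, ROLLBACK TO SAVEPOINT and RELEASE SAVEPOINT statements it issues are standard SQL that PostgreSQL 8.0 also accepts.

```python
import sqlite3


def load_orders(conn, orders):
    """Load (order_id, qty) pairs in one transaction, skipping bad orders.

    Each order gets its own savepoint, so a failure undoes only that
    order's statements instead of the whole batch.
    """
    skipped = []
    conn.execute("BEGIN")
    for order_id, qty in orders:
        conn.execute("SAVEPOINT one_order")
        try:
            conn.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
            conn.execute(
                "INSERT INTO order_lines (order_id, qty) VALUES (?, ?)",
                (order_id, qty),
            )
        except sqlite3.IntegrityError:
            # Undo both inserts for this order only; earlier orders survive.
            conn.execute("ROLLBACK TO SAVEPOINT one_order")
            skipped.append(order_id)
        # The savepoint still exists after ROLLBACK TO, so release it either way.
        conn.execute("RELEASE SAVEPOINT one_order")
    conn.execute("COMMIT")
    return skipped


# isolation_level=None stops the sqlite3 module from opening transactions
# implicitly, so BEGIN/COMMIT above are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE order_lines (order_id INTEGER, qty INTEGER CHECK (qty > 0))"
)

# The second order violates the CHECK constraint and is rolled back alone.
skipped = load_orders(conn, [(1, 5), (2, 0), (3, 2)])
stored = [row[0] for row in conn.execute("SELECT id FROM orders ORDER BY id")]
```

The design choice is that one bad record costs only its own work. Without the savepoint, the application would have to abandon the whole transaction or clean up the half-written order by hand.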
As far as the future is concerned, data warehousing is the next hurdle to prove that we can be doing the same work as Oracle, DB2 and SQL Server.
How has the goal of infiltration into the enterprise progressed over the past year?
Berkus: In terms of enterprise use, we had that back in very early July 2004. Lots of ISPs (Internet Service Providers) have been using PostgreSQL, and there are a lot of large companies that use PostgreSQL internally. Thousands of enterprise-level businesses run PostgreSQL. For us it is a question of scope: how many enterprises can we be suitable for? Right now, if you run an ISP or an interactive Web application, or have heavy OLTP transactions at any level where you need big iron, then PostgreSQL is suitable. Right now, we are not quite at parity for OLAP (online analytical processing) applications, because there has not been a lot of focus on optimization. The goal is to be the best database, period.
A big thing with version 8.0 is that it has shown a real snowballing effect in our adoption and development, because it's a pure open source community and that's been very successful for us. PostgreSQL is up to a couple hundred contributors, and more importantly, we've gained more contributors thanks to corporate sponsorships.
Previously, when talking about open source in the enterprise, you said it was a matter of 'not if, but when.' Why do you believe companies will eventually require an open source strategy?
Berkus: Any technology company that is either a heavy user of software, depends on software, or is somehow involved with hardware and does not have an open source strategy is headed for extinction already. Now, that strategy with open source can be 'how to go around open source.' MS certainly has an open source strategy in that respect. But it has to be some kind of strategy. For open source to go further in the database, it is only a matter of time. [And it also won't be] a whole lot of time before all databases are open source, with the exception being SQL Server.
Why has the database become one of the most likely parts of the software stack to go open source?
Berkus: There are two reasons for that, one simply being that we already have a strong offering in the open source space with databases like PostgreSQL and MySQL. In the embedded space, Cloudscape is also a good option. There has also been a sign from the industry over the last year as vendors like IBM and Computer Associates [take on free] open source projects. Sybase has a limited edition release for free as well. This is all showing me that proprietary vendors are seeing that the days of making money off database licensing are ending.
The reason we have strong open source offerings is that in the open source space it is always easier to build middleware and back end stuff. A PostgreSQL user is going to be a DBA of some sort and will already have programming skills. And out of that crop of users you are going to have half a dozen people contributing solid PostgreSQL open source code. The developer community grows proportionally, so your development can keep pace with membership.
A second point as to why databases are stronger candidates is that they are not user-facing open source technology. With an OpenOffice, a much smaller number of users are programmers, and therefore the number who can contribute is small. With OpenOffice, if I make a change to the toolbars, then some users are going to like it and some are going to hate it. With PostgreSQL, I can just add a memory patch and know if it's good or bad across the board. So that's why back end server stuff has developed in open source so much faster -- if programmers use it, then programmers fix it.
Should companies that have only recently started an open source program be considered by users when there are clearly those who have been in the open source game much longer?
Berkus: The difference is a matter of business models. MySQL makes money by selling MySQL licenses and support and by being the sole provider, just like any proprietary company. The only real difference is they leverage open source as their distribution model. On the other hand, multiple companies are involved with PostgreSQL and are also selling other things modeled for the database. Their interest is not in making money off PostgreSQL, it's in making hardware that happens to include PostgreSQL. They're not concerned with getting direct return because they're getting a substantial indirect return.
Any additional thoughts on the future and version 8.1?
Berkus: The news is definitely 8.1 developments. It's early in the development stage and things look very good. Two-phase commit will probably get into 8.1, as well as more memory usage enhancements. OSDL does lots and lots of testing for us and they have been a huge help in improving the performance of PostgreSQL. Memory enhancements have increased [performance] by up to 40%.