Comments on: Big Company Uses Product XYZ http://thoughts.davisjeff.com/2010/11/11/big-company-uses-product-xyz/ Ideas on Databases, Logic, and Language by Jeff Davis Tue, 19 Jun 2012 16:18:51 +0000

By: Richard Stephan http://thoughts.davisjeff.com/2010/11/11/big-company-uses-product-xyz/comment-page-1/#comment-1400 Mon, 15 Nov 2010 13:22:38 +0000

It makes perfect sense for InnoDB to be the default storage engine now that Oracle owns MySQL, not only for technical reasons but for business reasons as well. Oracle has owned InnoDB for the last five years. Now they own the entire product.

By: Robert Young http://thoughts.davisjeff.com/2010/11/11/big-company-uses-product-xyz/comment-page-1/#comment-1398 Fri, 12 Nov 2010 20:55:54 +0000

flexible meaning: “you can take the defaults, but you can change storage and memory and blah to suit your needs”. DB2 is particularly flexible. As to ANSI isolation levels, there’s lots o folks who don’t find MVCC to be the Holy Grail. If nothing else, MVCC databases routinely eat servers for breakfast.
Oh, and nothing prevents read on write in DB2, just choose Read Uncommitted. As to MVCC, it amounts to Read On Last Commit, and as is often discussed on PostgreSQL and Oracle groups, that’s why MVCC databases chew servers like peanuts. There Ain’t No Such Thing As A Free Lunch. It’s also why collisions and deadlocks become the user’s responsibility later rather than the engine’s responsibility early. “oops, you can’t really update that row; it’s already been updated by Sally.”

One can choose page size by table. One can choose buffer size by tablespace. One can assign tables to arbitrary tablespaces. One can have covering indexes. And the list goes on.

Again, the number of folks that Oracle/IBM/M$ dedicate to their engines is likely an order of magnitude greater than what PostgreSQL has.

How an Open Source product moves is, often, a Squeakiest Wheels rather than Grand Vision thing. Python excepted. In the case of PG, the storage model and memory model can be a lot more like DB2 (for instance) if Enterprise is the way the community wants to go. Or not. But if the PG community does want to be an Enterprise DB, then it needs to think seriously along such lines.

By: Simon Riggs http://thoughts.davisjeff.com/2010/11/11/big-company-uses-product-xyz/comment-page-1/#comment-1397 Fri, 12 Nov 2010 20:40:07 +0000

“by far the most flexible” is not something you can sell me on either.

You mean flexible, as in requiring one data type for text data 2000 bytes (Oracle). Or flexible as in not running on anything but Windows, with known scalability problems (SQLServer). Or flexible like preventing read SQL while writes are taking place, and upgrading locks to table level if you lock too many rows (DB2).

Open source gives people what they need by delivering many small usability enhancements, and peer review of code means things work the way they should, not the way the marketing department thinks is probably OK.

By: Mark Callaghan http://thoughts.davisjeff.com/2010/11/11/big-company-uses-product-xyz/comment-page-1/#comment-1395 Fri, 12 Nov 2010 07:09:26 +0000

There are many things that make PG appealing to external developers like me: far more is discussed in the open (excluding the interesting conversations at Greenplum, AsterData and the like), the code compiles without warnings, all developers are external, and the community process is more mature.

Alas, that is not an option for me today.

By: Jeff Davis http://thoughts.davisjeff.com/2010/11/11/big-company-uses-product-xyz/comment-page-1/#comment-1394 Fri, 12 Nov 2010 04:28:51 +0000

“they should mean that you can be successful using product X”

I would add to that “…along with a huge amount of other engineering effort around and within X to get X to do what you really need”.

“PG isn’t immune from having patches and forks.”

Absolutely true. I work for such a company. The difference, I think, is that postgres, as a community, seems to have a more authoritative mainline offering and less confusion over patches, forks, and even options. Equally important, the postgresql mainline accepts contributions without copyright assignment.

That being said, I get the impression that MySQL is standardizing on InnoDB for the vast majority of uses now that it’s the default in mainline. I think that’s a very positive step (observing mostly from outside, of course).

By: Mark Callaghan http://thoughts.davisjeff.com/2010/11/11/big-company-uses-product-xyz/comment-page-1/#comment-1393 Fri, 12 Nov 2010 03:15:25 +0000

Yes, I was ambiguous. “Company A uses product X at web-scale” was said in response to the claim in the rebuttal by JD that you can’t use MySQL once you get big.

I agree that these references get too much credit in the press. They should mean that you can be successful using product X but are misinterpreted to mean that you will be successful using it.

PG isn’t immune from having patches and forks. PGWest was hosted by the vendor of one (EnterpriseDB) and the rebuttal came from JD whose company does Mammoth Replicator, a PG replication fork/patch.

Skype was cited as a company that runs PG at web-scale. I haven’t found many details on that deployment (another “company X uses product Y” reference). But they also have patches (SkyTools) to make it work for them.

By: Robert Young http://thoughts.davisjeff.com/2010/11/11/big-company-uses-product-xyz/comment-page-1/#comment-1392 Fri, 12 Nov 2010 01:11:38 +0000

Having worked with all of the mentioned engines, I can say Stonebraker and Cattell are quite full of it; likely attempting to sell whatever snake oil is their current batch.

There is a world of difference among these engines. No two of them even have semantically equivalent ACID implementations, although Oracle and PostgreSQL are the two closest. So far as knobs and switches go, the three commercial engines are by far the most flexible. But they should be; they have legions of coders.

By: Jeff Davis http://thoughts.davisjeff.com/2010/11/11/big-company-uses-product-xyz/comment-page-1/#comment-1391 Thu, 11 Nov 2010 23:24:27 +0000

Hopefully I didn’t misrepresent your statements. I was trying to pick out a certain aspect that I thought was under-analyzed: in particular, loose phrases such as “runs on” and “uses”.

Also, this applies to every discussion of the form “Big Company runs on XYZ; therefore XYZ must be good.” I didn’t mean that yours was the only article that glossed over words like “runs on”; I think it’s quite common, which is why I posted this.

By: Jeff Davis http://thoughts.davisjeff.com/2010/11/11/big-company-uses-product-xyz/comment-page-1/#comment-1390 Thu, 11 Nov 2010 23:19:52 +0000

That’s a little confusing, because you said “that statement requires context” without identifying the statement ;) I believe you’re referring to a statement you made at PG WEST, but I’m not 100% sure.

This post wasn’t so much about the statement itself, but how it is being analyzed. That’s why I tried to step away from specific company names, because I just don’t know enough of the details.

The main point is that “using” a DBMS means very different things to different organizations. That’s even more true when it’s an open source DBMS, because the large organization probably modifies it significantly; but it applies to the closed systems as well. While I don’t know much about Facebook specifically, I’m fairly sure that it takes a huge amount of custom code and engineering to go from stock MySQL to the architecture that supports Facebook’s data management.

By: Mark Callaghan http://thoughts.davisjeff.com/2010/11/11/big-company-uses-product-xyz/comment-page-1/#comment-1389 Thu, 11 Nov 2010 20:37:04 +0000

That statement requires context. It was made in response to JD’s claim that once you get big, you can’t run MySQL.

It doesn’t imply that I think MySQL is better or worse than PG. I think that PG is awesome and rocks (but I have yet to acquire my “PG Rocks” t-shirt).
