Haskell is a very interesting language, and shows up on sites like http://programming.reddit.com frequently. It’s somewhat mind-bending, but very powerful and has some great theoretical advantages over other languages. I have been learning it on and off for some time, never really getting comfortable with it but being inspired by it nonetheless.
But discussion on sites like reddit usually falls a little flat when someone asks a question like:
If Haskell has all these wonderful advantages, what amazing applications have been written with it?
The responses to that question usually aren’t very convincing, quite honestly.
But what if I told you there was a wildly successful language, in some ways the most successful language ever, and it could be characterized by:
- lazy evaluation
- declarative
- type inference
- immutable state
- tightly controlled side effects
- strict static typing
Surely that would be interesting to a Haskell programmer? Of course, I’m talking about SQL.
Now, it’s all falling into place. All of those theoretical advantages become practical when you’re talking about managing a lot of data over a long period of time, and trying to avoid making any mistakes along the way. Really, that’s what relational database systems are all about.
I speculate that SQL is so successful and pervasive that it stole the limelight from languages like Haskell, because the tough problems that Haskell would solve are already solved in so many cases. Application developers can hack up a SQL query and run it over 100M records in 7 tables, glance at the result, and turn it over to someone else with near certainty that it’s the right answer! Sure, if you have a poorly-designed schema and have all kinds of special cases, then the query might be wrong too. But if you have a mostly-sane schema and mostly know what you’re doing, you hardly even need to check the results before using the answer.
In other words, if the query compiles, and the result looks anything like what you were expecting (e.g. the right basic structure), then it’s probably correct. Sound familiar? That’s exactly what people say about Haskell.
It would be great if Haskell folks would get more involved in the database community. It looks like a lot of useful knowledge could be shared. Haskell folks would be in a better position to find out how to apply theory where it has already proven to be successful, and could work backward to find other good applications of that theory.
Competing directly in the web application space against languages like Ruby and JavaScript is going to be an uphill battle even if Haskell is better in that space. I’ve worked with some very good Ruby developers, and I honestly couldn’t begin to tell them where Haskell might be a practical advantage for web application development. Again, I don’t know much about Haskell aside from the very basics. But if someone like me, who is interested in Haskell and has made some attempt to understand it and read about it, still cannot articulate a practical advantage, clearly there is some kind of problem (either messaging or technical). And that’s a huge space for application development, so that’s a serious concern.
However, the data management space is also huge: a large fraction of those applications exist primarily to collect data or present data. So, if Haskell folks could work with the database community to advance data management, I believe that would inspire a lot of interesting development.
Comparing SQL and Haskell is quite an entertaining exercise. What if we could query SQL by authoring Haskell? What if GHC compiled some parts of your software to run inside your database engine? Haskell already takes control of determining execution flow.
“What if GHC compiled some parts of your software to run inside your database engine?”
That’s a particularly interesting question.
Well, PL/Haskell does seem to be in development, but has not yet been officially integrated into PostgreSQL.
Yet another place that an API for bypassing the query parser and speaking directly to the planner/executor would be an opportunity to try some new ideas.
“Yet another place that an API for bypassing the query parser and speaking directly to the planner/executor would be an opportunity to try some new ideas.”
+1
That’s probably how Muldis D would be the most tightly/efficiently integrated into Postgres, namely at a similar level to how Haskell would. But before that, probably the first practical implementation will be a Muldis D compiler written in Perl, where the compiler converts some code to SQL for running inside the DBMS and some code to Perl for running on the client side. I liken it to how LLVM compiles targeting partially the GPU and partially the CPU.
Yes, I’d certainly like to see more alternatives in this area. A DBMS is seen as a monolith, but in reality there are a few fairly independent pieces that could be offered individually, and I think that could lead to a lot of interesting new ideas and development styles.
Giving a talk on this topic at PG WEST this week.
I love how side-effect free languages do such a good job of managing side effects, otherwise known as databases.
Agreed.
Yes, ideas from languages “for theorists only” do have a way of showing up in mainstream contexts, although the source is often not recognized. Here is another example, from the book “Real World Haskell”: “Haskell’s parametric polymorphism directly influenced the design of the generic facilities of the Java and C# languages.”
Thanks for pointing out the interesting analogies between SQL and Haskell. I’d disagree, though (see example below), that SQL is as strict as Haskell is about types – few things are. What would true type strictness in SQL expressions do to (for?) the language?
mysql> select ("9foobar" = 9);
+-----------------+
| ("9foobar" = 9) |
+-----------------+
|               1 |
+-----------------+
1 row in set (0.00 sec)
The example you used is invalid per the SQL spec, and if you’d like to see a better type system, try PostgreSQL. MySQL is notoriously loose with types. I don’t disagree with your point, though; there are some important weaknesses in SQL’s type system compared with that of Haskell.
I believe that type strictness in SQL brings more confidence to the answers given by the system. What you don’t want is for your data to bring about unexpected edge cases, and the type constraints help a great deal with that (e.g. if the column is of type “integer”, you know you’re not going to get “9foobar”).
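For comparison, here is a minimal Haskell sketch of the same kind of strictness on the Haskell side. Writing "9foobar" == (9 :: Int) directly is rejected by the type checker, so the text has to be parsed explicitly, and the parse can fail. The helper name equalsInt is made up for illustration; readMaybe comes from Text.Read in base.

```haskell
module Main where

import Text.Read (readMaybe)

-- Hypothetical helper (not from any library): compare raw text with an
-- integer only after an explicit parse. A failed parse yields Nothing
-- instead of being silently coerced into a number.
equalsInt :: String -> Int -> Maybe Bool
equalsInt raw n = fmap (== n) (readMaybe raw)

main :: IO ()
main = do
  print (equalsInt "9" 9)        -- Just True
  print (equalsInt "9foobar" 9)  -- Nothing
  print (equalsInt "10" 9)       -- Just False
```

The point of the sketch is only that "9foobar" never quietly becomes 9; the caller is handed a Nothing and has to decide what to do with it.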
Unfortunately, NULL semantics really destroy this property of SQL because they introduce exactly the kind of edge cases that cause errors, and they propagate through the query making subtle errors more likely. For instance, consider the query:
SELECT dealer, sum(sales) FROM dealer_sales GROUP BY dealer HAVING sum(sales) < 100000;
What that query should do is return under-performing dealers (those with low sales). But if a certain dealer has no sales at all (say, its only rows carry a NULL in sales), sum() returns NULL, and thus the dealer is not returned (because any comparison with NULL is NULL). However, the answer will look right; it’s just missing some of the dealers. That kind of subtle problem is something that Haskell could easily catch through exhaustive pattern matching over a Maybe.
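To make the dealer example concrete, here is a rough Haskell sketch, not a real database binding. It assumes sales amounts arrive as Maybe Integer, with Nothing standing in for NULL. All names (sqlSum, sqlStyle, explicitStyle, the sample dealers) are invented for illustration. sqlStyle deliberately imitates the HAVING behavior and drops the dealer; explicitStyle has to say what a missing total means, and GHC's -Wincomplete-patterns warning (part of -Wall) flags the function if the Nothing case is left out.

```haskell
module Main where

type Dealer = String

-- Imitates SQL's sum(): NULLs are skipped, and the result is Nothing
-- (playing the role of NULL) when there is no non-NULL value to add up.
sqlSum :: [Maybe Integer] -> Maybe Integer
sqlSum amounts =
  case [a | Just a <- amounts] of
    [] -> Nothing
    xs -> Just (sum xs)

-- Imitates HAVING sum(sales) < limit: a NULL total never compares as
-- true, so the dealer silently disappears from the result.
sqlStyle :: Integer -> [(Dealer, [Maybe Integer])] -> [Dealer]
sqlStyle limit rows =
  [d | (d, amts) <- rows, Just total <- [sqlSum amts], total < limit]

-- The explicit version: both constructors of Maybe must be handled.
-- Treating "no recorded sales" as under-performing is an assumption
-- made here; the type only forces the decision to be written down.
isUnderPerforming :: Integer -> Maybe Integer -> Bool
isUnderPerforming _     Nothing      = True
isUnderPerforming limit (Just total) = total < limit

explicitStyle :: Integer -> [(Dealer, [Maybe Integer])] -> [Dealer]
explicitStyle limit rows =
  [d | (d, amts) <- rows, isUnderPerforming limit (sqlSum amts)]

-- Made-up sample data: Cobalt has a row, but its sales value is NULL.
sample :: [(Dealer, [Maybe Integer])]
sample =
  [ ("Acme",   [Just 250000])
  , ("Bolt",   [Just 40000, Just 10000])
  , ("Cobalt", [Nothing])
  ]

main :: IO ()
main = do
  print (sqlStyle 100000 sample)       -- ["Bolt"]
  print (explicitStyle 100000 sample)  -- ["Bolt","Cobalt"]
```

The first result looks plausible and is quietly incomplete, which is exactly the failure mode in the SQL query; the second includes the dealer only because the Nothing case was spelled out.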