Dr. Michael Stonebraker's 10 fears about the future of the DBMS field
The performance of Db2 depends on many things. It ranges from the application and the SQL, through Db2 itself, down to the underlying system configuration. We can even go further, way further, down to the fundamentals. It all started with the basic research on database management systems (DBMS), and that research still sits at the very bottom of everything and more or less defines future trends and directions. So what if we take a look at how the DBMS research field itself is performing?
From time to time I try to follow the news from DBMS research. Partly because I have some academic roots, although not in databases, and so I still feel some sentiment for researchers' work, but also because it shows current trends in DBMS, which will likely turn into future features. Db2 has been part of this field since the very beginning, and it was in Db2 that the very first ideas were implemented, defining the original trends of DBMS. Many of you are most likely aware of System R, which followed the research of Dr. Ted Codd and laid the foundations of what we now call relational database management systems. Therefore, I think that a Db2 community like IDUG should be aware of current trends in research, even though the industry is a bit different and has its own specifics.
Recently, a friend of mine sent me a link to a presentation by Dr. Michael Stonebraker about his fears for the future of the DBMS field. For those who haven't heard of Michael Stonebraker, he is a computer scientist specializing in database systems who laid the foundations of well-known systems such as Ingres and Postgres, which influenced other systems as well. He received the 2014 ACM Turing Award.
You can find the lecture I will be referring to on YouTube at this link. I encourage you to listen to it, even though it is more about academia than business.
So what are Dr. Stonebraker's fears? In short, he mentioned the following items:
- The Hollow Middle
Core research is no longer as active as it was in the seventies and the years after. It is now drifting toward applications, which leads to multi-furcation.
- We have been abandoned by our customers
It sounds as if the main problems addressed by database management systems have already been solved. But have they? No. However, there is sometimes a disconnect between reality and theoretical research. In some areas, industry no longer participates in research conferences because the conferences are no longer relevant to its needs.
- Diarrhea of papers
Basically, too many papers are published, often on very easy topics. Students and researchers may feel pressure to publish more and more articles, which pushes them toward shallower topics.
- Reviewing is getting very random
It is no surprise that the previous point makes paper reviews more random rather than thorough and detailed.
- Research taste has disappeared
Research topics sometimes follow the research areas of a few big companies, and the community is uncritical of these topics, even when they reinvent the wheel.
- We are polishing a round ball
Improving an algorithm a little bit may sound like a good idea, but in the real world the implementation and other factors may wipe out tiny enhancements and their benefits. So, instead of polishing a round ball to make it shinier, we should focus on the stuff that matters.
- Irrelevant theory is taking over
More on this later, but this is related to the diarrhea of papers. In other words, it means focusing on irrelevant, sometimes easy, stuff rather than on complex real-world topics.
- We are ignoring the most important problems
An almost obvious follow-up to the previous item. If we focus on irrelevant stuff, we most likely ignore the most important problems, which are by their nature usually very complex.
- Research support is disappearing
This is sad, but I see it even at my alma mater. Research support is disappearing, and it is hard to focus on core research if you need to spend non-trivial effort on managing your work and complying with all the bureaucracy.
- Student load
There are too many students in some fields and too few in others, which can make academic life less attractive. The best people therefore often leave academia.
Now, I don't plan to go through these items myself and reword Dr. Stonebraker's thoughts; it is better if you listen to the lecture yourself. I will just add my comments and experience to some of the points.
We have been abandoned by our customers
I sometimes see this even in our industry. Many customers have outsourced their key infrastructure to reduce costs, or try to save money by cutting the workforce that works on the core technologies. Moreover, many problems are now considered proprietary, and as such they cannot be shared in public. While I can understand this in some cases, in others it is a little sad, because other people are most probably dealing with very similar problems.
Translating this to the IDUG perspective: if you are still with us, please share your knowledge, speak about the problems you need to solve, and discuss your daily jobs. Believe me that we (I mean the community, but also the vendors together with IBM) are listening and trying to help you solve real-life problems. A disconnect from the users does not help anyone.
Irrelevant theory and ignoring the most important problems
Do you remember the hype about Big Data and related technologies? I don't want to make any general statement or oversimplify the situation, but the fact is that core researchers warned several years ago that technologies such as Hadoop have a very limited scope and would not have a long life. Yet many companies tried to adopt them, with more or less success, while the technology had already moved beyond that.
On the other hand, we sometimes still ignore key topics such as data integration, most probably because they are too complex and hard to solve. Or, in Stonebraker's words, we are ignoring the important problems in favor of ones that are easy to solve. Michael Stonebraker lists the following examples of the most ignored problems:
- data integration
- database design and evolution
- tuning a DBMS application is way too difficult
- average DBMS takes $20M to get to production readiness
I personally believe that these are the topics most of you spend a lot of effort on, especially data integration, as I have already mentioned, and this month's IDUG topic, application and system performance. Or is there anyone who does not try to solve these things?
There is one more thing: learning from history. In order to stay current and relevant, we need to understand history and, where possible, learn its lessons rather than repeat its mistakes. For example, from a completely different presentation and another article, which I may discuss next time, I understood that hierarchical databases are becoming attractive. I have nothing against that, but we need to understand that this is nothing very new. Remember IMS? Yes, a great technology, a hierarchical database that can teach us important lessons.
I would sum up several of his ideas as general, common problems in academia. I still meet my friends from academia, and the problems I heard about in this presentation do not apply to the US only. I see this as a general trend even here in the Czech Republic, and I believe it is more or less global, hopefully with some exceptions: the diarrhea of papers that are not relevant to real-world problems, the lack of papers focusing on core technologies, the student load, and all the other aforementioned problems.
Can our IDUG community help with this? I am sure we can, if we stay focused and speak in public forums about the real problems we are solving, so that they become as attractive as other areas and technologies that resonate with buzzwords.
What were my takeaways from Dr. Stonebraker's lecture? Keep focusing on core technology, try to stay connected, focus on real-life problems, and don't follow the hype. What about you?
Maybe this article sounds a bit off topic from the IDUG perspective, but isn't the whole infrastructure and ecosystem about performance?