Industry Outlook is a regular Data Center Journal Q&A series that presents expert views on market trends, technologies and other issues relevant to data centers and IT.
This week, Industry Outlook talks with Pierre Fricke about open-source databases and their role in the Internet of Things (IoT). Pierre has a long history in open-source software. He spent 10 years as director of product marketing for JBoss Middleware. He had joined JBoss Inc. just over a year before its acquisition by Red Hat in 2006 and stayed on until he joined EDB. Pierre first became involved in open-source software in 1998 during his 17 years at IBM. He played a critical role in establishing IBM’s Linux and open-source strategy, being one of seven team leaders whose contributions are still used today. He also spent five years as an industry analyst with an emphasis on Java and Microsoft application development and integration software.
Industry Outlook: How has the growth of the Internet of Things (IoT) affected how database administrators (DBAs) and IT professionals manage database infrastructure?
Pierre Fricke: The growth of the Internet of Things has caused an explosion of data worldwide. In fact, ABI Research is estimating there may be as much as 44 zettabytes of data by 2020—over fifteen times more data than had been created since the start of humanity through 2013!
As businesses analyze this data to generate intelligent business insight, and thus drive competitive advantage, they must also create new ways of managing it. Outdated infrastructure simply cannot handle the massive growth of big data. As a result, data centers today have become a complex patchwork of independent data-management technologies including relational databases, NoSQL solutions and specialized extensions that are added on ad hoc when needed. Database professionals are therefore responsible for managing more data, more databases and more database solutions than ever before. New research from Dell (The Real World of the Database Administrator) has found that database administrators are, more often than not, responsible for a growing number of databases: 72% of respondents said the number of databases they must manage is rising. In addition, 80% said they must support multiple apps, 45% said they have to support multiple user groups and 72% said their responsibilities are increasing.
IO: What are some of the main points DBAs should consider as they look to expand, scale or update their database infrastructure to handle the growing volumes of data created by the IoT?
PF: Enterprises are becoming more aware of their need to gather big data generated by the IoT to keep up with the pace of modern digital business. But although many organizations have spent inordinate amounts of time assessing possible analytical solutions that will enable them to draw insights from big data, many still may be overlooking the other IT tools needed to handle today’s onslaught of big data. This oversight includes the very system that can actually house and manage the data—the database.
When evaluating how to best augment their database to handle growing data volume, an initial reaction for many IT managers may be to buy more hardware and software, or to create clusters to manage and store the data. Taking these steps, however, can lead to an incredibly costly investment. Moreover, adding clusters or difficult-to-integrate technologies can also introduce data silos, which impede the organization’s ability to manage or obtain a holistic view of the data.
In short, DBAs should be careful to understand the back-end financial impact of their IT infrastructure, as well as the potential data-handling difficulties such a system may cause.
IO: Many misconceptions about open-source database technology still exist. Why should DBAs consider open-source alternatives to proprietary solutions when augmenting their IT infrastructure for the IoT?
PF: DBAs should consider open-source-based solutions because they have the same functions, scalability and reliability as proprietary systems. As noted in Gartner’s April 2015 report State of Relational Open Source RDBMSs 2015, “Open-source RDBMSs have matured and today can be considered by information leaders, DBAs and application development management as a standard infrastructure choice for a large majority of new enterprise applications.” The report also said, “Information leaders who opt for an open-source DBMS (OSDBMS) licensing model can benefit from much lower costs than using a commercial model, even with today’s hosted cloud and database platform as a service (dbPaaS) offerings.” With these benefits in mind, database administrators and other IT managers would do well to choose open-source solutions when augmenting their IT infrastructure for the IoT.
IO: Why are open-source database solutions well suited to meet the growing IT demands that result from the IoT?
PF: As mentioned previously, open-source databases have the performance, security, reliability and scalability that database administrators need to handle the growing demands of the IoT—and at lower costs. In addition, Postgres, specifically, has capabilities well suited to the growing data needs of modern businesses.
When implementing a data center infrastructure strategy with the IoT in mind, a wide variety of structured, semistructured and unstructured data types must be considered. To handle these varying data types, database administrators may spin off additional clusters or turn to NoSQL-only solutions, which can often result in the creation of data silos. These silos, in turn, increase costs and make the data difficult to manage.
Postgres was developed to be expandable and to easily incorporate new data types, indexing schemes and languages without compromising other features of the database. This capability enables DBAs and CIOs to centralize and scale company data—as demanded by the IoT—with greater cost savings than proprietary databases.
Postgres boasts a unique feature called a Foreign Data Wrapper (FDW), which allows DBAs to seamlessly integrate data from disparate sources, including NoSQL and Hadoop databases, into a common model under the Postgres database. FDWs are able to link the Postgres databases to external data stores so DBAs can access, manage and manipulate data from foreign sources as if it were part of the native Postgres database. This capability enables DBAs to not only easily manage their data but also ensure the data’s integrity.
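As a rough illustration of the FDW setup described above, the following minimal sketch builds the SQL statements a DBA might run to expose a remote table inside a local Postgres database. It assumes the postgres_fdw wrapper that ships with Postgres, which links to another Postgres server. The helper function, server name, host, credentials, table names and columns are all hypothetical and are not taken from the interview.

```python
from typing import Dict, List


def fdw_setup_statements(
    server: str,
    host: str,
    dbname: str,
    remote_table: str,
    local_table: str,
    columns: Dict[str, str],
) -> List[str]:
    """Return SQL that exposes a remote table through postgres_fdw.

    Hypothetical helper for illustration only: identifiers and values
    are interpolated without quoting or escaping, so do not feed it
    untrusted input.
    """
    column_defs = ", ".join(
        f"{name} {sql_type}" for name, sql_type in columns.items()
    )
    return [
        # 1. Enable the wrapper in the local database.
        "CREATE EXTENSION IF NOT EXISTS postgres_fdw;",
        # 2. Describe where the external data store lives.
        f"CREATE SERVER {server} FOREIGN DATA WRAPPER postgres_fdw "
        f"OPTIONS (host '{host}', dbname '{dbname}');",
        # 3. Map the local user to (placeholder) remote credentials.
        f"CREATE USER MAPPING FOR CURRENT_USER SERVER {server} "
        "OPTIONS (user 'readonly_user', password 'change-me');",
        # 4. Declare a local table definition backed by the remote table.
        f"CREATE FOREIGN TABLE {local_table} ({column_defs}) "
        f"SERVER {server} OPTIONS (table_name '{remote_table}');",
    ]


# Example with made-up names for an IoT sensor store.
statements = fdw_setup_statements(
    server="sensor_store",
    host="sensors.example.internal",
    dbname="telemetry",
    remote_table="readings",
    local_table="sensor_readings",
    columns={"device_id": "integer", "reading": "numeric"},
)
for statement in statements:
    print(statement)
```

Once statements like these have been run, an ordinary query such as SELECT * FROM sensor_readings would read the remote rows as if the table were local. Wrappers for NoSQL or Hadoop sources are installed separately and take their own options, but they follow the same CREATE SERVER and CREATE FOREIGN TABLE pattern.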
IO: Should DBAs be concerned about the security of their open-source databases?
PF: No. “Open source” does not equate to “less secure.” Enterprise open-source solutions such as EDB Postgres boast the same level of security as traditional solutions, including enhanced auditing, row-level security, SQL-injection-attack guard and other capabilities. In addition, better-managed open-source solutions also have fewer vulnerabilities than commercial products owing to the strict reviews and testing process that these types of systems must undergo. Furthermore, the inherent nature of open source, in which the code kernel is available to a large community of developers, means more individuals are looking for potential bugs and problems (an open process that is often prohibited in proprietary systems).
IO: Are open-source databases truly more cost effective?
PF: Absolutely. When compared apples to apples, open-source solutions have been found to provide much greater return on investment. First, open-source solutions, especially commercially supported versions, cost much less than traditional proprietary solutions and less than roll-your-own open-source versions. This cost advantage exists because the open-source-based pricing model starts at a much lower price, and vendors like EDB offer subscription-based pricing. Second, the community of end users has grown considerably, so the cost of finding expertise has come down in recent years. Third, the tools and resources for open-source database-management systems are as readily available as those for traditional relational DBMSs. Lastly, and most importantly, open-source solutions have evolved over the past decade to perform at parity with commercial systems. Thus, open-source systems not only cost less initially but also perform as well as proprietary systems at lower expenditures over time.