What Does the Future Hold for DBAs?

By: Morpheus Data

If any IT job can be considered secure, you would think it would be that of a database administrator. After all, the U.S. Bureau of Labor Statistics Occupational Outlook Handbook forecasts an 11 percent increase in DBA employment from 2014 to 2024. That’s a faster rate than the average for all occupations, and just a tick below the 12 percent growth rate projected by the agency for all computer jobs.

Before any of you DBAs out there get too comfortable about your employment prospects, keep in mind that the database skillset continues the transformation triggered by the rise of cloud computing and database as a service (DBaaS). According to the BLS numbers, DBA employment at cloud service providers will grow by 26 percent in the decade ending in 2024.

The shifting emphasis to cloud databases hasn’t had a great impact on the skills companies look for in their new DBAs. Database administration is one of the top ten “hot skills” identified in Computerworld’s Forecast 2017 survey of in-demand tech skills, although SQL programmers continue to be the most sought-after group. The survey found that 25 percent of companies plan to hire DBAs in 2017.

IT managers place database administration as one of the top 10 hot skills in 2017: 25 percent of the companies surveyed plan to hire a DBA in 2017. Source: Computerworld

One change noted in the 2017 survey is an increased focus on DBAs who “understand the user experience,” according to Michelle Beveridge, CIO for adventure travel firm Intrepid Group. DBAs need to look beyond data rules, mandatory input requirements, and data structures to consider first and foremost the business processes behind the data collection. Finding DBAs with these skills will continue to be a challenge, Beveridge states.

Slow pace of change in languages, tools benefits experienced DBAs, programmers

A look at the most recent DB-Engines ranking of DBMSs by popularity indicates the staying power of traditional relational databases: Oracle, MySQL, and Microsoft SQL Server continue to dominate the rankings, as they have for years. In the June 2017 numbers, the PostgreSQL relational DBMS held the fourth spot, having overtaken the MongoDB document store since June 2016, while Cassandra was rated eighth (down one place from the year-earlier ranking) and Redis was ninth (up one place from the June 2016 scores).

The continuing popularity of old favorites is also evident in the June 2017 RedMonk Programming Language Rankings, which are based on rankings from GitHub and Stack Overflow. JavaScript and Java have held down the top two spots since the inception of the rankings in 2012. Python, PHP, and C# have traded the third, fourth, and fifth positions on the list for almost as long. Another long-time favorite, C++, is holding steady in the sixth position, while Ruby dropped to eighth after peaking in the fourth spot in the third quarter of 2013.

The most upwardly mobile language on the list is Kotlin. It is ranked only 46th, but it has moved up from number 65 in the December 2016 rankings. Most of the bump is attributed to Google’s announcement in May 2017 of official Kotlin support for Android development, which positions the language as the company’s alternative to Swift, currently rated 11th. Kotlin is expected to continue its climb up the rankings as more Android developers experiment with it in new apps.

Enterprise repositories find their way to the public cloud

Data warehouses represent one of the last bastions of in-house data centers. A new class of public cloud data repositories is challenging the belief that warehouses need to reside on the premises. TechTarget’s Trevor Jones writes in a June 8, 2017, article that services such as Amazon Redshift, Google Cloud Platform BigQuery, and Microsoft Azure SQL Data Warehouse provide greater abstraction and integration with related services. This makes it simpler for managers to explore the organization’s deep pools of data alongside traditional data warehouses.

Choosing a database depends in large part on the size of the data you need to accommodate – the more data you have, the more likely you’ll need a non-relational database. Source: Stephen Levin, via Segment

The goal of such services is to enhance business intelligence by tapping a range of cloud services hosting structured and unstructured data. However, the challenges in realizing this goal are formidable, particularly for enterprises. Much of a company’s existing structured data must be cleaned or rewritten for the transition to cloud platforms, and it isn’t unusual for enterprises to have workloads in several different cloud services.

One company in the process of transitioning to a cloud data warehouse is the New York Times, which previously built its own Hadoop cluster and used data warehouses from Informatica, Oracle, AWS, and other vendors. This setup left much of the company’s data “too siloed and too technical,” according to Jones. The Times is now transitioning to Google Cloud Platform as the sole receptacle for all of its warehoused data, primarily as a way to put powerful analytical tools in the hands of users.

Laying the groundwork for real-time analytics

A technology likely to have a great impact on DBAs in coming years is real-time analytics, which is also called streaming analytics. Dataversity defines stream processing as analyzing and acting on data in real time by applying continuous queries. Applications connect to external data sources, integrate analytics in the app “flow,” and update external databases with the processed information.
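
To make the idea of a continuous query concrete, here is a minimal Python sketch. It is not taken from Dataversity or from any particular streaming product; the event shape (a timestamp and an order amount), the function names, and the dictionary standing in for an external database are all assumptions made for illustration. The query recomputes a trailing-window average each time a new event arrives, rather than waiting for a batch job.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (timestamp in seconds, order amount).
Event = Tuple[float, float]


def continuous_average(
    events: Iterable[Event], window_seconds: float
) -> Iterator[Tuple[float, float]]:
    """Yield (timestamp, trailing-window average) after every incoming event."""
    window: Deque[Event] = deque()
    total = 0.0
    for ts, amount in events:
        window.append((ts, amount))
        total += amount
        # Evict events that have fallen out of the trailing window.
        while window[0][0] <= ts - window_seconds:
            _, old_amount = window.popleft()
            total -= old_amount
        yield ts, total / len(window)


def run_query(
    events: Iterable[Event], window_seconds: float, sink: Dict[float, float]
) -> Dict[float, float]:
    """Write each result to `sink`, a stand-in for an external database."""
    for ts, avg in continuous_average(events, window_seconds):
        sink[ts] = avg
    return sink
```

Calling `run_query(stream, 10.0, {})` on any iterable of events, including a generator fed by a message queue, keeps the result store current as each event arrives. A production system would more likely delegate the windowing to one of the stream processors discussed later in this article.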

While descriptive, predictive, and prescriptive analytics perform batch analysis on historical data, streaming analytics evaluates and visualizes data in real time. This facilitates operational decision-making related to business processes, transactions, and production. It also allows current and historical data to be reported at the same time, which makes it possible for a dashboard to display changes in transactional data sets as they occur.

The components of a real-time streaming analytics architecture include real-time and historical data combined in an event engine from which real-time actions are taken and displayed on a dashboard. Source: Paul Stanton, via MarTech Advisor
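
The flow described above, in which live values are compared against historical data, actions are triggered, and results land on a dashboard, can be sketched in a few lines of Python. This is an illustrative sketch only, not the architecture from the MarTech Advisor diagram: the `DashboardRow` type, the `event_engine` function, the percentage threshold, and the alert callback are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Iterable, List, Sequence


@dataclass
class DashboardRow:
    metric: str
    current: float
    historical_mean: float
    change_pct: float


def event_engine(
    metric: str,
    live_values: Iterable[float],
    history: Sequence[float],
    threshold_pct: float,
    on_alert: Callable[[DashboardRow], None],
) -> List[DashboardRow]:
    """Pair each live value with the historical baseline and flag large swings."""
    baseline = mean(history)  # assumes a non-empty history with a non-zero mean
    rows: List[DashboardRow] = []
    for current in live_values:
        change_pct = round((current - baseline) / baseline * 100, 1)
        row = DashboardRow(metric, current, baseline, change_pct)
        rows.append(row)  # every row is available to the dashboard
        if abs(change_pct) >= threshold_pct:
            on_alert(row)  # the "real-time action" for unusual readings
    return rows
```

For example, passing `alerts.append` as `on_alert` collects every reading that deviates from the historical mean by at least the threshold, while the returned list holds the rows a dashboard would render.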

Many obstacles must be overcome to realize the benefits of real-time analytics. For example, a great number of organizations rely on Hadoop for analyzing their large stores of historical data, but Hadoop’s batch-oriented processing isn’t designed for streaming, real-time data. Alternatives include MongoDB, Apache Flink, Apache Samza, Spark Streaming, and Storm. Also, real-time data flows are likely to overwhelm existing business processes, and the cost of faulty analysis rises sharply when decisions are made on the spot.

The more insights you can gain from your organization’s data resources, and the faster those insights can be applied to business decisions, the more value you can squeeze out of your information systems. Putting practical business intelligence in the hands of managers when and where they need it is the reward that lets DBAs know they’re contributing directly to their company’s success.