Use cases

This section puts BDV into perspective, so that prospective administrators and end users know what to expect from BDV.

What BDV is not

Since many users naturally think of BDV as a database, it makes sense to begin by defining what BDV is not.

Do not take the fact that BDV understands SQL to mean that it provides the features of a standard database. BDV is not a general-purpose relational database. It is not a replacement for databases like MySQL, PostgreSQL or Oracle. BDV was not designed to handle Online Transaction Processing (OLTP). The same is true of many other databases designed and optimized for data warehousing or analytics.

What BDV is

BDV is a tool designed to efficiently query vast amounts of data using distributed queries. If you work with terabytes or petabytes of data, you are likely using tools that interact with Hadoop and HDFS. BDV was designed as an alternative to tools that query HDFS using pipelines of MapReduce jobs, such as Hive or Pig, but BDV is not limited to accessing HDFS. BDV can be and has been extended to operate over different kinds of data sources, including traditional relational databases and other data sources such as Cassandra.

BDV was designed to handle data warehousing and analytics: data analysis, aggregating large amounts of data and producing reports. These workloads are often classified as Online Analytical Processing (OLAP).
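
To make the OLTP and OLAP distinction concrete, here is a minimal sketch that contrasts the two statement shapes. It uses Python's built-in sqlite3 module purely as a stand-in so the example runs anywhere. The `orders` table, its columns and its rows are hypothetical, and nothing here represents BDV's own client API or SQL dialect.

```python
import sqlite3

# In-memory SQLite is only a stand-in so the example is runnable;
# it is not BDV, and the table and data below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (id, region, amount) VALUES (?, ?, ?)",
    [(1, "east", 10.0), (2, "west", 25.0), (3, "east", 5.0), (4, "west", 15.0)],
)

# OLTP-style statement: a small write that touches one row by its key.
# This is the kind of workload BDV was not designed for.
conn.execute("UPDATE orders SET amount = 12.0 WHERE id = 1")

# OLAP-style statement: scan the whole table, aggregate, produce a report.
# This is the shape of workload the section describes as BDV's target.
report = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) "
    "FROM orders GROUP BY region ORDER BY region"
).fetchall()

for region, order_count, total in report:
    print(region, order_count, total)
```

The first statement is a point update keyed on a single row, while the second reads every row and reduces it to a per-region summary. At the scale this section has in mind, the second kind of query is the one that benefits from being distributed.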