H2O.ai WIKI

Big Data

What is Big Data?

Big data refers to data characterized by its volume, velocity, and variety, which artificial intelligence technologies use to discover patterns and correlations hidden in massive collections of data. These three characteristics are commonly known as the three Vs.

Big data consists of complex data sets, often from new sources. Nearly 2.5 quintillion bytes of data are created each day, and when used effectively this data can be a valuable asset for organizational decision-making.

Examples of Big Data

Big data is often used in products or services you use every day. Here are two examples of how big data is used to improve products and services.

Product Development

Companies like Netflix or Procter & Gamble leverage big data to anticipate market demand. They build predictive models for new products and services by classifying key attributes of past products and then modeling the commercial success of those offerings.

Customer Experience

Big data enables companies to gather data from website visits, social media interactions, and the ads you click on. The data is then used to improve the customer experience, for example by delivering personalized offerings in the hope of reducing customer churn.

Why is Big Data Important?

Big data allows companies to implement data-driven decision-making rather than relying on feelings or subjective experience. When used effectively, big data gives companies a competitive advantage in the marketplace.

Big Data FAQs

What are the 3 types of big data?

Big data is commonly classified into three types: structured data, semi-structured data, and unstructured data.
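The difference between the three types is easier to see with a small sample of each. The sketch below uses only the Python standard library; the sample values and variable names are hypothetical and chosen purely for illustration.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (hypothetical sample).
structured_csv = "customer_id,plan,monthly_spend\n1,basic,9.99\n2,premium,19.99\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: self-describing keys, but fields may differ between records.
semi_structured = '[{"id": 1, "tags": ["new"]}, {"id": 2, "referrer": "ad"}]'
records = json.loads(semi_structured)

# Unstructured: free text with no schema; any structure has to be derived.
unstructured = "Loved the new season, but the app keeps crashing on my TV."
words = unstructured.lower().replace(",", "").replace(".", "").split()

print(rows[0]["plan"])            # a column looked up by name
print(sorted(records[1].keys()))  # keys present in the second record only
print(len(words))                 # a crude word count derived from the text
```

Structured rows can be queried by column name directly, semi-structured records need a check for which keys are present, and unstructured text needs some processing step before it can be analyzed at all.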

What is big data used for?

Big data is often used in products or services you use every day, such as in product development and in personalizing the customer experience (see Examples of Big Data above).

What qualifies as big data?

Big data is data that is so large, fast, or complex that it's difficult or impossible to process using traditional methods.
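One common way to cope with data that does not fit in memory is to process it in chunks and keep only running totals. The following is a minimal sketch of that idea in plain Python, not a description of how H2O or any particular platform works; the function names and the `chunk_size` parameter are assumptions made for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunks(stream: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most `size` items from `stream`."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def streaming_mean(stream: Iterable[float], chunk_size: int = 1000) -> float:
    """Compute the mean while holding only one chunk in memory at a time."""
    total = 0.0
    count = 0
    for batch in chunks(stream, chunk_size):
        total += sum(batch)
        count += len(batch)
    if count == 0:
        raise ValueError("stream is empty")
    return total / count


# A generator stands in for a data source too large to load at once.
readings = (float(i) for i in range(1_000_000))
print(streaming_mean(readings, chunk_size=10_000))
```

Because the generator is consumed one chunk at a time, peak memory depends on `chunk_size` rather than on the total number of values.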

H2O.ai and Big Data

H2O is an open source, in-memory, distributed, fast, and scalable machine learning and predictive analytics platform that allows you to build machine learning models on big data and makes it easy to put those models into production in an enterprise environment.

The speed, quality, ease of use, and model deployment it offers for cutting-edge supervised and unsupervised algorithms such as Deep Learning, Tree Ensembles, and GLRM make H2O a highly sought-after API for data science on big data.

Big Data vs Other Technologies & Methodologies

Big data vs data mining

Big data is a term that refers to a large amount of data, whereas data mining refers to a deep dive into data, whether the amount is small or large, to extract key knowledge, patterns, and information.

Big data vs data science vs data analytics

Big data refers to a large and complex collection of data. Data analytics is the process of extracting meaningful information from data. Data science is a multidisciplinary field that aims to produce broader insights.

Big data vs data warehouse

Big data is a term applied to datasets whose size is beyond the ability of commonly used tools to capture, manage, and process the data within an acceptable elapsed time. A data warehouse is a collection of data marts representing historical data from different operations in the company.