H2O.ai WIKI

Recommendation system

What is a recommendation system?

Recommendation systems (RS) are tools that use artificial intelligence to suggest products to users. They do this by analyzing data commonly derived from:

  • Individual product interaction

  • Group product interactions

  • Product facts

AI uses this data to predict user decisions and preferences. The component that makes these predictions is referred to as a “recommendation engine”. The recommendation engine filters the data inputs to produce the suggestions most relevant to each user.


How does a recommendation system work?

There are four phases recommendation systems move through:

Extraction

The extraction phase consists of accumulating target data. Depending on the project, useful data may include user clicks, likes, saves, ratings, shares, comments, and view-time. This phase will determine the type of recommendation system needed, based on the data necessary for the optimal machine learning model result.
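As a rough illustration of the extraction phase, the sketch below collects raw engagement events and aggregates them into one score per user and item. The `Event` fields, the `WEIGHTS` table, and the sample data are all hypothetical choices made for this example; a real project would pick signals and weights to suit its own model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    user_id: str
    item_id: str
    kind: str     # e.g. "click", "like", "share", "rating"
    value: float  # 1.0 for implicit events, the score itself for ratings


# Hypothetical weights for combining mixed signals into a single score.
WEIGHTS = {"click": 1.0, "like": 3.0, "share": 4.0, "rating": 1.0}


def build_interactions(events):
    """Aggregate raw events into a nested dict: {user_id: {item_id: score}}."""
    scores = defaultdict(lambda: defaultdict(float))
    for e in events:
        # Unknown event kinds contribute nothing rather than raising an error.
        scores[e.user_id][e.item_id] += WEIGHTS.get(e.kind, 0.0) * e.value
    return {user: dict(items) for user, items in scores.items()}


events = [
    Event("u1", "book_a", "click", 1.0),
    Event("u1", "book_a", "like", 1.0),
    Event("u2", "book_b", "rating", 4.0),
]
interactions = build_interactions(events)
```

The resulting `interactions` mapping is the kind of input the later analysis and filtration phases would consume.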

Storage

AI models and RS tools require a large amount of data to function. To manage this data, an RS needs a high-performance storage system. The most basic storage options are relational (SQL) and non-relational (NoSQL) databases. Structured Query Language (SQL) is used to query relational databases, while NoSQL systems store data in non-relational forms such as documents or key-value pairs. A more accessible storage system for data analysis is a cloud data warehouse. Large providers such as Amazon, Microsoft, and Google all offer cloud data warehouse systems.
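To make the relational option concrete, here is a minimal sketch that stores interaction records in an in-memory SQLite database using Python's standard `sqlite3` module and queries them with SQL. The table name, column names, and sample rows are assumptions for illustration only; a production RS would more likely use a server database or a cloud data warehouse.

```python
import sqlite3

# In-memory database, used here only so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE interactions ("
    " user_id TEXT NOT NULL,"
    " item_id TEXT NOT NULL,"
    " kind TEXT NOT NULL,"
    " value REAL NOT NULL)"
)

rows = [
    ("u1", "book_a", "rating", 5.0),
    ("u2", "book_a", "rating", 3.0),
    ("u2", "book_b", "rating", 4.0),
]
# Parameterized inserts avoid building SQL strings by hand.
conn.executemany("INSERT INTO interactions VALUES (?, ?, ?, ?)", rows)
conn.commit()


def average_ratings(connection):
    """Return (item_id, average rating) pairs, sorted by item_id."""
    cursor = connection.execute(
        "SELECT item_id, AVG(value) FROM interactions"
        " WHERE kind = 'rating' GROUP BY item_id ORDER BY item_id"
    )
    return cursor.fetchall()


averages = average_ratings(conn)
```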

Data Analysis

There are various methods of examining the data depending on AI model needs. Some commonly used examination methods include:

Near real-time 

Near real-time analysis is often considered the best balance of speed and accuracy. It ingests data quickly, identifies trends, and produces an output within a few minutes, which suits products that need more accurate and secure outputs while still maintaining speed. Because it can pinpoint data patterns in minutes, near real-time analysis is often used for IT system security: it can recognize a possible security threat, process the issue to confirm it is a threat, and quickly send a security breach notification.

Real-time

Real-time examination is optimal for recommendation systems that need to deliver immediate output. A real-time analysis processes data as soon as it arrives. It is frequently used for GPS tracking, where a tracker needs to produce a location as soon as the input data is collected.

Batch analysis

The slowest, yet still highly valuable, method is batch analysis. While near real-time analysis takes minutes to process data and real-time analysis takes seconds, batch analysis can take days. Most tools that use this method produce timed, consistent outputs. An example is a recurring credit card payment. Setting up a recurring payment stores the card details, the amount due, the banks the money is transferred to and from, and any other authorizations. The stored data is then processed each month to pay the balance.
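The three analysis methods differ mainly in when the work happens. The sketch below applies the same "most popular item" calculation in three ways: once per event (real-time), once per small group of events (near real-time, approximated here by micro-batches), and once over the complete log (batch). The function names, the `batch_size` parameter, and the sample log are hypothetical, and real systems would trigger these steps by time windows or schedules rather than list slices.

```python
from collections import Counter


def process_real_time(events):
    """Update counts on every event and yield the current top item each time."""
    counts = Counter()
    for item in events:
        counts[item] += 1
        yield counts.most_common(1)[0][0]


def process_micro_batches(events, batch_size):
    """Near real-time: update counts once per small batch of events."""
    counts = Counter()
    for start in range(0, len(events), batch_size):
        counts.update(events[start:start + batch_size])
        yield counts.most_common(1)[0][0]


def process_batch(events):
    """Batch: a single pass over the complete log, typically run on a schedule."""
    return Counter(events).most_common(1)[0][0]


log = ["b", "a", "a", "c", "a", "b"]
real_time_tops = list(process_real_time(log))                   # one result per event
micro_batch_tops = list(process_micro_batches(log, batch_size=3))  # one per batch
batch_top = process_batch(log)                                  # one result overall
```

All three approaches agree once every event has been seen; the trade-off is how fresh the intermediate results are against how much work is done per event.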

Filtration

The filtration phase turns the analyzed data into recommendations for the user. The filtration method defines the scope of the recommendation system, so recommendation systems are usually classified by the kind of filtration they use.
 

Examples of recommendation systems

Recommendation systems are usually named after the filter they use to sort and distribute data. Among others, there are content based, collaborative based, cluster based, and knowledge based filters.

Content based

Content based filtering uses ratings, reviews, likes, and other signals produced by a user's own engagement, together with the attributes of the items involved. After filtering this data, the machine learning model produces predictions tailored to the user's estimated needs or interests. Because it does not depend on other users' activity, this filter adapts well to new users, environments, and products, and can produce recommendations from minimal data.
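One common way to implement content based filtering, shown here only as a minimal sketch, is to describe each item by a feature vector, build a user profile from the items the user liked, and rank unseen items by cosine similarity to that profile. The `ITEM_FEATURES` catalog, its feature weights, and the function names are invented for this example.

```python
import math

# Hypothetical item catalog: each item is described by weighted features.
ITEM_FEATURES = {
    "space_opera": {"scifi": 1.0, "adventure": 1.0},
    "robot_novel": {"scifi": 1.0, "drama": 0.5},
    "cookbook": {"cooking": 1.0},
}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b.get(k, 0.0) for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_profile(liked_items):
    """Sum the feature vectors of the items a user liked."""
    profile = {}
    for item in liked_items:
        for feature, weight in ITEM_FEATURES[item].items():
            profile[feature] = profile.get(feature, 0.0) + weight
    return profile


def recommend_content_based(liked_items, top_n=2):
    """Rank items the user has not liked yet by similarity to their profile."""
    profile = build_profile(liked_items)
    candidates = [i for i in ITEM_FEATURES if i not in liked_items]
    ranked = sorted(
        candidates,
        key=lambda i: cosine(profile, ITEM_FEATURES[i]),
        reverse=True,
    )
    return ranked[:top_n]


recs = recommend_content_based(["space_opera"])
```

Only the one user's history and the item attributes are needed here, which is why this style of filter can work before much data about other users exists.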

Collaborative based

Collaborative based filtering uses the activity of multiple users who consume similar content. Once it identifies a pattern shared by a group of users, it recommends additional content that has historically engaged other users who fit that pattern.
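A small user-based variant of this idea can be sketched as follows: measure how similar two users' ratings are, then score items the target user has not seen by the similarity-weighted ratings of other users. The `RATINGS` data and function names are hypothetical, and cosine similarity is just one of several similarity measures that could be used here.

```python
import math

# Hypothetical ratings: {user: {item: rating on a 1-5 scale}}.
RATINGS = {
    "alice": {"dune": 5, "neuromancer": 4},
    "bob": {"dune": 5, "neuromancer": 5, "foundation": 4},
    "carol": {"cookbook": 5, "dune": 1},
}


def similarity(a, b):
    """Cosine similarity between two users' rating dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend_collaborative(user, top_n=1):
    """Score unseen items by the similarity-weighted ratings of other users."""
    own = RATINGS[user]
    scores = {}
    for other, their in RATINGS.items():
        if other == user:
            continue
        sim = similarity(own, their)
        for item, rating in their.items():
            if item not in own:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


recs = recommend_collaborative("alice")
```

In this toy data, "alice" rates items much like "bob" does, so the item only "bob" has rated outranks the item rated by the dissimilar "carol".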

Cluster based

Cluster based filters match similar pieces of data and collect them into groups (clusters). Each cluster is assigned an ID to simplify complex data management. Processing large quantities of complex data in this way keeps the system scalable and improves the dependability, range, and flow of recommendations. Cluster based RS are used in the eCommerce industry. For example, a book review website can use a cluster based filter to reduce a massive quantity of comments and ratings to personalized book recommendations.
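The sketch below shows the general shape of a cluster based filter under a deliberately simple assumption: users are grouped by their most frequent genre, each group receives a numeric ID, and recommendations come from what other members of the same group have read. This grouping rule stands in for a real clustering algorithm such as k-means, and the `USER_HISTORY` data and function names are invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical reading history: {user: [(title, genre), ...]}.
USER_HISTORY = {
    "u1": [("dune", "scifi"), ("neuromancer", "scifi")],
    "u2": [("dune", "scifi"), ("foundation", "scifi")],
    "u3": [("salt_fat", "cooking")],
}


def assign_clusters(history):
    """Group users by their most frequent genre; return {cluster_id: [users]}."""
    by_genre = defaultdict(list)
    for user, items in history.items():
        top_genre = Counter(genre for _, genre in items).most_common(1)[0][0]
        by_genre[top_genre].append(user)
    # Sort by genre name so cluster IDs are stable between runs.
    return {cid: users for cid, (_, users) in enumerate(sorted(by_genre.items()))}


def recommend_from_cluster(user, history, clusters, top_n=1):
    """Recommend the titles most read by other members of the user's cluster."""
    members = next(users for users in clusters.values() if user in users)
    seen = {title for title, _ in history[user]}
    counts = Counter(
        title
        for member in members
        if member != user
        for title, _ in history[member]
        if title not in seen
    )
    return [title for title, _ in counts.most_common(top_n)]


clusters = assign_clusters(USER_HISTORY)
recs = recommend_from_cluster("u1", USER_HISTORY, clusters)
```

Because recommendations are computed within one cluster at a time, each lookup only touches a fraction of the full data set.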

Knowledge based

Knowledge based filtration focuses on product information and related items. The RS identifies products of interest to a user based on their previous engagement. For example, if someone purchases a notebook on Amazon and scrolls down the page, a knowledge based RS could recommend pens for purchase.
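The notebook-and-pens example can be sketched as a lookup over explicit product knowledge rather than learned patterns. The `RELATED_CATEGORIES` rules and the `CATALOG` contents below are made up for illustration; in practice this knowledge would come from a curated product taxonomy or domain experts.

```python
# Hypothetical domain knowledge: which categories complement a purchase.
RELATED_CATEGORIES = {
    "notebook": ["pen", "highlighter"],
    "tent": ["sleeping_bag"],
}

# Hypothetical catalog: products available in each category.
CATALOG = {
    "pen": ["gel pen"],
    "highlighter": ["yellow highlighter"],
    "sleeping_bag": ["down sleeping bag"],
}


def recommend_knowledge_based(purchased_category, max_items=3):
    """Return products from categories known to complement the purchase."""
    recommendations = []
    for category in RELATED_CATEGORIES.get(purchased_category, []):
        recommendations.extend(CATALOG.get(category, []))
    return recommendations[:max_items]


recs = recommend_knowledge_based("notebook")
```

No user history is required for this lookup, only the rule that links one product category to another.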


Why are recommendation systems important?

Recommendation systems represent major advancements in service customization. Industries like marketing, telecommunications, and retail are able to reach larger audiences through recommender systems. Content based, collaborative based, and knowledge based RS provide users with engaging automatic customization and content placement. RS give ML models a thorough understanding of the wants and needs of a given user base.

Benefits of recommendation systems

AI using RS tools can also:

  • Advertise upcoming content to target demographics
  • Influence user choices
  • Increase product visibility in search engines
  • Promote longer user engagement