Naive Bayes

What are Naive Bayes Classifiers?

Naive Bayes classifiers are a family of simple yet powerful classification algorithms based on Bayes' theorem. They are often recommended as a first approach to classifying complicated datasets before more refined classifiers are tried.

Naive Bayes is a collection of algorithms that share a common principle: each one applies Bayes' theorem. With Bayes' theorem, users find the probability of A happening, given that B has occurred. In the equation provided below, B is the evidence and A is the hypothesis. The fundamental assumption of Naive Bayes (the "naive" part) is that predictors are independent of one another given the class. In other words, the value of one predictor does not influence another.


P(A|B) = P(B|A)P(A) / P(B)
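
As a rough illustration, the equation can be evaluated directly. The sketch below uses made-up numbers (the spam rates and the word "offer" are assumptions for the example, not figures from any dataset) and computes P(B) with the law of total probability.

```python
def posterior(prior_a: float, likelihood_b_given_a: float, evidence_b: float) -> float:
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if evidence_b <= 0:
        raise ValueError("P(B) must be positive")
    return likelihood_b_given_a * prior_a / evidence_b


# Hypothetical numbers, chosen only for illustration:
# A = "email is spam", B = "email contains the word 'offer'".
p_spam = 0.2               # P(A)
p_offer_given_spam = 0.5   # P(B|A)
p_offer_given_ham = 0.05   # P(B|not A)

# Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A)
p_offer = p_offer_given_spam * p_spam + p_offer_given_ham * (1 - p_spam)

p_spam_given_offer = posterior(p_spam, p_offer_given_spam, p_offer)
print(round(p_spam_given_offer, 3))
```

With these assumed inputs, seeing the evidence raises the probability of the hypothesis from 0.2 to roughly 0.71.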


Assumptions of Naive Bayes Classifiers

Independent and Equal

Naive Bayes classifiers assume that each feature makes an independent and equal contribution to the outcome. The first part of the assumption is that no pair of features is dependent. For example, high outside humidity does not imply that the outside temperature is also high. The second part of the assumption is that each predictor has equal importance. For example, when deciding whether to play golf, wind is as important as temperature.
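
Under the independence assumption, the score for a class is its prior multiplied by one probability per feature, and every feature contributes exactly one factor. The following is a minimal sketch of that calculation for the golf example; the class names, feature names, and all probabilities are hypothetical values picked for illustration.

```python
from math import prod


def naive_bayes_posteriors(priors, likelihoods, observed):
    """Score each class as P(class) * product of P(feature=value | class), then normalize."""
    scores = {}
    for label, prior in priors.items():
        scores[label] = prior * prod(
            likelihoods[label][feature][value] for feature, value in observed.items()
        )
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


# Hypothetical probabilities, not estimated from real data.
priors = {"play": 0.6, "stay_home": 0.4}
likelihoods = {
    "play": {
        "humidity": {"high": 0.3, "normal": 0.7},
        "windy": {"yes": 0.2, "no": 0.8},
    },
    "stay_home": {
        "humidity": {"high": 0.8, "normal": 0.2},
        "windy": {"yes": 0.6, "no": 0.4},
    },
}

result = naive_bayes_posteriors(priors, likelihoods, {"humidity": "high", "windy": "yes"})
best = max(result, key=result.get)
print(best, result)
```

Because the features are treated as independent, the model never needs a joint table of humidity and wind together; two small per-feature tables per class are enough.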


Types of Naive Bayes Classifiers

Multinomial Naive Bayes Classifiers

Common in Natural Language Processing (NLP), multinomial Naive Bayes classifiers calculate the probability of each tag for a given text sample and output the tag with the greatest probability. They use the frequency of words in a document as features/predictors and are typically used in document classification.

Because a multinomial Naive Bayes classifier only needs to count word frequencies and combine the resulting probabilities, it is easy to implement. Furthermore, it handles large datasets efficiently and is very scalable. It is therefore useful in real-time applications.

Compared with other probabilistic algorithms, multinomial Naive Bayes classifiers can have lower prediction accuracy, and they are unsuitable for regression analysis. This technique is therefore inappropriate for estimating numerical values and is best reserved for classifying count-based input such as text.
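
The word-frequency approach described above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the function names, the tiny training set, and the use of add-one (Laplace) smoothing with `alpha=1.0` are all assumptions made for the example. Log probabilities are summed instead of multiplying raw probabilities to avoid numerical underflow.

```python
from collections import Counter, defaultdict
from math import log


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log word probabilities for each class."""
    vocab = {word for doc in docs for word in doc.split()}
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.split())

    model = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        denom = total_words + alpha * len(vocab)
        model[label] = {
            "log_prior": log(n_docs / len(docs)),
            # Smoothing keeps words unseen in a class from getting probability zero.
            "log_prob": {w: log((word_counts[label][w] + alpha) / denom) for w in vocab},
        }
    return model


def predict_multinomial_nb(model, doc):
    """Return the class with the highest log posterior; out-of-vocabulary words are skipped."""
    def score(params):
        return params["log_prior"] + sum(
            params["log_prob"][w] for w in doc.split() if w in params["log_prob"]
        )
    return max(model, key=lambda label: score(model[label]))


# Hypothetical toy corpus.
docs = ["cheap pills offer", "meeting agenda attached", "cheap offer now", "project meeting notes"]
labels = ["spam", "ham", "spam", "ham"]

model = train_multinomial_nb(docs, labels)
prediction = predict_multinomial_nb(model, "cheap offer")
print(prediction)
```

Training is a single pass that counts words, which is why the method scales well to large document collections.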


Bernoulli Naive Bayes Classifiers

Unlike multinomial Naive Bayes classifiers, which use word counts, Bernoulli Naive Bayes classifiers use binary (boolean) variables such as yes or no, true or false, etc. The Bernoulli Naive Bayes classifier is also used for document classification, where each feature records whether a word is present in the document.

Compared with other Naive Bayes classifiers, Bernoulli Naive Bayes is a fast classification algorithm that works well with small datasets, can deliver accurate results, and easily handles irrelevant features.
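
A minimal sketch of the binary-feature variant follows. The function names, the three example features, and the smoothing constant are assumptions for illustration. Note that, unlike the multinomial model, a feature that is absent (0) also contributes to the score, through the term log(1 - p).

```python
from math import log


def train_bernoulli_nb(rows, labels, alpha=1.0):
    """rows is a list of equal-length 0/1 feature lists; returns per-class parameters."""
    n_features = len(rows[0])
    model = {}
    for label in set(labels):
        members = [row for row, y in zip(rows, labels) if y == label]
        n = len(members)
        # Smoothed estimate of P(feature_j = 1 | class).
        p_on = [
            (sum(row[j] for row in members) + alpha) / (n + 2 * alpha)
            for j in range(n_features)
        ]
        model[label] = (log(n / len(rows)), p_on)
    return model


def predict_bernoulli_nb(model, row):
    """Return the class with the highest log posterior for a 0/1 feature row."""
    def score(label):
        log_prior, p_on = model[label]
        return log_prior + sum(log(p) if x else log(1 - p) for x, p in zip(row, p_on))
    return max(model, key=score)


# Hypothetical features: [contains "offer", contains "meeting", contains "cheap"]
rows = [[1, 0, 1], [1, 0, 0], [0, 1, 0], [0, 1, 1]]
labels = ["spam", "spam", "ham", "ham"]

model = train_bernoulli_nb(rows, labels)
print(predict_bernoulli_nb(model, [1, 0, 0]))
```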

Gaussian Naive Bayes Classifiers 

Gaussian Naive Bayes classifiers work with continuous data and assume that the values of each feature within a class follow a normal (or Gaussian) distribution. This classifier provides good accuracy without excessive effort and is an efficient, user-friendly technique to implement.
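
For continuous features, the per-feature probability is replaced by a normal density whose mean and variance are estimated separately for each class. The sketch below illustrates this under stated assumptions: the function names, the temperature/humidity readings, and the small `var_floor` added to avoid division by zero are all hypothetical choices for the example.

```python
from math import log, pi
from statistics import mean, pvariance


def train_gaussian_nb(rows, labels, var_floor=1e-9):
    """Estimate a log prior plus a (mean, variance) pair per feature for each class."""
    model = {}
    for label in set(labels):
        members = [row for row, y in zip(rows, labels) if y == label]
        columns = list(zip(*members))
        stats = [(mean(col), pvariance(col) + var_floor) for col in columns]
        model[label] = (log(len(members) / len(rows)), stats)
    return model


def log_gaussian(x, mu, var):
    """Log density of a normal distribution with mean mu and variance var."""
    return -0.5 * log(2 * pi * var) - (x - mu) ** 2 / (2 * var)


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for a row of continuous values."""
    def score(label):
        log_prior, stats = model[label]
        return log_prior + sum(log_gaussian(x, mu, var) for x, (mu, var) in zip(row, stats))
    return max(model, key=score)


# Hypothetical readings: [temperature in Celsius, relative humidity in percent]
rows = [[18.0, 40.0], [20.0, 45.0], [22.0, 42.0], [30.0, 85.0], [32.0, 90.0], [34.0, 80.0]]
labels = ["play", "play", "play", "stay_home", "stay_home", "stay_home"]

model = train_gaussian_nb(rows, labels)
print(predict_gaussian_nb(model, [21.0, 43.0]))
```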



Naive Bayes algorithms are most commonly used for text classification. The algorithms differ from one another, but each is simple and efficient. Although each algorithm needs training data to estimate the parameters it uses for prediction, Naive Bayes algorithms can produce results more quickly than more sophisticated methods, making them valuable in real-world situations.


Naive Bayes Resources

H2O Docs: Naive Bayes