What is Naive Bayes?
Naive Bayes classifier:
Naive Bayes is a supervised classification algorithm based on Bayes' theorem, with the assumption that features are independent of one another. It is used to predict the category of a sample.
Naive Bayes is a simple technique for constructing classifiers.
It assumes that, within a class, any feature is independent of any other feature.
The main assumptions in Naive Bayes are that the features are conditionally independent given the class and that every feature contributes equally to the outcome.
Naive Bayes models are probabilistic classifiers: they calculate the probability of each category using Bayes' theorem, and the category with the highest probability is the output.
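To make the "highest probability wins" step concrete, here is a minimal sketch. The class names, priors, and per-feature likelihoods are made-up numbers for illustration only; they are not taken from any dataset.

```python
from math import prod

# Hypothetical values: P(class) and P(feature_i | class) for two observed features.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": [0.8, 0.3],
    "ham": [0.1, 0.5],
}

def posteriors(priors, likelihoods):
    # Independence assumption: the joint likelihood is the product of the
    # per-feature likelihoods, multiplied by the class prior.
    scores = {c: priors[c] * prod(likelihoods[c]) for c in priors}
    total = sum(scores.values())
    # Normalize so the class probabilities sum to 1.
    return {c: s / total for c, s in scores.items()}

post = posteriors(priors, likelihoods)
prediction = max(post, key=post.get)  # category with the highest probability
print(post, prediction)
```

With these numbers the unnormalized scores are 0.096 for "spam" and 0.03 for "ham", so "spam" is predicted.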
Multinomial Naive Bayes: Feature vectors represent the frequencies with which certain events have been generated by a multinomial distribution. This is the event model typically used for document classification.
Bernoulli Naive Bayes: In the multivariate Bernoulli event model, features are independent booleans (binary variables) describing inputs. Like the multinomial model, this model is popular for document classification tasks, where binary term occurrence features (i.e. whether a word occurs in a document or not) are used rather than term frequencies (i.e. how often a word occurs in the document).
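The difference between the two event models shows up in how a document is turned into a feature vector. A small sketch, assuming a hypothetical three-word vocabulary and a toy document:

```python
from collections import Counter

# Hypothetical vocabulary and document, for illustration only.
vocabulary = ["free", "offer", "meeting"]
document = "free free offer today"

counts = Counter(document.split())

# Multinomial model: how many times each vocabulary word occurs.
multinomial_features = [counts[w] for w in vocabulary]

# Bernoulli model: whether each vocabulary word occurs at all.
bernoulli_features = [1 if counts[w] > 0 else 0 for w in vocabulary]

print(multinomial_features)  # [2, 1, 0]
print(bernoulli_features)    # [1, 1, 0]
```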
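Laplace smoothing:
To avoid assigning a zero probability to a feature value that never appears with a class in the training data, we use smoothing: a small constant (alpha, typically 1) is added to every count, so that no estimated probability is exactly zero.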
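As a rough sketch of the idea, the smoothed estimate adds alpha to the feature count and alpha times the number of possible feature values to the denominator. The function name and the example counts below are assumptions chosen for illustration.

```python
def smoothed_likelihood(count, total, n_features, alpha=1.0):
    """Additive (Laplace) smoothing of P(feature | class).

    count      : times the feature was seen with this class
    total      : total feature count for this class
    n_features : number of distinct features (e.g. vocabulary size)
    alpha      : smoothing strength; alpha=1 is classic Laplace smoothing
    """
    return (count + alpha) / (total + alpha * n_features)

# A word never seen with this class still gets a small non-zero probability.
unseen = smoothed_likelihood(count=0, total=10, n_features=5)
print(unseen)  # 1/15

# Without smoothing (alpha=0) the same word would get probability 0.
print(smoothed_likelihood(count=0, total=10, n_features=5, alpha=0.0))
```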
Why do we use log probabilities?
Log probabilities are used for numerical stability. All probabilities lie between 0 and 1, so multiplying many of them together produces an extremely small number. Floating-point rounding at each step introduces error that accumulates, and the product can even underflow to zero.
To avoid this problem we take the log of each probability and add the logs instead of multiplying the probabilities, which gives us stable numbers to compare.
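A quick way to see the problem is to multiply many small probabilities directly and compare that with summing their logs. The list of one hundred probabilities of 1e-5 is an arbitrary example.

```python
import math

# Arbitrary example: 100 features, each with likelihood 1e-5.
probs = [1e-5] * 100

# Direct product: the true value is 1e-500, far below the smallest
# positive double, so the result underflows to 0.0.
product = 1.0
for p in probs:
    product *= p

# Sum of logs: stays a perfectly ordinary negative number.
log_sum = sum(math.log(p) for p in probs)

print(product)  # 0.0
print(log_sum)  # about -1151.29
```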
If we use Laplace smoothing, why do we also need log?
log(0) is undefined, so smoothing is needed first to make sure no probability is zero. Log is a monotonically increasing function, so if log(class0_prob) > log(class1_prob) then class0_prob > class1_prob. We therefore predict the class with the highest log probability, which is also the class with the highest probability.
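Because log preserves ordering, picking the best class in log space gives the same answer as picking it in probability space. A small sketch with made-up priors and likelihoods (all non-zero, as they would be after smoothing):

```python
import math

# Hypothetical, already-smoothed values for two classes.
priors = {"class0": 0.5, "class1": 0.5}
likelihoods = {"class0": [0.2, 0.6], "class1": [0.1, 0.3]}

def prob_score(c):
    # Score in ordinary probability space: a product.
    return priors[c] * math.prod(likelihoods[c])

def log_score(c):
    # Same score in log space: the product becomes a sum.
    return math.log(priors[c]) + sum(math.log(p) for p in likelihoods[c])

best_by_prob = max(priors, key=prob_score)
best_by_log = max(priors, key=log_score)
print(best_by_prob, best_by_log)  # same class either way
```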
What is the use of log?
Using log probabilities means representing probabilities in logarithmic space instead of the standard [0, 1] interval. This improves numerical stability when the probabilities are very small.¹
What is a hyperparameter?
A hyperparameter is a setting that is chosen before training rather than learned from the data. Hyperparameter tuning adjusts these settings to balance the bias-variance tradeoff (i.e., to avoid underfitting or overfitting). It is done using a cross-validation dataset, and the values that give the most skillful predictions are kept. Examples are the value of k in k-nearest neighbor classification and alpha (the smoothing strength) in Naive Bayes.
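To show what tuning alpha can look like, here is a minimal sketch of a tiny multinomial Naive Bayes written from scratch and scored on a held-out validation set. The helper names (train, predict, accuracy), the toy documents, and the candidate alpha values are all assumptions for illustration; this is not a library API, and a single validation split stands in for full cross-validation.

```python
import math
from collections import Counter

def train(docs, labels, alpha):
    """Fit log priors and smoothed log likelihoods for each class."""
    vocab = {w for d in docs for w in d.split()}
    model = {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        counts = Counter(w for d in class_docs for w in d.split())
        total = sum(counts.values())
        log_prior = math.log(len(class_docs) / len(docs))
        # Laplace smoothing with strength alpha keeps every likelihood > 0.
        log_like = {
            w: math.log((counts[w] + alpha) / (total + alpha * len(vocab)))
            for w in vocab
        }
        model[c] = (log_prior, log_like)
    return model

def predict(model, doc):
    def score(c):
        log_prior, log_like = model[c]
        # Words outside the training vocabulary are simply ignored.
        return log_prior + sum(log_like[w] for w in doc.split() if w in log_like)
    return max(model, key=score)

def accuracy(model, docs, labels):
    hits = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return hits / len(docs)

# Toy data, made up for this example.
train_docs = ["free offer now", "free prize offer",
              "meeting at noon", "project meeting notes"]
train_labels = ["spam", "spam", "ham", "ham"]
val_docs = ["free prize now", "notes for meeting"]
val_labels = ["spam", "ham"]

# Try each candidate alpha and keep the one with the best validation accuracy.
alphas = [0.1, 1.0, 10.0]
scores = {a: accuracy(train(train_docs, train_labels, a), val_docs, val_labels)
          for a in alphas}
best_alpha = max(scores, key=scores.get)
print(scores, best_alpha)
```

On data this small the candidates are unlikely to differ in accuracy; on a real dataset the scores usually vary with alpha, and averaging over several cross-validation folds gives a more reliable choice than a single split.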