Naïve Bayes Machine Learning Classification with R Programming: A case study of binary data sets
International Journal on Orange Technologies (IJOT)
Field | Value |
Title |
Naïve Bayes Machine Learning Classification with R Programming: A case study of binary data sets
|
|
Creator |
Yagyanath Rimal
|
|
Subject |
Naive Bayes Classifier
Supervised Learning |
|
Description |
This analytical review paper explains the Naïve Bayes machine learning technique, a simple probabilistic classifier based on Bayes' theorem that assumes independence between features, using R programming. There is still a large gap in knowing which algorithm is suitable for data analysis when many categorical variables must be used to predict a value in research data. To implement Naïve Bayes classification, the model is trained on the training data set and then used to make predictions on the test data set. The distinctive feature of the technique is that it takes in new information and tries to make a better forecast by considering the new evidence when the input variables are largely categorical. This is quite similar to how the human mind works when selecting a judgement from among alternative choices, much as the neuronal network of the human brain does. Here the researcher uses the binary.csv data set, an educational data set of 400 observations and 4 attributes. Admit is the dependent variable; the GRE score (gre), GPA (gpa), and rank of previous grade (rank) ultimately determine whether a student will be admitted to the next program. Initially, the gre and gpa variables have a 0.36 percent significance in their association with the categorical variable rank. The box plots and density plots show that the data overlap between the admitted and not-admitted groups. The Naïve Bayes classification model classifies the data with 0.68 percent as not admitted, whereas 0.31 percent were admitted. The confusion matrix and the predictions were calculated with 0.68 percent accuracy at a 95 percent confidence interval. Similarly, the training accuracy increased from 29 percent to 32 percent when the Naïve Bayes algorithm was run with usekernel = TRUE, which ultimately decreases misclassification errors in the binary data set.
|
|
Publisher |
Research Parks Publishing LLC
|
|
Date |
2019-11-29
|
|
Type |
info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion Peer-reviewed Article |
|
Format |
application/pdf
|
|
Identifier |
https://journals.researchparks.org/index.php/IJOT/article/view/358
|
|
Source |
International Journal on Orange Technologies; Vol. 1 No. 2 (2019): IJOT; 27-34
2615-8140 2615-7071 |
|
Language |
eng
|
|
Relation |
https://journals.researchparks.org/index.php/IJOT/article/view/358/347
|
|
Rights |
Copyright (c) 2020 International Journal on Orange Technologies
|
|
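Note on the Description field: the independence assumption mentioned in the abstract can be written out for this data set roughly as follows. This is only an illustrative sketch. The symbol $y$ (the admit outcome, 0 or 1) and the use of the column names gre, gpa, and rank as variables are assumptions made here; the paper's own notation is not given in this record.

```latex
P(y \mid \mathrm{gre}, \mathrm{gpa}, \mathrm{rank})
  = \frac{P(y)\, P(\mathrm{gre} \mid y)\, P(\mathrm{gpa} \mid y)\, P(\mathrm{rank} \mid y)}
         {\sum_{y' \in \{0,1\}} P(y')\, P(\mathrm{gre} \mid y')\, P(\mathrm{gpa} \mid y')\, P(\mathrm{rank} \mid y')},
\qquad
\hat{y} = \arg\max_{y \in \{0,1\}} P(y)\, P(\mathrm{gre} \mid y)\, P(\mathrm{gpa} \mid y)\, P(\mathrm{rank} \mid y)
```

The numerator factorizes only because the predictors are treated as conditionally independent given $y$. The class prior $P(y)$ is estimated from the class frequencies in the training data, and each conditional term is estimated separately per class.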