PREDICTION USING RAPIDMINER

RapidMiner

In this post I will use RapidMiner to build prediction models on the data pemilu (election data) dataset.
RapidMiner is a software platform developed by the company of the same name that provides an integrated environment for machine learning, data mining, text mining, predictive analytics and business analytics.
I will use three algorithms: Decision Tree (C4.5), Naïve Bayes (NB), and K-Nearest Neighbor (K-NN).

1. Decision Tree

[Figure: the decision tree]
[Figure: the decision tree description]

[Figure: the performance vector]


The Decision Tree has a reported accuracy of 96.28%. Of the examples predicted TIDAK (no), 362 are truly TIDAK and 14 are truly YA (yes). Of the examples predicted YA, 15 are truly TIDAK and 34 are truly YA.
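To make the link between these counts and an accuracy figure concrete, here is a minimal Python sketch. Accuracy is the number of correct predictions divided by the total number of examples. The dictionary layout and the name `accuracy` are my own illustrative choices, not anything taken from RapidMiner.

```python
# Confusion-matrix counts quoted above for the Decision Tree.
# Keys are (predicted, actual) label pairs. This layout is an
# illustrative assumption, not RapidMiner's own data structure.
decision_tree = {
    ("TIDAK", "TIDAK"): 362,
    ("TIDAK", "YA"): 14,
    ("YA", "TIDAK"): 15,
    ("YA", "YA"): 34,
}


def accuracy(confusion):
    """Fraction of examples whose predicted label equals the actual label."""
    correct = sum(n for (pred, actual), n in confusion.items() if pred == actual)
    total = sum(confusion.values())
    return correct / total


print(f"{accuracy(decision_tree):.2%}")  # prints 93.18%
```

With the counts exactly as quoted, this gives (362 + 34) / 425, or about 93.18%, which is not the same as the reported 96.28%. The two figures may come from different runs or validation settings, so the counts are best read as illustrative.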

2. Naive Bayes (NB)

 A Naive Bayes classifier is a simple probabilistic classifier based on applying Bayes’ theorem (from Bayesian statistics) with strong (naive) independence assumptions. A more descriptive term for the underlying probability model would be ‘independent feature model’. In simple terms, a Naive Bayes classifier assumes that the presence (or absence) of a particular feature of a class (i.e. attribute) is unrelated to the presence (or absence) of any other feature.
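The independence assumption described above can be sketched in a few lines of Python. This is a toy categorical Naive Bayes, not RapidMiner's implementation. The function names, the add-one (Laplace) smoothing, and the tiny example data are all assumptions made for illustration. The score of each class is the log of its prior plus the sum of the logs of the per-feature probabilities, which is what treating the features as independent amounts to.

```python
from collections import Counter, defaultdict
import math


def train(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values_seen = defaultdict(set)       # feature index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Return the class with the highest log posterior score."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        # log P(class) + sum over features of log P(value | class)
        score = math.log(count / total)
        for i, value in enumerate(row):
            # Add-one smoothing (an assumption of this sketch) avoids log(0).
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(values_seen[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: two categorical attributes per example.
rows = [("high", "big"), ("high", "small"), ("low", "big"),
        ("low", "small"), ("low", "small")]
labels = ["YA", "YA", "TIDAK", "TIDAK", "TIDAK"]

model = train(rows, labels)
print(predict(model, ("high", "big")))   # prints YA
print(predict(model, ("low", "small")))  # prints TIDAK
```

Working in log space keeps the product of many small probabilities from underflowing, which matters once there are more than a handful of attributes.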

[Figure: the Naive Bayes simple distribution]
[Figure: the accuracy of Naive Bayes]
Naive Bayes has a reported accuracy of 94.80%. Of the examples predicted TIDAK, 310 are truly TIDAK and 17 are truly YA. Of the examples predicted YA, 67 are truly TIDAK and 31 are truly YA.


3. K-Nearest Neighbor (K-NN)

The K-Nearest Neighbor model is generated from the input ExampleSet, and it can be a classification or regression model depending on that ExampleSet. The k-Nearest Neighbor algorithm is based on learning by analogy, that is, by comparing a given test example with training examples that are similar to it. The training examples are described by n attributes, so each example represents a point in an n-dimensional space.
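The "learning by analogy" idea can be sketched as follows. This is a minimal classification example, assuming Euclidean distance and a simple majority vote. The function name `knn_predict`, the default `k=3`, and the toy points are hypothetical and are not taken from RapidMiner.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Each example is a point in n-dimensional space. math.dist gives the
    # Euclidean distance between two such points (Python 3.8+).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two numeric attributes per example.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["TIDAK", "TIDAK", "TIDAK", "YA", "YA", "YA"]

print(knn_predict(points, labels, (1.1, 1.0)))  # prints TIDAK
print(knn_predict(points, labels, (5.0, 4.8)))  # prints YA
```

For a regression model, the same neighbor search would be followed by averaging the neighbors' numeric values instead of taking a vote. Using an odd k with two classes avoids tied votes.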

[Figure: the accuracy of K-NN]


K-NN has a reported accuracy of 93.47%. Of the examples predicted TIDAK, 358 are truly TIDAK and 25 are truly YA. Of the examples predicted YA, 19 are truly TIDAK and 23 are truly YA.

Conclusion
Of the three modeling methods above, the Decision Tree has the highest reported accuracy at 96.28%, Naive Bayes is second at 94.80%, and K-NN is lowest at 93.47%.
In my opinion, the Decision Tree comes closest to the truth, and I prefer it because it is easier to analyze and its results are easy to read.
