Baseline Run: First, we used the training set of 1.6 million tweets from the sentiment140 dataset to train a Naive Bayes classifier for classifying the sentiment of the test set; we considered this as our baseline. In this run, we used only the bag-of-words (BoW) feature and did not perform any text preprocessing. Run1: In Run1, we considered a setup similar to the Baseline Run, but here we incorporated our text preprocessing strategy to improve the classification results. Run2: In Run2, we trained the Naive Bayes classifier with several sentiment lexicons instead of the large training set of 1.6 million tweets. We also incorporated our text preprocessing strategy.
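The Baseline Run can be sketched roughly as follows. This is a minimal illustration only: it uses scikit-learn's `CountVectorizer` and `MultinomialNB` as stand-ins for the actual classifier, and toy tweets in place of the 1.6 million sentiment140 training tweets.

```python
# Minimal sketch of the baseline setup: a Naive Bayes classifier trained on
# raw bag-of-words (BoW) counts, with no text preprocessing.
# Toy data stands in for the sentiment140 training set.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_tweets = ["i love this", "this is awful", "great day", "so bad"]
train_labels = ["positive", "negative", "positive", "negative"]

vectorizer = CountVectorizer()            # bag-of-words features
X_train = vectorizer.fit_transform(train_tweets)

clf = MultinomialNB()
clf.fit(X_train, train_labels)

X_test = vectorizer.transform(["what a great movie"])
print(clf.predict(X_test)[0])
```

Run1 would differ only in passing each tweet through the preprocessing pipeline before vectorizing; Run2 would replace the training corpus with entries drawn from the sentiment lexicons.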
The strategy is that first our rule-based classifier is applied to classify the tweet sentiment as positive, negative, or unknown. Since our goal is to classify tweet sentiment into only the positive or negative class, for the tweets labeled as unknown by the rule-based classifier we take the predictions of the Naive Bayes classifier as the final labels.
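This cascade can be expressed in a few lines. The two classifiers below are hypothetical placeholders (the real rule-based classifier and trained Naive Bayes model are described elsewhere in the thesis); only the combination logic is the point here.

```python
# Sketch of the combination strategy: the rule-based classifier decides first,
# and the Naive Bayes prediction is used only when the rules return "unknown".

def rule_based_classify(tweet):
    # Toy stand-in for the proposed rule-based classifier.
    if "love" in tweet:
        return "positive"
    if "hate" in tweet:
        return "negative"
    return "unknown"

def naive_bayes_classify(tweet):
    # Stand-in for the trained Naive Bayes classifier.
    return "negative"

def combined_classify(tweet):
    label = rule_based_classify(tweet)
    if label == "unknown":
        # Fall back to Naive Bayes for tweets the rules cannot decide.
        label = naive_bayes_classify(tweet)
    return label

print(combined_classify("i love mondays"))   # rule fires
print(combined_classify("meh, whatever"))    # falls back to Naive Bayes
```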
Run4: In this run, we combined our proposed rule-based classifier with the setup of Run2 to improve classification performance. Here, we used the same combination strategy already described in the experimental setup of Run3.
Run5: In this run, we used the training set of 1.6 million tweets from the sentiment140 dataset to train the multiclass SVM classifier from Cornell University [67] for classifying the sentiment of the test set. In this run, we used our preprocessing strategy and the bag-of-words (BoW) feature. For feature weighting, we used the TF-IDF weighting scheme.
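The Run5 pipeline, TF-IDF-weighted BoW features feeding an SVM, can be approximated as below. Note this uses scikit-learn's `TfidfVectorizer` and `LinearSVC` as rough stand-ins for the Cornell multiclass SVM [67], again on toy data.

```python
# Sketch of Run5: bag-of-words features with TF-IDF weighting, classified by a
# linear SVM. LinearSVC substitutes for the Cornell multiclass SVM here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

train_tweets = ["i love this", "this is awful", "great day", "so bad"]
train_labels = ["positive", "negative", "positive", "negative"]

vectorizer = TfidfVectorizer()            # BoW with TF-IDF weighting
X_train = vectorizer.fit_transform(train_tweets)

svm = LinearSVC()
svm.fit(X_train, train_labels)

print(svm.predict(vectorizer.transform(["great stuff"]))[0])
```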
Run8: In Run8, we trained Weka's [69] multinomial Naive Bayes classifier with our selected 35 features. Then, we combined it with our proposed rule-based classifier to improve classification performance. Here, we also used the same combination strategy already described in the experimental setup of Run3.
Run9: In Run9, first our rule-based classifier is applied to classify the tweet sentiment as positive, negative, or unknown. For the tweets labeled as unknown by the rule-based classifier, we take the majority-vote prediction from the several classifiers stated below:
Probabilistic Naive Bayes Classifier: Trained with the sentiment140 dataset. (BoW feature)
Probabilistic Naive Bayes Classifier: Trained with our combined sentiment lexicons. (BoW feature)
Probabilistic Multiclass SVM Classifier: Trained with the sentiment140 dataset. (BoW feature with TF-IDF weighting scheme)
Probabilistic SMO Classifier: Trained with the sentiment140 dataset. (Selected 35 features)
Probabilistic Naive Bayes Multinomial Classifier: Trained with the sentiment140 dataset. (Selected 35 features)
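The fallback step of Run9 can be sketched as a simple majority vote over the labels produced by the five classifiers above. The vote values here are placeholders, not real model outputs.

```python
# Sketch of the Run9 fallback: for tweets the rule-based classifier marks
# "unknown", the final label is the majority vote over the five classifiers.
from collections import Counter

def majority_vote(predictions):
    # predictions: one label per classifier, e.g. five labels for Run9
    return Counter(predictions).most_common(1)[0][0]

votes = ["positive", "negative", "positive", "positive", "negative"]
print(majority_vote(votes))  # three of five classifiers vote positive
```

With an odd number of voters (five here), a positive/negative tie cannot occur, which is presumably why an odd-sized ensemble was chosen.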