Sebastián Sarasti - Data Scientist

Data information:

The extracted data was cleaned so that only the relevant information was kept, and it was stored in an unstructured format: one JSON file per candidate.
The JSON files were then used to calculate the metrics, and the results were saved as CSV files.
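
The JSON-to-CSV step could look roughly like the sketch below. The original code is not public, so the file layout (one `*.json` file per candidate in a folder), the field names (`posts`, `date`, `likes`, `comments`, `shares`), and the helper name `summarize_candidates` are all assumptions made for illustration.

```python
import csv
import json
from pathlib import Path


def summarize_candidates(json_dir, csv_path):
    """Read one JSON file per candidate and write per-post counts to a CSV file.

    Assumed (hypothetical) layout of each file:
    {"posts": [{"date": ..., "likes": ..., "comments": ..., "shares": ...}]}
    """
    rows = []
    for json_file in sorted(Path(json_dir).glob("*.json")):
        with open(json_file, encoding="utf-8") as fh:
            data = json.load(fh)
        for post in data.get("posts", []):
            rows.append({
                "candidate": json_file.stem,  # file name used as the candidate label
                "date": post.get("date", ""),
                "likes": post.get("likes", 0),
                "comments": post.get("comments", 0),
                "shares": post.get("shares", 0),
            })

    fieldnames = ["candidate", "date", "likes", "comments", "shares"]
    with open(csv_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)  # number of posts written
```

Missing fields default to zero or an empty string, so an incomplete post record does not stop the run.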

Metrics used

Interaction Index (II). This metric measures how popular a publication is by combining its number of comments, shares, and likes.
Positivity (POS). This metric measures the positivity of the comments left on the posts.
Negativity (NEG). This metric measures the negativity of the comments left on the posts.
Neutrality (NEU). This metric measures the neutrality of the comments left on the posts.
Word frequency. This metric shows the most frequently used words in the texts.
Bigram frequency. This metric shows the most frequent two-word phrases in the texts.
Trigram frequency. This metric shows the most frequent three-word phrases in the texts.
Collocations. This metric shows the word combinations that appear together most often in the texts.
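
As a minimal sketch of two of the metrics above: the text does not give the exact formula for the Interaction Index, so the weighted sum used here (equal weights by default) is an assumption, and the function names `interaction_index` and `ngram_frequency` are hypothetical. The n-gram counter covers word, bigram, and trigram frequency through its `n` parameter, using a simple regular-expression tokenizer.

```python
import re
from collections import Counter


def interaction_index(likes, comments, shares, weights=(1.0, 1.0, 1.0)):
    """Combine likes, comments, and shares into a single score.

    Assumption: a plain weighted sum; the real formula is not given in the text.
    """
    w_likes, w_comments, w_shares = weights
    return w_likes * likes + w_comments * comments + w_shares * shares


def ngram_frequency(texts, n=1, top=10):
    """Return the `top` most frequent n-grams (n=1 words, n=2 bigrams, n=3 trigrams)."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"\w+", text.lower())
        # Slide a window of size n over the tokens of each text.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts.most_common(top)
```

For example, `ngram_frequency(comments, n=2)` would list the most common two-word phrases in a list of comment strings. N-grams are counted within each text only, so a phrase never spans two comments.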

Results

The following plot shows how popularity evolves for each candidate. C-2 is noticeably less popular than C-1 and C-3.

Candidates C-3 and C-1 have similar levels of popularity. In the last month, October, C-3 was roughly 20% more popular than C-1.
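
A comparison such as "roughly 20% more popular" can be obtained by averaging the popularity metric per candidate and month and then taking a relative difference. The sketch below assumes records of the form (candidate, ISO date string, value); that record shape and both function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def monthly_mean(records):
    """Average a metric per (candidate, month).

    `records` is an iterable of (candidate, "YYYY-MM-DD", value) tuples.
    """
    buckets = defaultdict(list)
    for candidate, date, value in records:
        buckets[(candidate, date[:7])].append(value)  # "YYYY-MM" is the month key
    return {key: mean(values) for key, values in buckets.items()}


def relative_difference(a, b):
    """Return how much larger `a` is than `b`, as a fraction of `b`."""
    return (a - b) / b
```

With this definition, a result of about 0.20 for the October means of C-3 and C-1 corresponds to the statement above.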

The following plot shows the mean positive sentiment over time.

C-1 shows high levels of positivity; one possible explanation is that C-1 uses bots to maintain a good image on social media.

Based on the previous results, the distribution of positivity during October was analyzed for C-1 and C-3.

The probability density function shows that almost all of C-1's values sit at high levels of positivity. This is not plausible, because a natural audience would also express some negativity.

To support this idea, the distribution of negativity was also plotted.

The probability density function shows that almost all of C-1's values sit at extremely low levels of negativity, while for C-3 negativity is more uniformly distributed.
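
The same comparison can be checked numerically without a plot. The sketch below uses a coarse histogram as a stand-in for the probability density function, and it assumes that each comment has a sentiment score between 0 and 1; both the score range and the function names are assumptions, not the method used for the plots.

```python
def histogram_shares(scores, bins=5):
    """Split scores in [0, 1] into equal-width bins and return the share per bin."""
    counts = [0] * bins
    for s in scores:
        # Clamp the index so that a score of exactly 1.0 falls in the last bin.
        idx = min(int(s * bins), bins - 1)
        counts[idx] += 1
    total = len(scores)
    return [c / total for c in counts] if total else [0.0] * bins


def share_above(scores, threshold=0.8):
    """Fraction of scores at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores) if scores else 0.0
```

A candidate whose positivity scores fall almost entirely in the last bin, and whose negativity scores fall almost entirely in the first, would match the pattern described for C-1, while a flatter set of shares would match C-3.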

Conclusions:

The current models produced a good fit between the real and predicted spectra at some positions. However, at other positions the two spectra do not match in any region.
This is a first step toward predicting a radiation detector response; a larger architecture could be a solution, given a bigger project and more resources.

From the Natural Language Processing analysis, the results show that C-1 and C-2 have only positive comments, so their audience does not seem to be natural.

The results for C-3 show that using phrases such as "new people", "new generation", and "god" increases popularity. However, several phrases, such as "buying people" and "crazy richs", show that people are distrustful.

Public code is not available.

Improving Marketing for Politic Campaigns

