NLTK

Improving Marketing for Political Campaigns

Introduction

Data-driven decision making helps organizations improve and save money. Many politicians in developing countries overlook the data available on social media, which reveals what people think about a candidate. This data can help improve the candidate's public image and win more votes.

This work used an API from the website Apify to extract information from Facebook. The extracted data comprised post descriptions, likes, comments, and shares. Three candidates for a mayoral position were evaluated from April to November of 2022. Their identities are hidden throughout the analysis to protect the client's privacy.

The goal was to identify outliers of high popularity, positivity, and negativity. The results show what worked in the campaign and what did not.

Data information:
  1. The extracted data was cleaned, and only the relevant information was kept in an unstructured format: one JSON file per candidate.
  2. The JSON files were used to calculate the metrics, and the results were saved in CSV files.
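The two steps above could be sketched as follows. This is a minimal illustration and not the project's code: the field names (`description`, `likes`, `shares`, `comments`), the assumption that comments arrive as a list of strings, and the helper names `clean_post`, `save_clean_json`, and `json_to_csv` are all hypothetical, since the layout of the scraper's export is not given here.

```python
import csv
import json
from pathlib import Path


def clean_post(raw):
    """Keep only the fields used later; missing values fall back to empty ones."""
    comments = raw.get("comments") or []
    return {
        "description": (raw.get("description") or "").strip(),
        "likes": int(raw.get("likes") or 0),
        "shares": int(raw.get("shares") or 0),
        # Drop empty or non-text comments (assumes comments are plain strings).
        "comments": [c.strip() for c in comments if isinstance(c, str) and c.strip()],
    }


def save_clean_json(raw_posts, json_path):
    """Step 1: write the cleaned posts of one candidate to a JSON file."""
    cleaned = [clean_post(post) for post in raw_posts]
    Path(json_path).write_text(
        json.dumps(cleaned, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return cleaned


def json_to_csv(json_path, csv_path):
    """Step 2: read the JSON file and save one row of counts per post."""
    posts = json.loads(Path(json_path).read_text(encoding="utf-8"))
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=["post", "likes", "shares", "n_comments"]
        )
        writer.writeheader()
        for index, post in enumerate(posts):
            writer.writerow({
                "post": index,
                "likes": post["likes"],
                "shares": post["shares"],
                "n_comments": len(post["comments"]),
            })
    return len(posts)
```

Keeping the cleaned JSON separate from the CSV of metrics means the metrics can be recomputed later without calling the scraper again.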
Metrics used
  • Interaction Index (II). This metric measures how popular a publication is by combining its number of comments, shares, and likes.
  • Positivity (POS). This metric measures the positivity of the comments left on the posts.
  • Negativity (NEG). This metric measures the negativity of the comments left on the posts.
  • Neutrality (NEU). This metric measures the neutrality of the comments left on the posts.
  • Word frequency. This metric shows the most used words in the texts.
  • Bigram frequency. This metric shows the most frequent two-word phrases in the texts.
  • Trigram frequency. This metric shows the most frequent three-word phrases in the texts.
  • Collocations. This metric shows the word combinations that appear together most often in the texts.
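As a rough illustration of the count-based metrics in this list, the sketch below uses only the Python standard library. The exact formula of the Interaction Index is not stated here, so the weighted sum in `interaction_index` is an assumption, and `tokenize` and `ngram_counts` are hypothetical helpers. NLTK offers equivalents such as `nltk.ngrams` and `nltk.FreqDist`, which may be what the project used.

```python
import re
from collections import Counter


def interaction_index(likes, comments, shares, weights=(1.0, 1.0, 1.0)):
    """ASSUMED form: a weighted sum of the three counts, not the author's formula."""
    w_likes, w_comments, w_shares = weights
    return w_likes * likes + w_comments * comments + w_shares * shares


def tokenize(text):
    """Lowercase the text and keep runs of letters, accented ones included."""
    return re.findall(r"[^\W\d_]+", text.lower())


def ngram_counts(texts, n):
    """Count n-grams (n=1 words, n=2 bigrams, n=3 trigrams) across all texts."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # Zipping n shifted copies of the token list yields consecutive n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts
```

For example, `ngram_counts(comments, 2).most_common(10)` would list the ten most frequent bigrams in a list of comment strings. N-grams are counted per text, so no phrase spans two different comments.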
Results

    The following plot shows how popularity evolved for each candidate. C-2 is noticeably less popular than C-1 and C-3.

    C-3 and C-1 have similar levels of popularity. In the last month, October, C-3 was roughly 20% more popular than C-1.

    Popularity over time
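A comparison such as "roughly 20% more popular" can be obtained by averaging the popularity metric per candidate and month and then taking a relative difference. The sketch below assumes records of the form (candidate, month, value); the helper names are hypothetical and the way the project aggregated its data is not documented here.

```python
from collections import defaultdict
from statistics import mean


def monthly_mean(records):
    """Average a metric per (candidate, month).

    records: iterable of (candidate, "YYYY-MM", value) tuples (assumed layout).
    """
    buckets = defaultdict(list)
    for candidate, month, value in records:
        buckets[(candidate, month)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


def relative_difference(a, b):
    """How much larger a is than b, as a fraction of b."""
    return (a - b) / b
```

With monthly means of 120 and 100 (illustrative numbers, not the project's data), `relative_difference(120, 100)` gives 0.2, that is, 20% more.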

    The following plot shows the mean positive sentiment over time.

    C-1 shows high levels of positivity, which may indicate that C-1 uses bots to maintain a good image on social media.

    Positivity over time
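One way to obtain a mean sentiment per post is to score every comment and average the scores. The sketch below keeps the scorer pluggable, because the sentiment model used in this project is not named here. `toy_scorer` is a deliberately tiny stand-in with a made-up word list, used only so the example runs on its own.

```python
from statistics import mean

# Made-up word lists, for illustration only.
POSITIVE = {"good", "great", "support"}
NEGATIVE = {"bad", "corrupt", "liar"}


def toy_scorer(text):
    """Stand-in scorer: share of positive, negative, and other words."""
    words = text.lower().split()
    total = len(words) or 1
    pos = sum(word in POSITIVE for word in words) / total
    neg = sum(word in NEGATIVE for word in words) / total
    return {"pos": pos, "neg": neg, "neu": 1.0 - pos - neg}


def mean_sentiment(comments, scorer):
    """Average per-comment scores into POS, NEG, and NEU for one post."""
    scores = [scorer(comment) for comment in comments if comment.strip()]
    if not scores:
        return None  # no usable comments, so no sentiment to report
    return {key: mean(score[key] for score in scores) for key in ("pos", "neg", "neu")}


# With NLTK installed and its VADER lexicon downloaded, a real scorer could be:
#   from nltk.sentiment.vader import SentimentIntensityAnalyzer
#   scorer = SentimentIntensityAnalyzer().polarity_scores
```

VADER's `polarity_scores` returns a dictionary with `neg`, `neu`, `pos`, and `compound` keys, so it fits this interface. Its lexicon is English, so comments in another language would need translation or a different model.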

    Based on the previous results, the distribution of positivity during October was analyzed for C-1 and C-3.

    The probability density function shows that almost all of C-1's values lie at high levels of positivity. This is not plausible, because a natural audience also expresses some negativity.

    Distribution of positivity during October
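It is not stated how the density curves were estimated. A common choice is a Gaussian kernel density estimate, sketched below with the standard library only. The function name `gaussian_kde_fn` and the default bandwidth (the normal reference rule of thumb, 1.06 · σ · n^(-1/5)) are assumptions made for this illustration.

```python
import math
from statistics import stdev


def gaussian_kde_fn(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian KDE of the samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if bandwidth is None:
        # Normal reference rule of thumb; an assumption, not the project's choice.
        bandwidth = 1.06 * stdev(samples) * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of Gaussian bumps centered on each sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density
```

Evaluating the returned function on a grid of values between 0 and 1 gives a curve to plot. A curve concentrated near 1 for positivity would correspond to the pattern described for C-1.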

    To support the previous idea, the distribution of negativity was also plotted.

    The probability density function shows that almost all of C-1's values lie at extremely low levels of negativity, while for C-3 negativity is more uniformly distributed.

    Distribution of negativity during October
Conclusions:

The natural language processing results show that C-1 and C-2 receive only positive comments, so their audience does not appear to be organic.

The results for C-3 show that phrases such as "new people", "new generation", and "god" increase popularity. However, several phrases, such as "buying people" and "crazy rich", reveal distrust among people.

Public code is not available.

Sebastián Sarasti


© Sebastián Sarasti Zambonino. All Rights Reserved.


Edited by Sebastián Sarasti and Angel Bastidas