Effect of COVID-19 on liver cancer detection

Introduction

COVID-19 was a global pandemic that affected many sectors worldwide. Services such as transportation, food, and health care in particular had to change how they traditionally operated in order to meet the new safety standards.

The health system was overloaded because it had to treat COVID-19 patients alongside its regular patients. Many people also preferred to stay at home rather than go to the hospital, because the risk of contracting COVID-19 was very high.

This lack of medical attention likely increased the incidence and severity of other diseases. The goal of this work was to analyze the influence of COVID-19 on the detection of liver cancer.

Data information:
  1. The data comes from the following Kaggle repository: https://www.kaggle.com/datasets/fedesoriano/covid19-effect-on-liver-cancer-prediction-dataset
  2. 27 variables were considered in the analysis.
  3. Two main labels were available: pre-pandemic time and pandemic time.
Data Treatment Process
  • The number and percentage of null values were computed for each variable.
  • Variables with more than 75% null values were dropped from the analysis, since they were considered garbage data.
  • Missing values in the remaining variables were filled in according to their type: numerical variables with the mean, and categorical variables with a K-Nearest Neighbors imputer.
  • The categorical variables were converted from text to numbers.
  • The dataset was checked for class imbalance. Roughly 60% of the data come from pre-pandemic time and 40% from pandemic time, so it was treated as a balanced dataset.
  • The linear correlation coefficient between the variables was computed.
  • A scatter plot of all variables was generated with a hue to rule out Simpson's paradox. No pattern was found in the data.
  • A one-dimensional analysis of the variables was carried out, covering tumor size, mode of presentation, patient age, gender, underlying cirrhosis, and previously known cirrhosis.
  • A bidimensional analysis was carried out. The pairs of variables analyzed were mode of presentation vs age, treatment_grps vs survival_fromMDM, etiology vs age, etiology vs size, size vs alive_dead, and size vs age.
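
The null-handling, imputation, and encoding steps above can be sketched in Python with pandas and scikit-learn. This is a minimal sketch, not the code behind the analysis: the function names `null_report` and `clean`, the parameters `max_null_pct` and `n_neighbors`, and the default of 5 neighbors are assumptions. Because scikit-learn's `KNNImputer` only accepts numeric input, the sketch encodes categorical text as integer codes before imputing and then rounds the imputed values back to valid codes; the original analysis may have ordered these steps differently.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer


def null_report(df: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of null values for each column."""
    n_null = df.isna().sum()
    return pd.DataFrame({"n_null": n_null, "pct_null": 100 * n_null / len(df)})


def clean(df: pd.DataFrame, max_null_pct: float = 75.0,
          n_neighbors: int = 5) -> pd.DataFrame:
    """Drop mostly-empty columns, impute the rest, and encode categories.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    # Drop every column whose share of nulls exceeds the threshold.
    report = null_report(df)
    keep = report.index[report["pct_null"] <= max_null_pct]
    out = df.loc[:, keep].copy()

    num_cols = out.select_dtypes(include="number").columns
    cat_cols = out.columns.difference(num_cols)

    # Numerical columns: fill missing values with the column mean.
    out = out.fillna(out[num_cols].mean())

    # Categorical columns: text -> integer codes, keeping missing values as NaN
    # (pandas marks missing categories with code -1).
    for col in cat_cols:
        codes = out[col].astype("category").cat.codes.astype(float)
        out[col] = codes.replace(-1.0, np.nan)

    # KNN imputation over the now fully numeric frame. Distances are computed
    # on unscaled columns here; scaling first may be preferable in practice.
    imputed = KNNImputer(n_neighbors=n_neighbors).fit_transform(out)
    out = pd.DataFrame(imputed, columns=out.columns, index=out.index)

    # Averages of neighbor codes can be fractional; round to the nearest code.
    for col in cat_cols:
        out[col] = out[col].round()
    return out
```

On the returned frame, `cleaned.corr()` gives the pairwise Pearson correlation coefficients, and `value_counts(normalize=True)` on the period label gives the pre-pandemic/pandemic split used for the balance check.
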
Exploratory Data Analysis Results

One-dimensional Analysis

  • The percentage of symptomatic cases was higher in pandemic time than in pre-pandemic time, likely because people could not attend preventive check-ups.
  • Figure: Mode of presentation
  • During the pandemic, the share of cases among males increased slightly, while the opposite occurred among females.
  • Figure: Gender
  • The number of known cirrhosis cases was lower in pandemic time.
  • Figure: Known cirrhosis
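
Comparisons like the ones above rest on the share of each category within each period. A minimal sketch of that computation with pandas follows; the helper name `share_by_period`, the column names `period` and `mode`, and the toy rows are all invented for illustration and are not taken from the dataset.

```python
import pandas as pd


def share_by_period(df: pd.DataFrame, period_col: str,
                    category_col: str) -> pd.DataFrame:
    """Percentage of each category within each period (each row sums to 100)."""
    return pd.crosstab(df[period_col], df[category_col], normalize="index") * 100


# Toy data, invented for illustration only (not the study's values).
toy_cases = pd.DataFrame({
    "period": ["pre", "pre", "pre", "pre", "pandemic", "pandemic"],
    "mode": ["symptomatic", "not_symptomatic", "not_symptomatic",
             "not_symptomatic", "symptomatic", "not_symptomatic"],
})
shares = share_by_period(toy_cases, "period", "mode")
print(shares)
```

Normalizing by row makes the two periods comparable even though they contain different numbers of patients.
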

Bidimensional Analysis

  • Tumor size vs patient age did not seem to follow any pattern.
  • Figure: Mode of presentation
  • Tumor size appears slightly related to whether the patient was alive or dead. The median tumor size for living patients is roughly 25 mm, while the median for deceased patients is just under 50 mm.
  • Figure: Gender
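
The size vs alive_dead comparison above amounts to a grouped median. A minimal sketch, assuming the columns are named `size` and `alive_dead` as in the variable pairs listed earlier; the helper name `median_size_by_outcome`, the outcome labels, and the toy values are invented for illustration and do not reproduce the study's figures.

```python
import pandas as pd


def median_size_by_outcome(df: pd.DataFrame, size_col: str = "size",
                           outcome_col: str = "alive_dead") -> pd.Series:
    """Median tumor size for each outcome group."""
    return df.groupby(outcome_col)[size_col].median()


# Toy data, invented for illustration only (not the study's values).
toy_tumors = pd.DataFrame({
    "size": [10, 20, 30, 40, 50, 60],
    "alive_dead": ["Alive", "Alive", "Alive", "Dead", "Dead", "Dead"],
})
medians = median_size_by_outcome(toy_tumors)
print(medians)
```

The median is used rather than the mean because tumor sizes are often skewed by a few very large values.
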

© Sebastián Sarasti Zambonino. All Rights Reserved.

Edited by Sebastián Sarasti and Angel Bastidas