
COVID-19 Project Data Science & Analytics

The Problem Statement - In this project, we explore the timeline of cases, deaths, and recoveries in each province of South Africa. We also want to know which age groups and genders are most affected by COVID-19. With this analysis, we aim to highlight the need for South Africans to get vaccinated and to take the necessary steps to prevent the spread of COVID-19. In addition, we aim to predict the expected new cases and deaths for the next six months. We use the South African Coronavirus (COVID-19) dataset available in the GitHub repository created, maintained, and hosted by the Data Science for Social Impact research group, led by Dr. Vukosi Marivate, at the University of Pretoria.

The Solution - We used the Python programming language in a Jupyter notebook for the analysis, and Power BI, Microsoft's business analytics service, to create a dashboard that we then published to our website. We started in Python by exploring the data to get a high-level view of how it is structured. We extracted and cleaned the data, dropping the rows that contained null values, and then plotted histograms to visualize it. Next, we fitted a linear regression model, using the total as the target and the date as the feature, to predict the timeline of cases and deaths, and extracted the predictions for visualization. Finally, we used Power BI to build the dashboard interface for the user.
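The modelling step described above can be sketched roughly as follows. This is a minimal illustration and not the notebook code from the project: it uses only the Python standard library, the helper names (`drop_missing`, `fit_line`, `predict`) are hypothetical, the sample figures are made up rather than taken from the dataset, and it assumes the date is turned into a number with `date.toordinal()` before fitting.

```python
from datetime import date, timedelta


def drop_missing(rows):
    """Keep only (date, total) rows whose total is present (drops null rows)."""
    return [(d, t) for d, t in rows if t is not None]


def fit_line(rows):
    """Ordinary least squares fit of total = slope * day_number + intercept."""
    xs = [d.toordinal() for d, _ in rows]  # date as a numeric feature
    ys = [float(t) for _, t in rows]       # cumulative total as the target
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, day):
    """Predicted total for a given calendar day."""
    slope, intercept = model
    return slope * day.toordinal() + intercept


# Made-up illustrative rows (NOT real figures from the dataset).
rows = [
    (date(2021, 1, 1), 100),
    (date(2021, 1, 2), 110),
    (date(2021, 1, 3), None),  # missing value, dropped before fitting
    (date(2021, 1, 4), 130),
    (date(2021, 1, 5), 140),
]

clean = drop_missing(rows)
model = fit_line(clean)

# Extract predictions for the days after the last observation.
last_day = clean[-1][0]
forecast = [
    (last_day + timedelta(days=k), predict(model, last_day + timedelta(days=k)))
    for k in range(1, 4)
]
for day, total in forecast:
    print(day.isoformat(), round(total, 1))
```

A straight line fitted to cumulative totals can only capture a constant average daily increase, so a six-month forecast from a model like this is best read as a rough trend rather than a precise estimate.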

Client

Hackathon

Role

Data Science

Date

Jun 2023

Deliverables

Data Science & Analytics
