This repository contains code extracts from a data analysis project I worked on as part of the CareerFoundry Data Analytics Bootcamp.
This was the final project of the course and I chose to analyse 40 years of tsunami and earthquake data (1985 - 2024). I was inspired by the recent media coverage of the 20-year anniversary of the 2004 Indian Ocean Earthquake and Tsunami, one of the deadliest natural disasters in recorded history, which killed an estimated 228,000 people across 15 countries. I wanted to understand what features of the earthquake and subsequent tsunami made this particular event so devastating.
You can view the corresponding Tableau Story online. This doesn't include all steps taken during the analysis, only those relevant to the final results.
This repository includes data from various sources, each referenced with the date of generation.
- Source: NOAA NGDC Tsunami Runup Database
- Date Generated: 8 Feb 2025
- Purpose: Contains data on tsunami runup features including location (latitude and longitude), maximum water height (m) and distance from the source (km).
- Source: NOAA NGDC Earthquake Database
- Date Generated: 28 Feb 2025
- Purpose: Contains data on earthquakes that preceded a tsunami, including features such as location (latitude and longitude), depth (km) and magnitude.
- Source: GeoJSON Maps
- Date Generated: 17 Feb 2025
- Purpose: Contains geographic boundary data.
- Source: Japan National Tourism Organization (JNTO) Statistics
- Date Generated: 28 Feb 2025
- Purpose: Provides statistical data on Japan tourism trends including international arrivals.
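As a rough illustration of how the runup data described above might be loaded and restricted to the study period (1985 - 2024), here is a minimal sketch using only the Python standard library. The column names (`year`, `max_water_height_m`, and so on), the `load_runups` helper and the sample rows are all hypothetical placeholders, not the actual NOAA headers or values, and this is not necessarily how the project code does it.

```python
import csv
import io

# Made-up placeholder rows for illustration only; the real NOAA export
# uses different column headers and contains many more fields.
SAMPLE_RUNUPS = """year,latitude,longitude,max_water_height_m,distance_from_source_km
1980,10.0,20.0,1.5,100.0
1990,11.0,21.0,3.0,250.0
2010,12.0,22.0,2.0,80.0
"""


def load_runups(text, start_year=1985, end_year=2024):
    """Parse CSV text and keep rows whose year falls in the study period (inclusive)."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        year = int(row["year"])
        if start_year <= year <= end_year:
            rows.append({
                "year": year,
                "latitude": float(row["latitude"]),
                "longitude": float(row["longitude"]),
                "max_water_height_m": float(row["max_water_height_m"]),
                "distance_from_source_km": float(row["distance_from_source_km"]),
            })
    return rows


runups = load_runups(SAMPLE_RUNUPS)

# Find the runup with the greatest maximum water height in the period.
highest = max(runups, key=lambda r: r["max_water_height_m"])
print(highest["year"], highest["max_water_height_m"])
```

With a real export, the same filter could be applied after mapping the actual column headers to these names.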
The main purpose of this project was to further develop my knowledge of Python as well as the end-to-end data analysis process.