This is my final project for a five-course Coursera certificate on advanced data visualization with R. I chose the US baby names data set because it covers a long time span. I ran several SQL queries in the BigQuery SQL workspace on Google Cloud Platform ...
We investigate a data set of about 5.5 million entries to determine how two user groups of a Chicago-based bike-share company differ in their usage. The goal is to help the company's marketing team maximize profits, specifically by growing annual memberships. The work is in R and uses several core data techniques: combining data sets, data extraction, descriptive statistics, and data visualization.
In this Python notebook, we use data wrangling to improve Titanic survival prediction. We also compare the predictive performance of neural networks with that of other statistical methods.
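
As a rough illustration of the kind of data wrangling that can help here, the sketch below extracts a passenger's title from the name and fills missing ages with the median age for that title. This is an assumption about a typical wrangling step, not necessarily what the notebook does: the helper names `extract_title` and `impute_age_by_title` are hypothetical, and the rows are toy values in the shape of the Titanic data, not real records.

```python
import re
from statistics import median

# Toy rows shaped like Titanic records (illustrative values, not real passengers).
passengers = [
    {"Name": "Smith, Mr. John", "Age": 30.0},
    {"Name": "Smith, Mrs. Anna", "Age": 28.0},
    {"Name": "Jones, Mr. Peter", "Age": None},
    {"Name": "Brown, Miss. Ella", "Age": 4.0},
    {"Name": "Clark, Mr. Adam", "Age": 40.0},
    {"Name": "Davis, Miss. Ruth", "Age": None},
]


def extract_title(name):
    """Return the honorific between the comma and the first period, e.g. 'Mr'."""
    match = re.search(r",\s*([^.]+)\.", name)
    return match.group(1).strip() if match else "Unknown"


def impute_age_by_title(rows):
    """Add a Title field and fill each missing Age with the median age for that title."""
    for row in rows:
        row["Title"] = extract_title(row["Name"])

    # Collect the known ages per title.
    ages_by_title = {}
    for row in rows:
        if row["Age"] is not None:
            ages_by_title.setdefault(row["Title"], []).append(row["Age"])

    # Fallback for titles with no known ages at all.
    overall = median(a for ages in ages_by_title.values() for a in ages)

    for row in rows:
        if row["Age"] is None:
            known = ages_by_title.get(row["Title"])
            row["Age"] = median(known) if known else overall
    return rows


cleaned = impute_age_by_title(passengers)
```

Imputing by title rather than by the overall median keeps children and adults from being assigned the same age, which is one reason this kind of feature is often tried before fitting a model.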