The project is based on Udacity’s Data Scientist Nanodegree program.
Sparkify is Udacity’s fictional music streaming company, similar to Spotify or Google Music. Udacity provides a dataset containing a customer behavior log from October to November 2018. The log holds time-stamped information (Unix time, seconds since 1970) about every action a customer takes, e.g. registration date, session length, and pages visited (the main source of information about customer behavior).
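Since every log entry is stamped in Unix time, a first step is usually to turn those numbers into readable dates. Here is a minimal sketch, assuming the values are in seconds as described above; the function name `to_utc` and the sample value are my own illustrations, not taken from the dataset.

```python
from datetime import datetime, timezone

def to_utc(unix_seconds):
    """Convert a Unix timestamp (assumed to be in seconds) to a UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)

# Hypothetical sample value chosen to fall inside the October 2018 log period.
sample_ts = 1538352000
event_time = to_utc(sample_ts)
print(event_time.isoformat())
```

If the converted dates land far outside October to November 2018, the column is probably stored in a different unit (for example milliseconds) and needs rescaling before conversion.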
Have you ever been stuck diagnosing a problem while paid virtual clusters keep running? After reading this blog, you’ll have some tips for avoiding unnecessary costs.
The main reason for writing this blog was to show the results I got from the user churn assignment in Udacity’s Data Scientist Nanodegree program. During the process, however, I ran into practical issues, so I’d also like to share some tips for reducing costs when using virtual clusters.
The assignment was to build a model to predict customer churn for Udacity’s virtual company Sparkify, which is like the…
When you try to work out the average salary for each developer type, you run into this observation: the dataset has 23 developer categories, but the question is multiple-choice, so respondents can select all that apply. A single respondent can therefore list many developer types. There were 8269 different answer types (23 categories + 8264 different combinations of those categories). So how can you tell exactly which developer type a given salary belongs to?
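One possible way to handle the multi-select answers is to split each answer into its individual categories and count the respondent's salary once for every category they chose. The sketch below illustrates that idea on made-up rows. The semicolon separator, the toy data, and the function name `mean_salary_by_type` are assumptions for illustration, and this is not necessarily the approach used in the analysis itself.

```python
from collections import defaultdict

# Hypothetical toy responses: (multi-select developer types, annual salary).
responses = [
    ("Developer, back-end;Developer, full-stack", 60000),
    ("Developer, back-end", 50000),
    ("Data scientist;Developer, back-end", 70000),
]

def mean_salary_by_type(rows, sep=";"):
    """Average salary per developer type, counting a salary once for each selected type."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for dev_types, salary in rows:
        for dev_type in dev_types.split(sep):
            totals[dev_type] += salary
            counts[dev_type] += 1
    return {dev_type: totals[dev_type] / counts[dev_type] for dev_type in totals}

averages = mean_salary_by_type(responses)
for dev_type, avg in sorted(averages.items()):
    print(f"{dev_type}: {avg:.0f}")
```

The trade-off is that one salary contributes to several categories, so the per-type averages are not independent and respondents who tick many boxes carry more weight overall.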
Stack Overflow conducts an annual Developer Survey of its developer community. In 2020, with nearly 65,000 responses fielded from over 180…