Summary
Want to understand Data Science basics? This brief presentation covers the background behind the growth of the Data Science discipline, explains how this growth is affecting careers, gives an overview of the processes used, and provides a case study to illustrate the application of key principles.
Presentation highlights include:
• The use of proprietary data software such as SAS is declining, while that of open source software such as R and Python is increasing.
• A data scientist wears many hats and is responsible for:
o Keeping up to date on new theories and techniques
o Cleaning data
o Developing data hygiene processes
o Writing, debugging and rewriting code
o Sharing and presenting insights
• The goal of a data scientist is to generate a repeatable process that involves five key steps:
o Assessing data
o Extracting and cleaning data
o Choosing analytical tools
o Writing, testing and running reusable analysis code
o Identifying and building the client’s story
• A case study of a brand with a limited budget that wants to drive sales but is unsure whether a new internet campaign will help lift in-store sales.
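As a rough illustration of how the five steps might map onto reusable code, here is a minimal Python sketch loosely modeled on the case study question. It is not taken from the presentation: the store records, the function names and the choice of a simple difference in group means are all assumptions made for this example, and a comparison like this does not by itself show that a campaign caused any lift.

```python
from statistics import mean

# Hypothetical records: (store_id, saw_campaign, weekly_in_store_sales).
# None marks a missing sales figure. All values are made up for illustration.
RAW_ROWS = [
    ("s1", True, 120.0),
    ("s2", True, 130.0),
    ("s3", False, 100.0),
    ("s4", False, None),
    ("s5", False, 100.0),
    ("s6", True, 125.0),
]


def assess(rows):
    """Step 1: assess the data by counting rows and missing sales values."""
    missing = sum(1 for _, _, sales in rows if sales is None)
    return {"rows": len(rows), "missing_sales": missing}


def extract_and_clean(rows):
    """Step 2: keep only the rows that have a usable sales figure."""
    return [row for row in rows if row[2] is not None]


def mean_lift(rows):
    """Steps 3 and 4: the tool chosen here (an assumption) is a relative
    difference in group means, wrapped in a function so it can be rerun."""
    exposed = [sales for _, saw, sales in rows if saw]
    control = [sales for _, saw, sales in rows if not saw]
    return (mean(exposed) - mean(control)) / mean(control)


def build_story(usable_stores, lift):
    """Step 5: turn the numbers into a sentence a client can act on."""
    return (
        f"Across {usable_stores} stores with usable data, weekly in-store "
        f"sales in campaign stores were {lift:+.0%} relative to other stores."
    )


if __name__ == "__main__":
    print(assess(RAW_ROWS))
    clean_rows = extract_and_clean(RAW_ROWS)
    print(build_story(len(clean_rows), mean_lift(clean_rows)))
```

Keeping each step in its own function is one way to make the process repeatable: the same code can be rerun when new data arrives, and any single step can be swapped out without touching the others.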
A helpful companion piece to this presentation:
Data Science Vocabulary