1.2. Who can do data science?

This course is truly for everyone: whether you’ve been programming for years or have never written a line of code in your life, whether you dream in statistics or have nothing but nightmares of your high school stats class, and – especially – whether you think of yourself as more science-minded or art-minded.

Why is it so important that data science be for everyone? For one, data science is all around us: in data visualizations in the news, in models predicting everything from pandemics to markets, in social media companies putting information in front of our faces, and in analyses and research informing decisions everywhere from corporate boardrooms to governments to local nonprofits. It’s no longer possible to avoid data science and its implications in today’s world.

In addition, even if you work in one of the vanishingly few fields with no connection to data, it’s very likely that as you read this your data is being consumed, analyzed, and used by companies and organizations around the world. We believe that understanding what they might be doing with it is important for everyone’s empowerment.

But a third, often overlooked reason that we need everyone involved in data science is that it is a fundamentally interdisciplinary field. To use data to make discoveries, we need to do (much) more than just run data through fancy algorithms. We need, among other skills:

  • substantive knowledge about the world to know where to look and what questions to ask.

  • the ability to think like a scientist in order to design rigorous data science research projects.

  • mathematics and statistics training to estimate and interpret results from our analyses and apply them to the real world.

  • a philosophical and practical understanding of ethics in order to conduct, apply, and share data science research responsibly.

  • creativity, imagination, and big ideas in order to develop novel ways to turn the world into data, conduct analyses, and prescribe actions and policies based on those results.

  • programming skills in order to develop, design, and even imagine algorithms to collect, analyze, and visualize data.

Data science is more than programming

It is important to understand that programming is just one of the many skills required to do data science. A common misconception among those beginning data science – whether they’re students or practitioners – is that data science is strictly programming. Programming is an integral part of data science, but it is far from the only part. Likewise, although you will learn to program in this course, DS4E is not strictly a programming course. As you will see, there is far more to both this course and data science generally!