The most important skill

This can be modeled with the equation log(ExamScore) = 3.75(log(NumberOfStudents))-0.02(SEDPercent).

In order to create that fact table, with that grain, we’ll need 243kb of storage per record.

The way the MEDIAN function is implemented in this software, the execution time of the process grows exponentially.

For practitioners in the domains these statements come from, each one is easy to interpret. For those outside those domains, each one is pretty impenetrable. Worse than being hard to understand, each statement leaves almost as much important material unsaid as said.

If clear communication is the most important skill in business, then the most important skill for a data scientist – or arguably any scientist – is the ability to take complex topics and reduce them to material that a layman can easily understand. Even more, these people must be able to understand you well enough to take effective and timely action based on the information that you are relaying to them.

What is data science?

The first question a blog like this needs to answer is really, “What is data science?” This is particularly important because there are a lot of definitions out there, and the community hasn’t coalesced around one yet. The definition I’ll be using is that data science is an overarching discipline that draws on elements from several fields and focuses on integrating them from a systematic perspective.