Wow! It’s been almost three years (be nice) since I updated this.
I wish that I could say that it’s been busy (though it has) or that I haven’t had anything to say, but it’s more that I’ve been letting my tendencies towards procrastination get the better of me. I’m going to try to make this a more regular thing in the future, though.
Thanks to current events, my group at work has largely moved to working from home. But don’t click away thinking this will be another piece telling you how to work (or teach!) from home; I’m the last person in the world to be giving advice on either of those topics. Instead, I want to talk a bit about some of the ways I think these changes are going to affect the way we operate as educational data scientists.
A World Without High Stakes Tests
For the first time in a long time, we’re not going to have data from our confusingly named “high stakes tests” to use to judge how well (or poorly) we’ve done this year. This is a huge change, and it’s something that any analyst is going to need to take into account for the foreseeable future.
I have always assumed that HSTs are simply a fact of life, as unchangeable as taxes and as constant as hard-to-understand accountability systems, and I suspect most analysts approach them the same way; I’m sure that accountability systems do. So what does the lack of a single year of HSTs do? The first thing is that it means we’re not going to have data for this year – I know, that sounds obvious – but the implications might not be. Some potential impacts that I could see happening are:
- Any processes that rely in whole or in part on those scores will likely be “paused” for a year.
- Some places use HSTs (or other state assessments) as part of their criteria for having students exit English learner status. Those areas will have to decide whether it’s better not to reclassify a student who might be ready or to develop local measures and risk those measures being wrong.
- Growth measures – and any other metrics that rely on multi-year data – will stop working. My guess is that many states will just declare that the 2020 school year doesn’t count, and compare 2019 to 2021, but that brings its own problems in terms of comparability.
- Our time series now have a huge problem. This missing data point is going to be an issue for anyone looking at data over time. As data scientists, we have several options for dealing with it. The ones that I’m considering at the moment are below (with a quick sketch in code after this list):
- Dropping the (non-) observations – Since almost nobody is going to have data, and things are certainly not normal, there’s a good argument to be made for simply dropping this year’s observation, and possibly next year’s as well.
- Coding the observation(s) – Again, we should already know that next year is unlikely to really be “normal” either and is going to be an outlier. Coding it as such may allow us to extract some information from it without pretending that it was normal.
- Imputation – We could attempt to impute what the observations “should be,” but that is not something that tends to work well in a time series, especially when – as it is reasonable to suspect – we’re going to see a level shift after this year.
- Starting over – For those with the time, treating post-disruption data as the start of a new series might be the best approach.
- Without the HSTs, understanding the effect of COVID-19 on the educational system is going to be harder. This is not an argument that the tests should not have been suspended, simply an acknowledgment of the reality. We use HSTs in too many parts of our self-evaluation systems, not to mention the accountability systems that give them the name “High Stakes Test.”
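To make the first two of those options concrete, here’s a minimal sketch in Python using pandas and NumPy. Everything in it is made up for illustration – the scale scores, the year range, and names like `mean_score` and `disrupted` are mine, not anything standard – but it shows the basic mechanics of dropping the missing year versus flagging the disrupted years for a model.

```python
import numpy as np
import pandas as pd

# An entirely made-up yearly series of mean scale scores; 2020 has no
# HST administration, and 2021 is unknown at the time of writing.
scores = pd.Series({
    2016: 2480.0,
    2017: 2492.0,
    2018: 2501.0,
    2019: 2498.0,
    2020: np.nan,
    2021: np.nan,
})

# Option 1: drop the (non-)observations entirely.
dropped = scores.dropna()

# Option 2: keep the rows, but flag the disrupted years so a model can
# estimate their effect rather than pretend they were normal.
df = scores.to_frame("mean_score")
df["disrupted"] = df.index.isin([2020, 2021]).astype(int)

# For example, fit a simple linear trend on the non-disrupted years only.
pre = df[df["disrupted"] == 0]
slope, intercept = np.polyfit(pre.index, pre["mean_score"], deg=1)
print(f"Pre-disruption trend: {slope:+.2f} points per year")
```

I’ve deliberately left imputation out of the sketch; as noted above, filling in a “should be” value is exactly the kind of pretending-it-was-normal that a likely level shift will punish.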
Hattie’s Nightmare
I’ve seen people labelling this period “the great distance learning experiment” and saying that it will prove this, that, or the other statement true. That’s concerning thinking, because none of this fits the expectations of an experiment, even in our looser, big-data-mindset world. We’re not controlling variables, there’s no careful design being done, and we’re absolutely not starting from a hypothesis. We’re reacting, plain and simple, and while thoughtful analysis after a crisis is critical to learning from it, trying to pretend that this situation is at all representative of a controlled rollout of distance learning is nonsense.
Most likely, what we’ll see is that – just as with most everything else in education – the teacher and student matter far more than we supposed. Some teachers and classes will probably thrive in this model, and if you p-hack hard enough, you’ll almost certainly be able to “prove” anything you want. Error bars are going to matter even more than usual in the next few years, so make sure that you take the inevitable “such and such LEA did this during distance learning and you won’t believe what their students’ scores did!” stories that are already starting to appear with a grain of salt, or perhaps an entire box.
One reason for optimism on this front may be that the diversity of district, school and teacher initiatives and responses will eventually allow us to separate out the “average” effect size of this year’s dislocations, but this is going to take many years.
Where Are We Now?
From an educational perspective, the impact of this year is going to be the number one question we talk about for a while. It’s going to be a massive asterisk next to any study done over this period, and any such study is going to be questioned with “was the effect due to what you did, or to the distance learning?” I’ve not yet done one myself, but I would imagine that anyone defending a dissertation in the near future should prepare for that question as well.
We need to be prepared for this unknown; it’s not something we can change, and that means we’re going to need more sophisticated models and tools.
All of this means that real data scientists are going to become even more critical in supporting LEAs, because there isn’t going to be a uniform “COVID-19” adjustment that districts can apply to every measurement.