In defence of eating

One weird thing in remote/pandemic times is that people have started turning their cameras off when they eat. I’m all for the right to turn your camera off, and I think you should be able to do that at any time without giving a reason, but it’s a shame if people think that they can’t eat on camera. I have therefore been glad to eat several large, difficult-to-eat things on camera in meetings recently, to perhaps give others the idea that it’s okay if they want to do it too.

So far I’ve eaten a footlong Subway, which was kind of falling apart really, and today it was half of a pretty massive pizza, which was also a bit lacking in the structural integrity stakes. So please join me if you wish. We all used to eat in front of each other in The Before Times, and it’s totally natural and normal. But if you don’t want your camera on, because you’re eating or for any other reason, that’s okay too.

In defence of dashboards

I’ve had lively debates about dashboards with various people, including someone in my own team, and somebody on Twitter just mentioned that dashboards are often not used (this blog post will be my response to this tweet, not the first time I’ve answered a tweet with a blog and very on brand for me 😉).

I should acknowledge at the top that I’m The Dashboard Guy. I’ve written books about making dashboards in R. It’s my role in the data science team in which I sit. Shiny in production. It’s my thing. So take all this with as much salt as you wish, I promise I won’t be offended.

People say that dashboards proliferate, and that nobody uses them. That is quite true in a lot of cases. I’d like to suggest why that is so, and talk about when people do use them.

The first thing to say is that in the NHS (which is where I have always worked and always will) many staff are not engaged with data in general. They’re not engaged with data, they’re not engaged with analysis, they’re not engaged with reports, they’re just not engaged. They see data as a punishment, a “test” they cannot win. They can’t see the point of it: it’s a distraction at best, and over-critical and unfair at worst.

So my first question would be: what do you replace the dashboards with? What will they engage with? I became The Dashboard Guy because we used to make a 300-page report by hand every quarter. It took literal person-weeks and every recipient was only interested in their four-page chunk. It was ridiculously inefficient. But everybody did want their four-page chunk. Dashboards are useful for data that people want to see.

The team and I have recently built a classification algorithm for free text patient experience data. There is no way on earth anybody could use that without a dashboard, because there are thousands of points of data and you are typically only interested in a few hundred at a time. So we built one. If they are using the algorithm at all, they’re using the dashboard (or someone else’s; it’s open source, so you can DIY the dashboard if you want). If they’re not, that means they’re not looking at what their patient experience data is about, or they’re reading a 4000-row spreadsheet (and I do know people who have done/are doing that).

So I think dashboards are very useful when you have a large, highly structured dataset where everyone wants their own bit, and I’ve deployed perhaps three or four that have certainly been used by the individuals who wanted that data.

But something else that I think they can be used for is putting data science tools in the hands of your users. I just built a dashboard that allows you to pick how many LDA topics you want to use, and then shows you:

  • Term frequency for each topic in a graph
  • Five example comments from each topic

(I should say, it’s very rough and unfinished, but you get the idea). The idea of this is to allow the person who is interested in the data to make the topic model work themselves without writing R code. In this particular project not all the data has come in yet anyway, so the dashboard will contain more data in the future. It may be that the first tranche of data contains four topics, and the final dataset six. Building a dashboard makes the end user part of that decision-making process.
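To make the two views above a bit more concrete, here is a minimal sketch of the summarising step that could sit behind them. It is written in Python for illustration, not the R/Shiny the dashboard is actually built in, and it is not the dashboard's real code. It assumes the topic model has already been fitted elsewhere with the user's chosen number of topics, and that each comment has been given a topic id. The function name, the tiny stopword list and the example comments are all made up.

```python
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "and", "was", "were", "to", "of", "my", "i", "in", "for", "very"}


def summarise_topics(comments, topics, n_terms=3, n_examples=5):
    """Group comments by assigned topic and summarise each group.

    comments: list of free-text strings
    topics:   list of topic ids, one per comment (assumed to come from a
              topic model fitted elsewhere with the chosen topic count)
    Returns {topic_id: {"top_terms": [(term, count), ...], "examples": [...]}}.
    """
    term_counts = defaultdict(Counter)
    examples = defaultdict(list)
    for text, topic in zip(comments, topics):
        # Crude tokenisation: split on whitespace, trim punctuation, lowercase.
        words = [w.strip(".,!?").lower() for w in text.split()]
        term_counts[topic].update(w for w in words if w and w not in STOPWORDS)
        # Keep only the first n_examples comments seen for each topic.
        if len(examples[topic]) < n_examples:
            examples[topic].append(text)
    return {
        topic: {
            "top_terms": term_counts[topic].most_common(n_terms),
            "examples": examples[topic],
        }
        for topic in sorted(term_counts)
    }


# Made-up comments and topic assignments, purely for illustration.
comments = [
    "The nurses were kind and caring",
    "Waiting times were far too long",
    "Kind staff, caring nurses",
    "Long wait for my appointment",
]
topics = [0, 1, 0, 1]

for topic, info in summarise_topics(comments, topics).items():
    print(topic, info["top_terms"], info["examples"])
```

In a dashboard the topic count would be a user input, and changing it would refit the model and rerun a summary like this one.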

In fact, I actually build them for myself. A few years ago I was fitting a lot of structured topic models and it was such a pain in the neck fiddling around with the code that I just made a dashboard for myself so I could just sit and play around with it. The analysis itself took days, tens of hours, and the dashboard took an afternoon (to be fair, I’m probably a little faster than the average person with Shiny at least, because I have a lot of experience with it).

If your users are not using your dashboards, ask yourself why that is so. If they aren’t engaged with data or don’t care about the metrics then you have a bigger problem than dashboards. Would they engage with a report? An email? A bullet point in a team meeting? Build what they want, not what you think they want or should want.

Dashboards give you an opportunity to devolve analysis to your users. I work with a lot of text data. The users are the experts, not me. I want them to be able to do all the stuff I do in code but in a browser: sort, filter, summarise, aggregate, and even tune models (do they want five topics or fifteen? Do they want an overinclusive multilabel classification algorithm that captures all of a theme, or a conservative one that only shows them the highlights?).

In my opinion the dashboard is the icing on the cake. The team and I are building a very large, complex dashboard summarising many types of data, in the hope of driving customers between the types (patient experience data users looking at staff sickness levels, clinical effectiveness data users looking at all-cause mortality). But the work is in uniting the data, in producing analytics that are meaningful and useful. I don’t think it’s terribly important whether it’s a dashboard or a report or just played over the PA every Monday morning (there’s an idea 😉). Engage people in data and analytics first, and then serve that need using all the tools at your disposal.

Make statistics sexy

I’ve been enjoined to make statistics sexy. It’s really taking root in my brain. There are no easy answers in statistics; it’s a long, hard road. It’s rigorous, and honest, and that’s why I love it. Frank Harrell’s talk at NHS-R typified this. The price of truth can be very high, like binning all the data from your £1m study instead of doing what many do, which is to slice it into so many pieces that something looks shiny and then publish that.
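A rough bit of arithmetic shows why slicing works so well at producing something shiny. This is a minimal sketch in Python, and it assumes something real analyses rarely satisfy exactly: that each slice is an independent test at the usual 5% significance level and that there is no true effect anywhere. The function name is made up for illustration.

```python
def chance_of_spurious_hit(n_slices: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n_slices independent tests comes out
    'significant' purely by chance, when no real effect exists."""
    return 1 - (1 - alpha) ** n_slices


for n in (1, 5, 20, 50):
    print(f"{n:>2} slices: {chance_of_spurious_hit(n):.0%} chance of at least one false positive")
```

Under those assumptions, twenty slices already give you a better than even chance of finding a "result" in pure noise.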

At the same time there’s a lot of hype in ML and AI. There are serious-minded people doing ML properly and making a real difference, but there are also a lot of private companies selling the NHS snake oil in a new, fancy bottle: giving them overfit, janky, ungeneralisable models, or even just selling them a pile of promises and a dashboard and leaving them with nothing.

How can the “I think you’ll find it’s more complicated than that” brigade compete with the “Our data science team showed that this model increased patient flow by 17%” brigade (but please don’t ask us about the pre-existing trend, or the Hawthorne effect, or the other 8 models they deployed that didn’t do much at all)?

I truly have no idea, but I’m going to have a good go at finding out. Rest assured if I discover the answer I shall write it here first 😃