The prevalence of code sharing (Goldacre)

Another greatest hit from the Goldacare report, in the section on open working, “The prevalence of code sharing” (all material reused under OGL)

  • “ONS covid reports: the team was unable to find any analytic code for the platform or covid analyses (but extensive and excellent open code training elsewhere)
  • OpenSAFELY covid reports: all code for the platform, data management and analysis all shared automatically on GitHub (declaration of interest: BG is PI on OpenSAFELY)
  • PHE covid reports: the team was unable to find any analytic code for PHE reports on topics such as ethnicity and COVID-19; but extensive code sharing for their (excellent) COVID-19 dashboards
  • DECOVID (Turing / HDRUK PIONEER platform created for a wide range of covid research teams from a large number of universities): the team was unable to find code for the platform or analyses
  • ICODA (HDRUK’s flagship COVID-19 data analysis platform initiated in June 2020): the team was unable to find code for the platform or analyses (but also no outputs to date)
  • HDRUK / NHSD / BHF TRE: the team was unable to find code for the platform; but some scripts are shared for a paper describing the data accessible through it, and one research preprint (the platform’s only output to date)”

I love that they did this, but that’s not why I’m writing a blog post about. What I love is the assumption that you should be able to sit at your desk with a web browser and find this stuff. That’s what “open” should mean. I’m so sick to death of being told stuff is “open source” and then I ask where it is they say “I’ll email you a copy”.

I plaster my code all over the internet. I shove it in the face of everyone I can think of to shove it in the face of, because I want people to see it. This “open source but you have to go to a webinar and then email me three times” is for the birds

The Goldacre review- greatest hits

I absolutely love the Goldacare review, I really can’t praise it enough, and I will be doing a lot of work based on it in the coming weeks and months looking at it from the perspective of my own Trust and, (with others) from the perspective of my ICS, and NHS-R. NHS-R has a couple of repos looking at matters related to the review (statement on tools and NHS-R vision) and we need to do some thinking about stuff that particularly comes out of this review that we can look at (which I have started doing on my own, community effort will follow).

Anyway, that’s all for the future. This blog post is just going to be very simple and just be Stuff I Particularly Love. As I’m reading it I occasionally find a bit which really resonates, and I thought it would be useful for me and possibly for others to record them as a kind of “Greatest Hits” outside of all of the other stuff I’m doing digesting it.

The Goldacare review is a blueprint for change, without doubt, that’s partly why I love it so much, and I intend to be that change and to push it forwards, but in the meantime this is just a bit of fun looking at all the best bits

Make it ‘okay to ask’ about access to publicly funded code

“The team heard from several interviewees that at present it is commonly regarded as provocative to ask for access to the code used to implement an analysis, especially in some parts of the academic community, despite general positive statements on open working”

I absolutely love this. I think it is regarded as being provocative sometimes, and people can get quite defensive. Very often people assure me that the code is forthcoming but it never is. It will surprise nobody to hear that I ask for code whether it’s provocative or not, and so should you. If it’s public money, it’s public code, and I’m sorry if that makes you uncomfortable but I’m going to put my hand up at the end and ask. Every. Single. Time. So get used to it

Pseudo-open working

“As a consequence of growing support for open working, there are now individuals and organisations who state that they support open methods, but do not do so; or create only the appearance of open working. During this review the team encountered examples of very senior and influential leaders extolling the virtues of open working, where their published papers from the pandemic in 2021 do not contain code, and require that interested parties contact them personally to negotiate access to the data dictionary codelists used to define the variables used in the analysis”

I almost whooped with delight when I read this. I’ve encountered this so many times and it makes me feel very cross. I hear so many warm words, and I see so many people extolling the virtues of openness, but very often it’s just a sham. They’ve no intention of sharing anything and they wouldn’t know an open licence if it bit them. I have more respect for the naysayers, at least they’re honest.

That will do for this post. I’m sure there will be more, and I’ll add more posts as I go.