Adding line returns in RMarkdown in a loop

Another one that’s for me when I forget. The internet seems strangely reluctant to tell me how to do this, yet here it is buried in the answer to something else.

Sometimes you are writing an RMarkdown document and wish to produce text with line returns between each piece. I can never work out how to do it. It’s very simple. Just two spaces and then \n, like this: "  \n". Here’s some real code with it in.


  team_numbers %>% 
    mutate(print = paste0(TeamN, TeamC, "  \n  \n")) %>% 
    pull(print) %>% 
    cat()

Simple!
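
For future me, here is a minimal self-contained version of the same trick in base R. The data frame below is made up for illustration (the real team_numbers isn't shown here), and as far as I know the chunk needs results = "asis" so that the output of cat() is treated as Markdown rather than printed as console output.

```r
# Hypothetical stand-in for the team_numbers data frame used above
team_numbers <- data.frame(
  TeamN = c("Team A", "Team B"),
  TeamC = c(" (301)", " (302)"),
  stringsAsFactors = FALSE
)

# Two trailing spaces followed by \n make a Markdown hard line break
lines <- paste0(team_numbers$TeamN, team_numbers$TeamC, "  \n")

# sep = "" stops cat() from putting a space in front of each new line
cat(lines, sep = "")
```
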

Producing RMarkdown reports with Plumber

I wasn’t going to post this until I got it working on the server but I’ve got the wrong train ticket and am stuck in London St Pancras until 7pm so I thought I’d be productive and put it up now.

So I launched a new version of the patient experience dashboard. I forget if I mentioned it on here or not. I think not. It’s here http://109.74.194.173:8080/apps/SUCE/ and the code is here https://github.com/ChrisBeeley/patient-experience-dashboard. (I should add that I’ve launched a bit early because of an event so it will be a little buggy- forgive me for that).

One of the things we’ve done is get rid of the static reports that we used to host on the website, which were just generated as HTML, uploaded to the CMS and left there. We have preferred instead to switch to a dynamic reporting system which can generate different reports from a fresh set of data each time (I’ve barely started to implement different reports, so it’s very bare bones at the moment, but that’s the idea, at least).

One of my users, however, liked the way that the old reports would have links to all the team reports on. So she would send out one weblink to service managers and it would have all of the different teams on it and they could just click on each one as they wished. So I need to replicate this somehow. My thinking has gone all round the houses so I’ll spare you the details but basically I thought originally that I could do this with Shiny, generating URLs in each report that would contain query strings that could then be fed into the report generator, which means that clicking each link would bring back each team report. I can’t say for definite that it’s impossible, but I couldn’t figure out a way to do it because Shiny downloads are much more built around the idea that your user clicks a button to get the report. I could have had the link generate a Shiny application with a button in that you press to get the team report, but that’s two clicks and seems a bit clunky. Really, Shiny is not the correct way to solve this problem.

So I hit upon the idea of doing it with Plumber (https://www.rplumber.io/). There are very few examples that I could find on the internet of using Plumber to generate parameterised reports with RMarkdown, and none at all generating Word documents (which everyone here prefers) so I’m offering this to help out the next person who wants to do this.

Just getting any old RMarkdown to render to HTML is pretty easy. Have a look at the Plumber documentation, obviously, but basically you’re looking at


#* @serializer contentType list(type="application/html")
#* @get /test
function(res){
  
  include_rmd("test_report.Rmd", res)
}

If you run this API on your machine it will be available at localhost:8000/test (or whatever port Plumber tells you it’s running on). You can see where the /test location is defined, just above the function.

Easy so far. I had two problems. The first one was including parameters. For all I know this is possible with the include_rmd function but I couldn’t work it out so I found I had to use rmarkdown::render in the function. Like this:


#* @serializer contentType list(type="application/html")
#* @get /html
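function(team){
  # Temporary output file; giving it the right extension means
  # render() writes to exactly this path
  tmp <- tempfile(fileext = ".html")
  
  rmarkdown::render("team_quarterly.Rmd", output_file = tmp,
                    output_format = "html_document",
                    params = list(team = team))
  
  # Return the rendered file as raw bytes
  readBin(tmp, "raw", n = file.info(tmp)$size)
}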

This API will be available on localhost:8000/html?team=301 (or whatever you want to set team to).

The RMarkdown just looks like this:

---
title: Quarterly report
output: html_document
params:
  team: NA
---

`r params$team`

You can see you define the params in the YAML and then they’re available with params$team, which will be set to whatever your ?team=XXX query string is.
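
One thing to watch: as far as I understand it, values from the query string arrive as text unless you tell Plumber otherwise, so params$team will be "301" rather than the number 301. Here is a small sketch of a check you could run before rendering. The check_team() helper is hypothetical (it isn't in the repo), and it assumes team codes are purely numeric.

```r
# Hypothetical helper (not part of the original API): check the team value
# from the query string before handing it to rmarkdown::render()
check_team <- function(team) {
  # Query-string values arrive as text, so test the text itself
  if (length(team) != 1 || !grepl("^[0-9]+$", team)) {
    stop("team must be a single numeric code, e.g. 301")
  }
  as.integer(team)
}

# Converts the text "301" to the integer 301
team_code <- check_team("301")
```

Inside the endpoint you would call it first, something like params = list(team = check_team(team)), so that a malformed link fails early instead of halfway through the report.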

Okay, getting there now. The last headache I had was making a Word document. This is only difficult because I didn’t know to put the correct application type in the serializer bit on the first line that defines the API. You just need this:


#* @serializer contentType list(type="application/vnd.openxmlformats-officedocument.wordprocessingml.document")
#* @get /word
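function(team){
  # Temporary output file with the extension Word expects
  tmp <- tempfile(fileext = ".docx")
  
  rmarkdown::render("team_quarterly.Rmd", output_file = tmp,
                    output_format = "word_document",
                    params = list(team = team))

  # Return the rendered document as raw bytes
  readBin(tmp, "raw", n = file.info(tmp)$size)
}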

This will be available at localhost:8000/word?team=301.
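
To get back to the original problem (one page with a link to every team report), something like the following sketch would build the links. The team codes and names are invented, and the base URL assumes the API is running locally on port 8000; swap in whatever your server uses.

```r
# Hypothetical team codes and names; in practice these would come from your data
teams <- data.frame(
  code = c("301", "302"),
  name = c("Team A", "Team B"),
  stringsAsFactors = FALSE
)

# Base URL of the Word endpoint; host and port are assumptions for a local run
base_url <- "http://localhost:8000/word"

# Build one Markdown link per team, encoding the code for use in a query string
links <- sprintf(
  "[%s](%s?team=%s)",
  teams$name,
  base_url,
  vapply(teams$code, utils::URLencode, character(1), reserved = TRUE,
         USE.NAMES = FALSE)
)

# Two spaces plus \n between links puts each one on its own line in RMarkdown
cat(links, sep = "  \n")
```
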

That’s it! Easy when you know how. All the code is on Git here https://github.com/ChrisBeeley/reports_with_plumber.

I’m pretty excited about this. I did it to solve a very specific problem that I had for one of my dashboard users but it’s pretty obvious that having an API that can return reports parameterised by query string is going to be a very powerful and flexible tool in a lot of my work.

I’ll get it up on the server over the weekend. If that causes more headaches I guess I’ll be back with another blog post about it next week 🙂

Suppress console output with ggplot, purrr, and RMarkdown

So I posted a while back about producing several plots at once with RMarkdown and purrr and how to suppress the console output in the document.

Well, I just spotted someone on Twitter having a similar problem and it turns out that the solution actually doesn’t work in ggplot! Interesting…

For ggplot, you need the excellent function walk(), which is like map() except it’s called for its side effects (like disk access) rather than for its output per se.

Bish bash bosh. Easy


```{r, message = FALSE, echo = FALSE}

library(tidyverse)
walk(c("Plot 1", "Plot 2", "Plot 3"), function(x) {
  
  p <- iris %>%
    ggplot(aes(x = Sepal.Length)) + geom_histogram() +
    ggtitle(x)
  
  print(p)
})
```
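
If you want to see the difference without any plots involved, here is a tiny sketch (it assumes purrr is installed, and the functions are just placeholders). map() hands back a list, which is what ends up printed in the document, whereas walk() hands back its input invisibly.

```r
library(purrr)

titles <- c("Plot 1", "Plot 2", "Plot 3")

# map() collects each return value into a list; printing that list is
# what puts [[1]], [[2]], ... into the knitted document
mapped <- map(titles, function(x) nchar(x))

# walk() runs the function for its side effect and returns its input invisibly
walked <- walk(titles, function(x) message("Drawing ", x))
```
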

A world of #plotthedots and… what else?

[Image: a recent exchange on Twitter]

Reproduced above is a recent exchange on Twitter. I’d better open by saying this blog post is not impugning the work of Samantha Riley or any other plot the dots people. On the contrary, the whole #plotthedots movement is an important part of a cultural change that is happening within the NHS at the moment. But I would like to take up the rhetorical device Samantha uses, to explore the issues that we have understanding data in the NHS.

Let’s take the tweet at face value. A world where every NHS board converted from RAG to SPC, along with CCGs and regulators. It’s worth noting that this would, in fact, be a substantial improvement on current practice. RAG rating presumably has its place as a management tool but as a data analytic instrument it is very lacking. RAG ratings treat 30 week waiting lists the same as 18 month ones, and 16 hour waits in A and E the same as 6 hour waits. They give no help with interpretation with regard to trend or variance. In this new world boards will be able to distinguish chance variation from real changes. They will have due regard for both the trend and natural variation in a variable, and be able to adjust their certainty accordingly. This is all to the good.

But let’s think about all the things that we’ve left out here (I don’t doubt the #plotthedots people are quite aware of this, I’m just using the tweet as a jumping off point).

We’ve left out psychometrics. Are the measures reliable and valid? We don’t know.

We’ve left out regression. How much variance does one indicator predict in another? We don’t know.

We’ve left out multiple comparison. We reviewed 15 indicators. One shows a significant change. What is the probability that this is due to chance variation? We don’t know.

We’ve left out experimental design. We’ve reviewed changes in measures collected on four different wards- two of which have implemented a new policy, and two of which have not. Is the experiment well controlled? Are differences due to the intervention? We don’t know.

We’ve left out sampling theory. We have patient experience questionnaires in the main entrance of our hospital and they are distributed on wards on an ad hoc basis. Are the results subject to sampling bias? If yes, how much? We don’t know.

We’re interested in producing graphics to help managers understand patient flow throughout our organisation. Excel can’t do it. What can we do? Nothing! In the bin!

I’m obviously exaggerating a little here, for effect, but the sad fact is that the NHS is in a very sorry state as far as its capacity for analytics goes. Many individuals who have the job title “analyst” in fact would more properly be called “data engineers”, in the sense that they can tell you very exactly what happened but don’t have a clue why. There’s nothing wrong with that, data engineering is a highly valuable and difficult skill, but it’s not analysis, and in fact career structure and professional recognition for true “analysts” (or, let’s dream really big here, “data scientists”) is sorely lacking everywhere.

I passionately want proper understanding of and engagement with data and statistical concepts right from the board all the way across the organisation, and in fact I am busy at the moment in my own Trust offering in-depth tutorials and one-off lectures on concepts in data and statistics. I strongly support APHA’s aim to introduce rigorous professional accreditation of analysts.