The last post was perhaps a bit over-long on extraneous detail and snippets of R commands that most readers will either know backwards or have no interest in. Today’s learning sheet is more than double the length of yesterday’s, so this time let’s really stick to one thing I learned today.
The one thing is the RTextTools package, which sounds like a great way of classifying textual data. I’ve already had a go at some text analysis in a previous post, but I hadn’t thought about the whole data mining/machine learning side of it all that seriously until today. I do wonder whether this would have applications to the patient survey (900 responses a quarter, and now I think of it we have about 1500 from early in its life that were never classified). There is a person who codes the responses as they come in, and I don’t need or want to take that job off him, because clearly a person is always going to be better than a machine at classifying text. But if I could prototype it, one intriguing possibility would open up: it would enable us to change the coding system. We set the whole thing up quite quickly, and I did rather think we were stuck with the coding system forever (since we don’t want to recode 6000+ rows by hand), but if the computer can do it reasonably well then we could change it as we saw fit. We could even have different coding systems for different parts of the Trust. All interesting things to muse on.
It just goes to show what is possible when you use R and keep up to date with the amazing community. One of the presentations suggested that people do not always appreciate the work of the R-core team (who devote many thousands of hours, completely gratis, to R). For my part, I certainly do appreciate their efforts. Moreover, I find the whole story of R quite inspiring. R really was my introduction to the world of open source, and there are some amazingly generous figures in the whole community, the R-core team not least among them, as well as other heroes from Linux, OpenOffice and so on. Beyond them there is a countless army of minor heroes who share their work and source code every day. I’m rather too much of a novice to make a serious contribution, but I hope one day to follow the example of these minor heroes and give something back, however small.