Writing software for science

Ed Griffiths, Wellcome Trust Sanger Institute
United Kingdom

Partner away, so up just before 7 to sort out domestic stuff: kids' swimming, music, rugby trip tomorrow, dinner money, etc. It is the most complicated part of the day, logistically and patience-wise. Drive to the Sanger Institute through spring countryside just bursting into life again, the drive a small oasis of calm. Get coffee, talk to a colleague about why the code won't build on the Mac and what to do about it.

Quickly reduce incoming emails to a sensible number while reminding myself of the essential goals for today. Get on with providing screen dumping and printing facilities from the current software: trivial, but so vital for those conference screenshots. While doing this, discover and fix an "off by one" error. Whether you are counting fence posts or DNA sequence bases, it's amazing how easy it is to make this mistake.
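To illustrate the kind of off-by-one error meant here (this is not the actual bug fixed that day), the sketch below assumes features are described with 1-based, inclusive start and end positions, while Python strings are indexed from 0 with the end excluded. The function names and the example sequence are made up for illustration.

```python
def feature_length(start, end):
    """Number of bases in a feature given 1-based, inclusive coordinates.

    The tempting 'end - start' is one short: like fence posts,
    both end positions count.
    """
    return end - start + 1


def to_slice(start, end):
    """Convert 1-based inclusive coordinates to 0-based, end-exclusive
    bounds suitable for slicing a Python string."""
    return start - 1, end


sequence = "ACGTACGTAC"   # hypothetical example sequence
start, end = 3, 6         # bases 3 to 6 inclusive

lo, hi = to_slice(start, end)
bases = sequence[lo:hi]

print(bases, feature_length(start, end))
```

Keeping the conversion in one small function, rather than scattering "+ 1" and "- 1" through the code, is one common way to make this mistake harder to repeat.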

While doing this, also think again through our approach to simultaneously displaying multiple sections of DNA from different organisms in a way that will, hopefully, actually be helpful to annotators. Investigate and respond to a couple of bug reports from users of some older software which I maintain at the Sanger.

My job is to write software to help geneticists and bioinformaticians to annotate DNA, i.e. show where genes and other features are positioned on the DNA, add references for supporting experimental work and so on. This involves understanding a substantial amount of sequence terminology and bioinformatics techniques and principles. It involves communicating with geneticists in many other laboratories around the world and responding to their queries and requests about our software.

The work also increasingly involves discussions with other programmers in the field about how to exchange information: the semantics, the format, the programming languages, the types of computer that will be supported and so on. Just about everything we work on is public, which is very refreshing compared with work for a previous employer, where almost nothing was. It's not unusual to spend a week at another laboratory working with someone else in the field to "get over the hump" of a new piece of code which will then be shared generally between researchers.

Funny to reflect how, in a way, I've come full circle. I started out as a biologist (ecology in fact), then moved into commercial computing for many years, and now here I am somewhere between the two!

DNA sequencing technology has appeared since I was a biologist, but my background has really helped in understanding the principles behind it: how sequencing is done, how candidate genes are identified and so on. This is important because the programming here is very applied. Most of the users are in the next room or just down the corridor, and emails and visits to discuss problems and improvements are frequent.

It has been a real bonus having experience of both fields, but it takes constant work to keep up to date with the changing techniques and environment of both programming and the world of DNA sequencing and bioinformatics.




OnSET is an initiative of the Science Communication Program
URL: http://www.onset.unsw.edu.au     Enquiries: onset@unsw.edu.au