How to talk to your bioinformatician
1 Introduction
Please note: this brochure is a work in progress, barely started at this stage. The statistics chapter is largely ready, but the others are at various stages of completion. In the meantime, there is the presentation (bihealth.github.io/howtotalk), which is where the idea for this book started.
If you have any comments, suggestions or corrections, click the < icon in the top right corner of this page to open the Hypothesis annotation sidebar. It requires a free login, but is otherwise really useful and definitely worth it. Alternatively, file a GitHub issue at the repository for this book: github.com/bihealth/howtotalk-book.
1.1 Who this book is for
The book records the different bits and pieces of advice that I keep hearing from, and giving to, my collaborators, students and colleagues. It is mostly based on my experience as a statistician and computational biologist working in a biomedical research environment. A lot of what I have to say I had to learn the hard way: by committing the mistakes I am now warning you against. I chose what to include in this book by asking myself: “What do I wish I had known when I started out?” and: “What do I wish my collaborators knew when we started working together?”
1.2 Who am I to tell you this stuff?
1.3 Bioinformatics, statistics and computational biology
We often refer to the whole general category of humans involved in biological data analysis using computers as “bioinformaticians”. However, strictly speaking, bioinformaticians in the proper sense of the word are those who develop algorithms, databases and software packages. Then, there are those who apply these tools to actual data, and the more precise term for these people is “computational biologists”. Finally, there are those of us who walk around and criticize experimental designs, and they are called statisticians.
The truth is that most of us have at least set foot in all three camps, but experience and interest usually make us specialize in one of them. Statistics in particular appears to be separated from the other two, with many bioinformaticians having only a very limited understanding of it, and many statisticians having no experience with high-throughput biological data and other applications which are the bread and butter of bioinformaticians and computational biologists.¹
1 A notable case in point is the practical calculation of statistical power in high-dimensional settings, which requires expertise in both statistics and high-throughput data analysis.