Life in Estonia, part 8: Research
One sighs. One groans angrily. And a third voice spits out swear words. That’s us. That’s our office. And usually, when it happens, it will be followed by my boss commenting: “The sounds of making science.” We may be staring at a computer screen all day, but this is indeed hard work. Thinking. Trying. Failing. Trying again.
When I procrastinate, I spend a lot of time looking at memes about doing a PhD or working in academia. One of my favorites is: “I didn’t want a 9 to 5 job, so I started a PhD, now I’m working 24/7”. There is definitely some truth in it. As a PhD student, you try to find a work-life balance. You are usually quite flexible in your work schedule: while you are doing literature research or statistical analysis, you can easily work from home, and nowadays all the classes are also online. And while it is easy to start every day at the same time, when do you stop? Some days I end my work day at four thirty, because I just can’t concentrate anymore and feel like I accomplished enough for one day. On other days I really get into a work flow, solve problems and get thrown out of the building by the night guard – yes, this happened. And on some days (especially Fridays), my supervisor reminds me not to overwork, that I shouldn’t stay so long just because he does (well, I do start three hours earlier, so I think he has a point), and that I should take my weekends off.
|After 7 months, my name is finally on the door, too|
The thing is, when you are doing research, there is only one thing that everything else depends on: how much progress you make. The faster you get the analysis done, the faster you can start writing the paper. The faster you get the writing done, the sooner your peers can comment on it. The sooner you submit the article, the sooner you’ll get published. So every day you take off, every break you allow yourself, and every time you leave work early, you will feel bad for not making progress. Every weekend that I don’t work at all, I beat myself up for not working hard enough. Every day I don’t make a big step forward, I am scared that I will not manage this within four years (and I am only funded for four years, so after that, I will need to get money for rent and food from some other source).
|A very clear research plan|
But I also know that without enough sleep, without days of rest, my brain won’t function to its full potential. And my brain is the thing that all of this depends on (and Stack Overflow of course – so other people’s brains).
There are Saturdays where I sleep in, take time to cook healthy food and meet friends and feel good afterwards. There are Sundays on which I sleep in and do only one hour of reading or coding and I beat myself up for it. And then there are the best days: When I find a problem, solve it, and go home psyched about how much I learned and got done that day. I eat dinner, I sit down on my yoga mat for my daily stretching routine and suddenly have an idea. So I reach over to my laptop, and right there on the ground, at half past eight in the evening, I open my statistics program and create a new variable that will tell me how long the interval between the first and the second calving of each cow was.
I go to the laboratory on weekends, because that means we can turn up the music and work without anyone walking by distracting us or needing any of the equipment that we are also working with. Sometimes I take another day off instead, but most of the time I do not, because I would feel too bad to do nothing on a Monday, and I also find the lab work relaxing, so it is absolutely not the same as doing my statistics.
|our famous whiteboard|
It is hard to get your research completely out of your head. And on other days it is hard to even get up and do any work at all. One day it is exciting and the next it is frustrating. You will spend months researching something that will ultimately make two sentences in your paper. You will spend hours reading scientific articles so that you can put a reference behind one sentence in your introduction. Every word in a research paper has been turned over three times.
So this is the life of a researcher in general. But many of you probably don’t really know what I am doing specifically.
So let me describe last Thursday, as it was a very typical day. Now, as a PhD student, I also have some classes, which take up around 6-7 hours per week plus some homework.
xi:regress weight_gain15months hapto birth_weight i.diarrhea i.mother_heifer i.treat_group age age_15months if age>7&age<15
The program then tells me that there are only 65 cows that have data for each of these factors and are thus included in the model. There are different values that I can look at to see if my hypothesis (that haptoglobin is correlated with weight gain) is correct, and, if there is an effect, how big it is.
Sadly, these values (coefficient, adjusted R², P-value and confidence interval) are not as I want them to be. So my brain goes to work. Ah yes, ideally, dairy cows should be inseminated at the age of 14 months. This means that some of these 65 cows in my model will already be pregnant. That is another factor that heavily influences weight gain, of course, and this can “mask” the haptoglobin effect.
I have tried a couple of times to explain what this masking means. Let’s see. Imagine I want to fill a pool (this is my outcome). There is a toddler with a little bucket putting water into the pool (my predictor variable), and a big hose directing water into the pool. I look only at the toddler and ask: Is he filling the pool? The answer is no. There is a lot of water filling the pool, but it is not coming from the toddler with the bucket. So I look around and find the hose. Now I ask: Is the toddler adding water to the pool if we turn off the hose? Bucket by bucket he adds water to the pool and I can measure how much it is if I want to (effect size or coefficient). It turns out, in my model, the hose is a so-called confounder.
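For readers who like to see the pool in numbers, here is a minimal Python sketch of the masking idea. It is not the Stata analysis described in this post: every number (5 liters per bucket, 600 liters per hose hour, 500 days) is invented for illustration, and the names `bucket`, `hose` and `pool` are hypothetical. "Turning off the hose" is done by subtracting whatever the hose explains from both the pool and the buckets, and then looking at what is left.

```python
import random

def simple_ols(x, y):
    """Least-squares line y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residuals(x, y):
    """What is left of y after removing the part explained by x."""
    a, b = simple_ols(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def r_squared(x, y):
    """Share of the variation in y explained by a straight line in x."""
    mean_y = sum(y) / len(y)
    ss_res = sum(r * r for r in residuals(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n_days = 500

# Invented data: buckets carried per day and hours the hose was running.
bucket = [rng.gauss(10, 2) for _ in range(n_days)]
hose = [rng.gauss(5, 2) for _ in range(n_days)]

LITERS_PER_BUCKET = 5.0       # the toddler's true (small) effect
LITERS_PER_HOSE_HOUR = 600.0  # the hose's true (huge) effect
pool = [
    LITERS_PER_BUCKET * b + LITERS_PER_HOSE_HOUR * h + rng.gauss(0, 5)
    for b, h in zip(bucket, hose)
]

# 1) Looking only at the toddler: the hose drowns out his contribution.
_, slope_unadjusted = simple_ols(bucket, pool)
r2_unadjusted = r_squared(bucket, pool)

# 2) "Turning off the hose": remove the hose's share from both variables,
#    then relate what is left of the pool to what is left of the buckets.
pool_without_hose = residuals(hose, pool)
bucket_without_hose = residuals(hose, bucket)
_, slope_adjusted = simple_ols(bucket_without_hose, pool_without_hose)
r2_adjusted = r_squared(bucket_without_hose, pool_without_hose)

print("toddler only:  slope", round(slope_unadjusted, 2), "R2", round(r2_unadjusted, 3))
print("hose adjusted: slope", round(slope_adjusted, 2), "R2", round(r2_adjusted, 3))
```

With the hose accounted for, the estimated liters per bucket should land close to the true 5, and the toddler explains most of what is left. Without it, the estimate can be far off and the toddler explains almost nothing. Putting both variables into one regression model, as in the Stata command above, is the usual way to get the same adjusted slope.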
So I need to create a variable for “pregnant at 15 months” and preferably, also how many days pregnant. For this, I need to get dates of inseminations, pregnancy check-ups and possible abortions for the cows. This is quite complicated and I don’t know how to do it. So I do what all programmers do: I see if anyone else has had a similar problem before on a page called Stack Overflow. Sometimes it is enough to google “how to calculate …in STATA” to find the code that somebody else has written, then copy, paste and adjust it to my own data.
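To give an idea of what such a variable involves, here is a small Python sketch. It is not the Stata code from this project, and it simplifies heavily. It assumes that each cow has a birth date, the date of the insemination that a pregnancy check-up later confirmed (or none), and an optional abortion date. It also assumes that "15 months" means 456 days of age. The function name and all parameter names are hypothetical.

```python
from datetime import date, timedelta

REFERENCE_AGE_DAYS = 456  # assumption: 15 months is roughly 15 * 30.4 days

def pregnancy_at_reference_age(birth, conception=None, abortion=None,
                               age_days=REFERENCE_AGE_DAYS):
    """Return (pregnant, days_pregnant) on the day the cow reaches age_days.

    conception: date of the insemination confirmed by a check-up, or None.
    abortion:   date the pregnancy was lost, or None.
    """
    reference = birth + timedelta(days=age_days)
    # Never confirmed pregnant, or conceived only after the reference day.
    if conception is None or conception > reference:
        return 0, 0
    # The pregnancy ended on or before the reference day.
    if abortion is not None and conception <= abortion <= reference:
        return 0, 0
    return 1, (reference - conception).days

# A made-up cow: born 1 January 2020, so she is 456 days old on 1 April 2021.
print(pregnancy_at_reference_age(date(2020, 1, 1), conception=date(2021, 3, 2)))
```

Real records are messier than this: several inseminations per cow, check-ups that contradict each other, missing dates. That is where most of the hour goes.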
After over an hour of working on this, I can finally include this new variable in my model, and really, I get perfect results: The R², which measures the accuracy of the model (how much of the pool filling is explained by all the factors I am looking at), is at around 80%. The P-value, which tells me if my result is significant, is at less than 0.001 for haptoglobin, and the lower it is, the more we scientists like it. The coefficient is at -0.04. This means that if the haptoglobin of one two-week-old cow is 1 mg/dl higher than that of another one, she will gain 40 grams less per day than the other one. Over time, this is actually quite a lot.
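How much is "quite a lot"? A rough back-of-the-envelope sketch in Python follows. It rests on two assumptions that are not part of the analysis itself: that the coefficient of -0.04 is in kilograms per day for each mg/dl of haptoglobin (which matches the 40 grams above), and that the difference stays constant from two weeks of age (day 14) to 15 months (taken here as day 456). The function name is hypothetical.

```python
def cumulative_weight_difference(coef_kg_per_day, hapto_difference_mg_dl, days):
    """Total weight difference in kg if the daily difference stays constant."""
    return coef_kg_per_day * hapto_difference_mg_dl * days

# Assumed period: from day 14 to day 456 of life, so 442 days.
days_observed = 456 - 14
difference_kg = cumulative_weight_difference(-0.04, 1.0, days_observed)
print(round(difference_kg, 2), "kg")
```

Under these assumptions, 1 mg/dl more haptoglobin at two weeks of age adds up to roughly 17.7 kg less body weight at 15 months.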
Before I get to do any more models, it is time for my weekly meeting with my second supervisor. He is a genius when it comes to STATA, so I get to ask him all the questions on how to best create the variables that will help me, how to fix mistakes in my code, and so on, and tell him about my progress. He also looks at the scientific poster that I have prepared for the conference at the end of the month and gives me some tips on how to improve it.
No time to rest; my wonderful colleague Elisabeth has arrived and we put on our lab coats. We need to measure the haptoglobin of more cows. This is done by a so-called colorimetric method: We use different substances that will basically color-mark the haptoglobin in the blood serum, and our machine can measure the color at an exact wavelength, from which the computer calculates back how much haptoglobin the blood contains. As the substances need to have time for their reactions, and we also have a lot of samples, this takes up a few hours.
|in the lab|
As I get back to the office, my first supervisor, the head of our chair of Clinical Veterinary Medicine and obviously head of our research team, has arrived. “Did you see the paper by Goetz et al.?” he asks. I have, actually. It talks about weight gain in veal calves and also looks at the effect of haptoglobin. “I cannot believe they haven’t cited Leena’s article!” Leena is one of his former PhD students. I quickly open the PDF of the article he is referring to. It basically deals with the same topic, and I have been meaning to read it thoroughly for months, but never took the time. So now I do, and add a sentence to the draft of my own paper: “The negative association between haptoglobin and short term weight gain has been shown by Seppä-Lassila, et al (2018), and our current study suggests that the association is still visible after 15 months.”
Yes, a whole day of work for one sentence. I tell the boss what I discovered and he excitedly writes it on his whiteboard. “Interesting, that with the calves who had diarrhea, the effect stays, but in those from Leena’s study, who had respiratory disease, the difference in weight gain between high and low haptoglobin was gone after a few months!”
He sits down and I can see that his brain is working on an explanation for that.
We also need to discuss the exact plan for my next project, where I investigate the same things in sheep, the lectures I will start giving in September, and the master’s thesis that I have to supervise next year.
But it is now already seven, I am hungry and tired, and it’s dark outside, so I pack up my stuff and head to my bike. We can continue this tomorrow.