What made me do a PhD far away?


Two years ago I was in the middle of my Bachelor's in Munich and had almost no clue about all this Artificial Intelligence stuff. I would have assigned a probability of almost zero to being in my current situation in April 2020: I am about to immigrate to Canada for a PhD at Mila, the biggest AI research institute in the world. So how did I get to that point? How did I figure out what I want to do in life?

This is the first of two blog posts:

  1. Tell the story of how I decided to do a PhD in North America and how I ended up being so driven for research in AI.
  1. Give others out there insights into the whole PhD application process, from the first vague ideas/plans to the final visit days. This is even more useful if you come from a similar situation: European with only a 3-year Bachelor, feeling like you don't have enough "formal background" in the field.

The story

I can pinpoint 4 events or choices that turned out to be important and "the right one" in retrospect. Of course there are always countless more factors involved but let's simplify life for now:

  1. Quitting my studies which left me with 9 months to determine where my life should be going and to learn to reflect.
  1. Joining a Deep Learning startup full of driven and supportive people. This showed me that work does not feel like work with the right people, and that reaching ambitious goals together is amazing. We fostered each other's ambitions, kept each other accountable and questioned conventional thinking.
  1. Hearing about PhDs in the US without having a Master's.
  1. Twitter.

1. Learning to face myself

I had finished high school with a (sort of) perfect final grade and was ready to go to university. I had awkwardly decided on mathematics because that's what I liked in school and what was philosophically the "purest, cleanest path". I knew that it was not like high school maths (everyone tells you that) but I still thought: I'm smart, it will make sense to me and things will magically work out if I just "do the homework well etc." To be honest, I can't really understand anymore how the Benno of 5 years ago truly thought about life but it seems very different from my current self.

In hindsight I was naive and had spent embarrassingly little time actively going out into the world to find out what I want to do. Or to critically reflect more. Next to studying and having fun discussions with people, I had tried few academic activities during high school that would have given me a clear direction. I had spent much of my spare time playing Ultimate Frisbee and computer games. Sure, I was very engaged in classes, always eager to discuss problems with teachers or friends, and read or watched some sciency things here and there.

But actually attending university lectures to see how a real university works? Actually cold-emailing people from industry or professors? Actually joining clubs? Learning to commit to personal long-term projects? Nope, I simply wasn't that pragmatic, brave or proactive. I was happy just playing Frisbee and getting good grades. This is still a pleasant life in some sense, but at the same time I had naive, ambitious career dreams. I just didn't have the right "mentality".

Quite frankly, I was also sort of addicted to playing computer games during my last year of high school, a story for another blog post maybe. So it was surprising that I had managed to finish my final high school exams with good results and even still played Ultimate Frisbee regularly.

After three months of studying mathematics I was quite unhappy. I decided that I needed more time to figure out where I want to go in life and how to get rid of my computer game addiction, which was my way of avoiding "real life" (the more common method for that nowadays seems to be Netflix). So I stopped attending my maths classes. This was a big turning point in my life and I am grateful that I took this tough step. However, I am hesitant to call it "the best decision of my life", since there are too many of those.

I made the decision around Christmas and in the following weeks I spent my time just writing and writing, page after page: about my social life, about what I care about, what it means to be a good person and have a fulfilling life, and also about some past issues. Eventually, after 2 or 3 months, I also quit playing computer games fully. I spent the newly gained time on a part-time job at a search engine company, before I decided on Computational Linguistics as my major.

For the first time in my life I truly learnt to reflect and take active control of my life.

This was also the point where I became "my own best therapist" through writing and very honest reflection. Since then I can trust myself to handle every big life issue if I just sit down and write long enough.

2. Becoming ambitious

Going forward 2 years (and skipping great times such as 2 Erasmus exchange semesters in Dublin), I ended up at Hellsicht, a Deep Learning/Computer Vision startup.

Up to this point I had spent my studies with good grades and fluctuating curiosity but had not found my "true" path yet. However, I was a happier and more confident person. I was playing a lot of Ultimate Frisbee and had a richer (social) life, finding more and more confidence through that.

Considering that I had almost no real experience in Machine Learning and no long-term goal, I was very lucky to end up among so many smart and driven people at this startup. There was a lot of luck and getting-along-well involved.

To give a bit of context about this somewhat unusual Deep Learning startup: it started out without any external investment for the first years, just a bunch of friends, some of whom had known each other since 5th grade, several of them studying physics. They just kaggled and kaggled (Kaggle is a platform for Deep Learning competitions with prize money), back around 2015 when Deep Learning was slowly becoming mainstream. They lived a simple life since there was no investment for the startup, and since it just started out as a big hobby. Additionally, the extent of support and trust in each other was quite extreme. Here is an anecdote that captures it:

Next to being the closest friends for many years, there are stories such as two founders going into physics exams together with the rule that both leave if one of them signals that they don't see themselves passing the exam. Note: German exams in maths and physics have extremely high failure rates, often around 50-80%. Note: both are super smart people and graduated with good grades but they also spent a lot of time kaggling or on other stuff which meant exam preparation was not always a top priority ;)

So this is the environment where I found friends (for life?), my goals, and most importantly the ability to work hard on things I love without feeling like it's work.

What I learnt:

  1. Put effort into finding the right people with whom work feels like play. No matter what advice I get from others, I will not separate work and friendships. Yes, it can fail, but if it works (which you can influence very deliberately!), it is an amazing experience to be best friends with your colleagues/fellow PhD students. You sort of go through everything together.
  1. Becoming good at something takes time. The path is the goal. All that cliche motivational stuff.
  1. Be confident, take risks, think big, criticize each other constructively, have a vivid wild debate culture.

I worked at Hellsicht for 11 months, roughly 20 hours per week next to my studies. I worked with joy, whenever I wanted, sometimes taking naps in the office, sometimes staying late, sometimes coming in late, having philosophical discussions until midnight with the others. After that time I was a different person in many ways. I was ready and mature enough to go out on my own into research.

3. Hearing about this PhD thing

A year before I applied for a PhD, at the start of my 3rd year of undergrad, I had no experience in academic research at all. But at the same time I wasn't sure yet whether I could already rule out "researcher" as a valid life path. So the only logical conclusion was: The best way to find out whether "researcher" might be a valid path is to practically experience what research is like (instead of passively googling and making pro-con lists). I knew that from an idealistic point of view I would like to be a researcher but I had also learnt from experience that one should primarily judge these choices by how they practically feel on a day-to-day basis. So I talked to the two professors at my faculty who work on Natural Language Processing: to Hinrich Schütze about wanting to publish my Bachelor thesis at a conference under his supervision and to Alex Fraser about a research assistant position in his lab. If I remember correctly, at this point I already considered academia a bit and knew that early research experience can be valuable. I also thought it's a waste to write a Bachelor thesis which is just read by the supervisor and nobody else; hence publishing. While both professors agreed to supervise me, it did not get me closer to answering the question: "What to do after my 3-year Bachelor, which is about to end?". The standard thing would be a Master's, either in Germany or maybe in the UK or the Netherlands. Or maybe just more startups and industry stuff? If that was my destination after my Master's anyway, why not go straight for it and learn on my own?

Then I had lunch with an American PhD student who was spending a year at Prof. Fraser's lab as an exchange. When we got to discussing my future plans, he said something along the lines of "Why not go straight for a PhD if you like research? In the US, some do that." While I might have heard this fact about US PhDs at some point, I was surprised and my first reaction was "Wait, me? My BSc is only 3 years (in the US it's 4 years) and I've heard it's super competitive in AI these days... Also I didn't study pure Computer Science". But we talked more and he gave me some confidence: if my work with both professors resulted in two papers, I'd have decent chances.

This sounded cool and I played with the thought more and more. But it was still quite vague and seemed like a lot of effort. Also, a PhD on what topic? I still wasn't fully convinced by the AI research I had seen at this point like computer vision (object detection, segmentation) or text-only NLP (translation, commonsense, reasoning, ...). This brings me to the final nudge.

4. Finding my people and my research vision, or in short: Twitter

Shortly after I had started my two research projects, around April 2020, I joined Twitter because it seemed like a good place to hear from researchers. Also a way better alternative to LinkedIn or anything like that. However I didn't think it would play any significant role in my life.

I don't know how exactly it happened: I somehow ended up following and reading content from the "right" kind of people/researchers. I saw someone retweeting a paper called "Experience Grounds Language". I saw a lot of good advice about academia and surely also some bad advice. I saw personal stories from PhD students. And more and more, by following the connections - who mentions who, keywords, retweets - I ended up finding out that a lot of cool researchers work on language grounding, multi-modality, RoboNLP and all the other creative names for this stuff. And I also saw a lot of encouraging tweets on academic life, despite the widespread opinion that Twitter makes academia seem overly terrible because of all the rants of academics on the platform.

In short: academia became something I could relate to through personal stories and discussions, and I slowly got deeper and deeper into this language grounding thing. After a while I collected all language grounding researchers I could find on Twitter in a single group (a so-called "Twitter list"), so that others could use this list to stay up-to-date with the field. Here is where I shared this list in a tweet. This tweet led to many encouraging interactions, even with actual professors: some retweeted it, some commented, some followed me. If only I had known at this point that I would interview with 3 of these professors for a PhD half a year later! In some way this tweet is a perfect example of how Twitter was a surprisingly big factor in finding my research passion and ultimately starting a PhD.

After that it just continued like this: more researchers started following me, I participated in Twitter discussions, met people at ACL (a big NLP conference), and even had some nice email exchanges with professors that further encouraged me content-wise but also on a personal level. In short I had found my community. I was determined to do a PhD on language grounding.

In Germany almost none of the people I know use Twitter, except for the PhD students here. So I can't recommend Twitter enough if you are in a situation similar to the one I was in. It has its flaws but few things give you such unfiltered, diverse exposure to academia, content-wise but also for questions like "what does doing a PhD entail?", "how do conferences/publishing/reviewing work?", "how to apply to stuff?" etc.

Making it real

In the end both projects got published at conferences and I applied to roughly 15 PhD programs in the US, Canada and Edinburgh. I was accepted to a few and decided to join Mila in Montreal this March (long Twitter thread by me on why I chose Mila). In the year between my Bachelor and the PhD, I am doing 3 research-related internships. The whole PhD application process, from start to end, feels like an additional part-time job for several months if you want to get everything right. I am happy I still did it. But more on the "practical" details in the next blog post!

Thanks to people

In the end it is always concrete people that can make a big difference: mentors, people who take some of their scarce time for you, encouraging words.

I am very grateful to Nora Kassner, Denis Peskov and Dario Stojanovski, 3 PhD students who supervised and mentored me. From these collaborations friendships also developed.

On the more senior level, Alex Fraser and Hinrich Schütze had wise words and guidance for me.

During my PhD applications and decisions, I was sometimes anxious or needed inspiration: Jesse Thomason and Yonatan Bisk (both authors of the paper that inspired me deeply) were extremely helpful and empathetic. Two professors who truly care about students and a healthy university environment. These are the kinds of mentors and supervisors I want to look up to if I ever get the chance to supervise students myself as a professor.

Less often mentioned, but equally important: it is a lot harder without a partner/family/good friends and hobbies, so that you're not alone with your own thoughts, plans and worries 24/7.

This already reads like a small acknowledgement section in a PhD thesis before I even get started with it 😉

I guess it is a good practice and will hopefully look even more awesome at the end of my PhD! Acknowledgement sections are the best part of a PhD thesis and reflect the best parts of a PhD.