Tomorrow is my last day at Graphcore.
A little over 3 years ago, I officially[1] joined this company to contribute towards the vision of advancing machine intelligence to expand the scope of good that humanity could do. By this, I personally meant helping to address head-on, where it made sense, some of humanity’s greatest challenges by advancing machine intelligence in concert with advances in IPU chips and systems, unburdened by trends or by merely incrementing on the literature. This included advancing machine intelligence as an endeavor in and of itself, from both the capability and safety[2] perspectives. Whilst this remained (and continues to remain) my personal guiding star, I found myself in an environment which had room for me to grow in ways I hadn’t fully anticipated.
Shaping technology expanding the frontier of machine intelligence
Having spent most of my adult life working on machine/artificial intelligence (AI) in some form or another, I expected to drive the conceptual side of things: to come up with novel neural architectures and learning schemes that would get realized in software in tandem, and to unblock paths to unsolved problems across various domains of human undertaking. Little did I know that what I’d actually end up doing was driving software features in the context of “merely” the state of the art in the field, a setting ripe with opportunities to make advancements lower down the software stack. I even enjoyed it!
It has been a memorable ride. From working out the efficacy of training multi-billion-parameter models under execution schemes that were (at the time) yet to be realized in software, to (more recently) driving a cross-stack, cross-function team effort to put into production ideas that make training large models in reduced precision numerically stable, in particular Automatic Loss Scaling[3], I had the privilege to drive, contribute, learn and grow both technically and professionally along the way.
Automatic Loss Scaling from PyTorch down to IPU-PODs to realise stable training of large models.
Perhaps only a few people at Graphcore can say that they did work that would not be possible elsewhere, or outside organisations with a large amount of compute. I am certainly one of those who tick both boxes, having had the luxury to touch knowledge that not many on the planet can, due to sheer compute requirements, and at a technical depth that was only possible at Graphcore. Of course, this was with a view to enabling everyone, for example, to seamlessly navigate the numerical vagaries of large model training with just a single line of code. The compute was designed and built from the ground up by the magnificent AI systems engineering team in Oslo. It may just be the best AI supercomputing machinery in the world. Kudos to this powerhouse of a team; the dream machine has indeed been built.
Whilst being part of this powerhouse and of Graphcore at large, I got to use these machines on a daily basis, with on the order of 256 IPUs unraveling knowledge all at once, over and over again. At the same time, working with people from across the software stack allowed ideas to bend easily to the IPU execution model. Working like this, one felt the shaping of technology that expands the frontier of machine intelligence, or of computing in general, happening under one’s fingertips (quite literally). This was a privilege.
I must also take this opportunity to thank my amazing colleagues across locations. Over the pandemic, we worked over Zoom, Slack, and code differentials across Oslo, London, Bristol, Cambridge, and more. It did not matter that some of us never met in person. What a privilege it has been to work alongside the kindest and smartest folk, ever receptive to the non-work constraints of raising small kids (one only just about to start walking). Whilst I still, out of habit, often found myself working late into the night after the kids slept, there were times when it became extremely difficult to balance work and life. The support I got from my immediate (even if physically remote) colleagues was phenomenal. What a reliable and friendly bunch with an immaculate work ethic. Thank you very much.
When we did meet, outside of the pandemic and childcare, we made sure to maximise the fun that could be had on site.
As I leave Graphcore, I’d just like to end on a thought that everyone at the company is very much accustomed to – that a company is “a group of people who come together to share bread”[4]. We share the highs and the lows together. We work as a unit, driven in our case to create meaning out of machine intelligence. This is a tall order, and even if one or more of us fall in the process, we all still keep the mission going.
All the best, Graphcore. My job here feels done. I have grown, and I carry many learnings with me, especially on how we worked as a team to carve out the frontier of computing, to wherever I end up next. So long.
As for what I’ll be doing next – time may have the answer. I do not, just yet.
[1] We were working out collaboration opportunities with the Norwegian Open AI Lab and the AI academic/industry ecosystem in Norway (especially my previous workplace, Telenor Research) for over a year before I officially joined Graphcore. We did co-supervise Master’s theses at the AI Lab, e.g. on parallelisation techniques for large models, and others.
[2] More on the safety part forthcoming in another post soon.
[3] See this excellent blog post on Automatic Loss Scaling on the IPU.