Reinforcement Learning Reading Group


When there seem to be very few people in one’s vicinity working to advance a field one is interested in, it can be a rather overwhelming place to be. But of course, one must not take this as a signal to abandon ship.

Scant general interest?

The reasons for seemingly scant general interest are many.

One is that the road to creating immediate business value from the advances can be very unclear and costly. This is the case with most visionary ideas, and this is partly what academia and industry research labs are for, or so one would hope. Take for instance some of the pioneering work on training neural networks. It took until 2012 for advances from the early 1980s to go mainstream, thanks to more compute power, the availability of more data, and further innovations by a relentless community to make training more stable.

Another is that the resources enabling progress in the field are disproportionately distributed across the worldwide talent pool, thereby creating severe disadvantages for some. This can drive the disadvantaged to focus their energies on something else, or to move to places where they feel enabled, unless alternative ways of encouraging them are found.

Yet another reason is that you may be living in Norway. The general feeling of wellbeing over here incentivises playful individual pursuits, à la the world is one’s oyster. That is a great thing. But if a hundred tinkerers tinker in a thousand different directions, we get a lot of toys. If they were driven by a vision or even a pressing need, the result could just as well be a solution to climate change, or for that matter making reinforcement learning (RL) go wild. Norway has oodles of talent. It is just missing that big pressing need to come together and reshape humanity. Most of that talent seems drawn to taking up jobs as consultants. Steve Jobs showered some wisdom on consulting a while ago. Yet they go. Not for long, I hope! To be fair, the ones who can drive or incentivise talent need to do a better job.

RL at the cusp of going wild

The basic RL machinery has been maturing over a few decades. Scalable approaches to RL-driven decision making have been advancing by leaps and bounds since 2013. It may, in a sense, seem like a field at the cusp of going wild. What makes RL peculiar in AI’s recent industry upswing is that we are at a stage where both basic pursuits and industry-scale applications are being worked on at the same time. The basic pursuits will potentially lead to improvements of many orders of magnitude in how agents collect and learn from experience. The push from industry to apply RL further incentivises these advances. The tension in the field is very real and exciting.

To break or not to break out of the lab!

Scant general interest… revisited

This tension can, however, incentivise playing the waiting game. Since enough people in key industry research environments, e.g. DeepMind in collaboration with the wider Google, are trying to make RL go wild, can we not just wait until it happens? Let’s add this to the list of reasons for the seemingly scant interest. But for the sake of argument, even if one did wait, there is no guarantee that RL will go wild for one’s own purpose. Sure, general algorithms are what we want, but people are not trying to solve your problem. Problems have peculiarities, some more severe than others, and not all of them are of interest to everyone. Adaptations and tinkering are inevitable, at least in the short to medium term.

Reading group to drive breakthroughs

Furthermore, what about those who find a peculiar kind of satisfaction in working on the subject, peculiar in the sense of an artist in unison with their process of creating art? They are not here to wait. They want to be part of the community nudging RL out of the lab. They certainly do not abandon ship. They embrace the tension. They care about pursuing the vision, and they are not daunted by the lack of resources. Shouldn’t they be given a chance to rise to their potential?

Fortunately, since the field is so open, collaborating with the ones who have resources is of course an option. But to do so, one has to play at the same level. To this end, it certainly helps to have a local community which discusses, critiques, and appreciates ongoing RL research. Learning from each other can, without a doubt, kindle new ideas, ideas that move the field forward.

From follow-up discussions with students after various lectures on the subject, it became clear that there was a need for such an outlet. Unsurprisingly, people who want to toy with the frontier of RL do exist in Norway. Bringing them together clearly seemed like a good thing to do. Ergo, the Reinforcement Learning Reading Group. We are currently sampling physical spaces in Oslo (the first two meetings happened at MESH). However, attending remotely via https://appear.in/machine-learning is an option for those not in Oslo.

All welcome!