I was at the AWS re:MARS conference in Las Vegas last week, and the theme of the week was how the combination of Machine Learning, Automation, and Robotics (sometimes in Space) will shape the future. Many people may think that the star of the show was Robert Downey Jr, but in my mind it’s Simulation & Reinforcement Learning, which showed up in nearly every keynote speech at the conference:
Day 1: Using Reinforcement Learning, Boston Dynamics robots have learned to do backflips, jump up onto ledges, and lift objects. Disney Imagineering has taken this to the next level with humanoid robots performing death-defying stunts.
Day 2: Amazon uses simulation to train models on difficult scenarios in their Go Stores. Amazon fulfillment center robots are trained to sort packages using Reinforcement Learning. Alexa uses simulated interactions to automatically learn the flow of conversations. Amazon Drone delivery uses simulated data to train the model for detecting people below the drone. Companies like Insitro have started to use RL to solve biomedicine problems by generating bio-interaction data.
Day 3: Andrew Ng calls out meta-learning, where 100s of different simulators are used to build more generalizable Reinforcement Learning agents, as a “next big thing” in AI. Self-driving car companies Zoox and Aurora use RL and meta-learning to address the complexities of driving in urban environments. DexNet is attempting to build a massive dataset of 3D models to help with grasping problems using simulations. Jeff Bezos agrees with Daphne Koller that RL bioengineering will be huge in 10 years.
To summarize all of the above:
If the tasks in a field can be accurately simulated, Reinforcement Learning will dramatically improve the state of the art in that field over the next few years.
Where does physics come in?
If you read my earlier posts, you’ll know that my first daughter is four years old. This puts her firmly into the “Why” stage of life, where her brain shifts from simply learning tasks to wanting to understand everything about the world. Being the huge nerd that I am, I’ve always joked that I’ll be happy to take her all the way down the rabbit hole to physics every time she asks “why”. As it turns out, my parents were wrong, and there is a point where even a four-year-old gets bored asking “why?” and moves on to something more interesting like coloring or pretending to read a book. Here is a typical exchange:
Created using http://cmx.io
What does any of that have to do with data science?
I recently watched Jeff Dean’s talk on the state of deep learning from this year’s Google I/O conference. He mentioned that neural networks have been trained to approximate the results of physics simulators, arriving at results 300,000x faster, with the potential to let researchers test 100M molecules over lunch.
Image Source: Jeff Dean presentation at Google I/O 2019
This is a huge advancement, as it should allow us to use the same reinforcement learning techniques that were the star of the show at re:MARS to address a huge new range of problems. Prior to these advancements, the cycle time of running a full physics simulator for each potential reward would simply take too long for RL to ever reach a reward state. Now, RL should be able to learn the physical properties of molecules that will optimize the intended properties for chemical engineers.
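To make the surrogate idea concrete, here is a toy sketch (the “simulator” function and all numbers are hypothetical, not from Jeff Dean’s talk): fit a cheap model to a modest number of expensive simulator runs, then screen thousands of candidates against the cheap model instead of the simulator.

```python
import numpy as np

# Stand-in for an "expensive" physics simulator: a nonlinear function of
# one design parameter. A real simulator might take minutes per call.
def simulate(x):
    return np.sin(3 * x) + 0.5 * x ** 2

# Run the expensive simulator on a small set of sample points...
x_train = np.linspace(-2, 2, 50)
y_train = simulate(x_train)

# ...and fit a cheap surrogate (here, a degree-11 polynomial) to its outputs.
surrogate = np.polynomial.Polynomial.fit(x_train, y_train, deg=11)

# The surrogate can now screen thousands of candidates almost instantly.
candidates = np.linspace(-2, 2, 10_000)
predictions = surrogate(candidates)
best = candidates[np.argmin(predictions)]
```

In practice the surrogate would be a deep network rather than a polynomial, and the design space would have thousands of dimensions, but the loop is the same: sample the simulator sparsely, learn a fast approximation, and let the approximation do the bulk of the search.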
Given that everything can be reduced to physics, I am starting to imagine a world where it will be possible to build many more solutions from first principles. Going into the conference I had assumed that Biology was out of reach for simulations for years, but I just learned that there are already companies like Insitro starting to tackle this problem today.
Other recent developments should only serve to accelerate us toward a future state where RL can be used in “higher level” sciences like psychology:
1. Raw compute power:
Google also released a private beta of their TPU v3 Pods, with over 100 petaflops of processing power that is custom-built for neural network training. With that kind of power, tasks like materials analysis can be learned quickly. Additionally, Google has started using RL to design the chips themselves, which should lead to additional improvements over time.
2. Better reusability:
DeepMind is working on a multi-layer network architecture where an initial RL agent selects the proper downstream network for a given task. This type of RL agent could be trained to break down a high-level task into components and solve multiple tasks using transfer learning.
3. Better generalization:
The aforementioned meta-learning techniques are improving the ability of RL agents to adapt to scenarios that they have not seen before.
4. Better optimization:
The Lottery Ticket Hypothesis paper from MIT has shown that Neural Networks can be further compressed by finding the “Winning Ticket” paths and then retraining using only these paths.
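A minimal sketch of the magnitude-pruning step behind the “winning ticket” idea (illustrative numpy only; the full procedure also rewinds the surviving weights to their initial values and retrains the sparse subnetwork):

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical dense weight matrix after training.
weights = rng.normal(size=(256, 256))

# Magnitude pruning: keep only the largest 20% of weights by absolute
# value, zeroing the rest with a binary mask.
keep_fraction = 0.2
threshold = np.quantile(np.abs(weights), 1 - keep_fraction)
mask = np.abs(weights) >= threshold

pruned = weights * mask
sparsity = 1 - mask.mean()  # fraction of weights removed (~0.8 here)
```

In the paper, this prune-rewind-retrain cycle is iterated several times, and the resulting sparse subnetwork can match the accuracy of the original dense network.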
5. Better training data generation:
Interfaces like Autodesk’s Generative Design can help designers and engineers discover the specifications that need to be supplied so the RL agent moves in the right direction. Self-driving car companies generate new training scenarios each time a person has to take over.
What should you do about it?
First off, you should go learn about Reinforcement Learning: there are many great tutorials and courses that can teach you the intuition and inner workings of RL, so we’ll keep it brief here. RL agents get an interpreted state of their environment, choose an action that affects the environment, observe the new environmental state, and repeat the process. If that action leads to a positive outcome, the agent is given a reward and is more likely to choose the same set of actions given similar states in the future.
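The loop just described can be sketched with tabular Q-learning on a made-up five-state environment (all names and numbers here are illustrative, not any particular library’s API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy environment: states 0..4 on a line; the agent starts at state 0
# and receives a reward of +1 for reaching state 4.
n_states, n_actions = 5, 2      # actions: 0 = right, 1 = left
q_table = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(200):
    state = 0
    for _ in range(100):        # cap steps per episode
        # Observe the state and choose an action (epsilon-greedy).
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            action = int(np.argmax(q_table[state]))
        # The action affects the environment, producing a new state...
        next_state = state + 1 if action == 0 else max(state - 1, 0)
        # ...and a reward when the goal is reached.
        reward = 1.0 if next_state == n_states - 1 else 0.0
        # Update the value estimate, making rewarded actions more likely.
        q_table[state, action] += alpha * (
            reward + gamma * q_table[next_state].max() - q_table[state, action]
        )
        state = next_state
        if state == n_states - 1:
            break

# After training, the greedy policy moves right, toward the reward.
policy = np.argmax(q_table, axis=1)
```

Real problems swap the table for a deep network and the five-state line for a rich simulator, but the state-action-reward loop is the same.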
This is repeated for many episodes, and eventually, the agent becomes very good at getting rewards (and therefore at the task we were training it for). One of the best ways to supplement this experience with something hands-on is to use the AWS DeepRacer, a scaled-down race car that provides a simulated environment, an RL training setup, and a physical piece of hardware corresponding to the simulation. You simply play with the reward function to train your racer agent.
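For a flavor of what “playing with the reward function” looks like, here is a sketch in the style of the DeepRacer reward-function interface (the `params` keys follow the AWS DeepRacer documentation; the banding thresholds are just one common choice, not the only one):

```python
def reward_function(params):
    """DeepRacer-style reward: favor staying close to the track center.

    The simulator supplies the params dict on every step; keys used here
    include 'all_wheels_on_track', 'track_width', 'distance_from_center'.
    """
    if not params["all_wheels_on_track"]:
        return 1e-3  # near-zero reward once the car leaves the track

    # Reward bands: the closer to the center line, the higher the reward.
    track_width = params["track_width"]
    distance = params["distance_from_center"]
    if distance <= 0.1 * track_width:
        return 1.0
    if distance <= 0.25 * track_width:
        return 0.5
    if distance <= 0.5 * track_width:
        return 0.1
    return 1e-3
```

Tweaking these bands (or rewarding speed, steering smoothness, and so on) is the whole game: the RL machinery is handled for you, and the reward function is where you encode what “good driving” means.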
Second, you should start actively searching for ways to simulate systems in your business that could be optimized. Any existing simulators are an excellent place to start, but new simulators will likely be hugely impactful. Again, AWS has a service in this area called RoboMaker, but there are many alternatives, most of which are based on OpenAI Gym.
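Gym-style simulators expose a small reset/step interface, which makes it easy to wrap a business process as an environment. A minimal hand-rolled example (the inventory scenario and every number in it are hypothetical) might look like:

```python
class InventoryEnv:
    """Toy Gym-style simulator: decide how many units to restock each day."""

    def __init__(self, capacity=10, demand=3):
        self.capacity = capacity
        self.demand = demand
        self.stock = 0

    def reset(self):
        # Start each episode half-stocked and return the initial observation.
        self.stock = self.capacity // 2
        return self.stock

    def step(self, action):
        # Restock (up to capacity), then serve the day's demand.
        self.stock = min(self.stock + action, self.capacity)
        sold = min(self.stock, self.demand)
        self.stock -= sold
        # Reward = revenue from sales minus a holding cost on leftover stock.
        reward = sold - 0.1 * self.stock
        done = False  # a continuing task with no terminal state
        return self.stock, reward, done, {}
```

An agent interacts with it exactly as with any Gym environment: `obs = env.reset()`, then `obs, reward, done, info = env.step(action)` in a loop.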
Finally, you should be on the lookout for new companies that are riding this technology wave. I expect that there will eventually be a series of open-source simulators developed to build upon each other, with deep neural networks used to compress the key information that can be learned at each layer. Before that, there will likely be proprietary solutions that leapfrog the state of the art in many fields. Over time, this will unlock massive gains in scientifically based fields such as pharmaceuticals, materials science, medicine, downstream oil & gas, and many others.
Article originally published on Ryan's Medium page.