Understanding the physical world is a critical skill that most people use effortlessly. However, it remains a challenge for artificial intelligence. If we want to deploy safe and useful AI systems in the real world, we want those systems to share our intuitive sense of physics. But before we can build such models, there is another challenge: how do we measure a model's ability to understand the physical world? That is, what does it mean to understand the physical world, and how can we quantify it?
Fortunately for us, developmental psychologists have spent decades studying what infants know about the physical world. In the process, they have broken the nebulous notion of physical knowledge down into a concrete set of physical concepts, and they have developed the violation-of-expectation (VoE) paradigm for testing those concepts in infants.
In our paper published today in Nature Human Behaviour, we extended their work and open-sourced the Physical Concepts dataset. This synthetic video dataset ports the VoE paradigm to assess five physical concepts: solidity, object persistence, continuity, "unchangeableness", and directional inertia.
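To make the VoE idea concrete for a model rather than an infant, here is a minimal sketch of how such a test could be scored. It assumes that "surprise" is measured as the model's prediction error on a video, and that a model passes a concept when it is, on average, more surprised by physically impossible videos than by matched possible ones. The function names (`surprise`, `voe_effect`, `passes_voe`) and the frame format (flat lists of floats) are hypothetical and chosen for illustration; they are not taken from the dataset's code.

```python
from statistics import mean


def surprise(predicted, observed):
    """Prediction error for one video (assumed measure of surprise).

    Both arguments are lists of frames; each frame is a flat list of floats.
    Returns the per-frame mean squared error, averaged over all frames.
    """
    return mean(
        mean((p - o) ** 2 for p, o in zip(pred_frame, obs_frame))
        for pred_frame, obs_frame in zip(predicted, observed)
    )


def voe_effect(possible, impossible):
    """Surprise on the impossible video minus surprise on the possible one.

    Each argument is a (predicted_frames, observed_frames) pair.
    """
    return surprise(*impossible) - surprise(*possible)


def passes_voe(effects):
    """Assumed pass criterion: positive VoE effect on average across pairs."""
    return mean(effects) > 0


if __name__ == "__main__":
    predicted = [[0.0, 1.0], [0.0, 1.0]]
    possible_obs = [[0.0, 1.0], [0.0, 1.0]]    # matches the prediction
    impossible_obs = [[0.0, 1.0], [1.0, 1.0]]  # deviates in the second frame
    effect = voe_effect((predicted, possible_obs), (predicted, impossible_obs))
    print(effect, passes_voe([effect]))
```

Pairing each impossible video with a closely matched possible one means the comparison isolates the physical violation rather than differences in visual complexity.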
With a benchmark for physical knowledge in hand, we turned to the task of building a model capable of learning about the physical world. Again, we looked to developmental psychologists for inspiration. Researchers have catalogued not only what infants know about the physical world, but also the mechanisms that might enable this behavior. Despite their variability, these accounts share a central idea: breaking the physical world into a set of objects that evolve over time.
Inspired by this work, we created a system we call PLATO (Physics Learning through Auto-encoding and Tracking Objects). PLATO represents and reasons about the world as a set of objects. It predicts where objects will be in the future based on where they have been in the past and which other objects they are interacting with.
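As a rough illustration of what "object-based" prediction means, the toy sketch below keeps a separate history for each object and predicts its next position from two inputs: that object's own past positions, and the other objects in the scene. Everything here is an assumption made for illustration. PLATO learns its dynamics from video, whereas this sketch substitutes hand-written rules (constant-velocity extrapolation plus a crude "do not pass through other objects" check) in one dimension. The names `ObjectTrack` and `predict_next` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ObjectTrack:
    """History of one object in a 1-D toy world (hypothetical structure)."""
    name: str
    positions: list  # centre position per past frame; needs at least two
    width: float


def predict_next(track, others):
    """Predict the next centre position of `track`.

    Hand-written stand-in for a learned dynamics model:
    1. own history: extrapolate at constant velocity;
    2. interactions: stop at the surface of any object in the way.
    """
    velocity = track.positions[-1] - track.positions[-2]
    proposed = track.positions[-1] + velocity

    for other in others:
        min_gap = (track.width + other.width) / 2  # centres cannot be closer
        other_pos = other.positions[-1]
        before = track.positions[-1] - other_pos
        after = proposed - other_pos
        overlaps = abs(after) < min_gap
        crosses = before * after < 0  # would jump to the other side
        if overlaps or crosses:
            side = 1 if before > 0 else -1
            proposed = other_pos + side * min_gap
    return proposed


if __name__ == "__main__":
    ball = ObjectTrack("ball", [0.0, 1.0], width=1.0)
    wall = ObjectTrack("wall", [2.5, 2.5], width=1.0)
    print(predict_next(ball, []))      # no other objects in the scene
    print(predict_next(ball, [wall]))  # wall blocks the ball's path
```

The contrast with a "flat" model is that a flat model would predict the next frame from the whole image at once, with no per-object state to carry forward.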
After training PLATO on videos of simple physical interactions, we found that PLATO passed the tests in our Physical Concepts dataset. We also trained "flat" models that were as large as (or even larger than) PLATO but did not use object-based representations. When we tested these models, we found that they did not pass all of our tests. This suggests that objects are helpful for learning intuitive physics, supporting hypotheses from the developmental literature.
We also wanted to determine how much experience is needed to develop this capacity. Evidence of physical knowledge has been shown in infants as young as two and a half months. How does PLATO compare? By varying the amount of training data PLATO used, we found that it could learn our physical concepts with as little as 28 hours of visual experience. The limited and synthetic nature of our dataset means that we cannot make a like-for-like comparison between the amount of visual experience infants receive and the amount PLATO receives. However, this result suggests that intuitive physics can be learned with relatively little experience if supported by an inductive bias for representing the world as objects.
Finally, we wanted to test PLATO's ability to generalize. In the Physical Concepts dataset, all of the objects in our test set are also present in the training set. What if we tested PLATO with objects it had never seen before? To do this, we used a subset of another synthetic dataset developed by researchers at MIT. This dataset also probes physical knowledge, but with different visual appearances and a set of objects that PLATO had never seen before. PLATO passed without any retraining, even though it was tested on entirely new stimuli.
We hope that this dataset can give researchers a more specific understanding of their models' ability to understand the physical world. In the future, it could be extended to test more aspects of intuitive physics by lengthening the list of physical concepts tested and by using richer visual stimuli, including new object shapes or even real-world videos.