Intuitive physics learning in a deep learning model inspired by developmental psychology

Understanding the physical world is a critical skill that most people use effortlessly. However, this is still a challenge for artificial intelligence. If we want to develop safe and useful systems in the real world, we want these models to share our intuitive sense of physics. But before we can build these models, there is another challenge: How will we measure the ability of these models to understand the physical world? That is, what does it mean to understand the physical world and how can we quantify it?

Fortunately for us, developmental psychologists have spent decades studying what infants know about the physical world. Along the way, they have turned the nebulous notion of physical knowledge into a concrete set of physical concepts, and they have developed the violation-of-expectation (VoE) paradigm to test those concepts in infants. In a VoE experiment, infants are shown a physically possible event and a physically impossible one; looking longer at the impossible event is taken as a sign of surprise.

In our paper published today in Nature Human Behaviour, we extended their work and open-sourced the Physical Concepts dataset. This synthetic video dataset adapts the VoE paradigm to assess five physical concepts: solidity, object persistence, continuity, “unchangeableness,” and directional inertia.
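To make the VoE idea concrete for a model rather than an infant, here is a minimal sketch in Python. It assumes that "surprise" is measured as the total squared prediction error over a video and that a probe counts as passed when the impossible video is more surprising than its matched possible one. The function names and the toy one-dimensional "frames" are hypothetical and chosen only for illustration; they are not taken from the dataset or its tooling.

```python
from typing import Sequence

# Assumption: each frame is a flat vector of numbers (pixels or features).
Frame = Sequence[float]


def surprise(predicted: Sequence[Frame], observed: Sequence[Frame]) -> float:
    """Total squared prediction error over a video (a stand-in for surprise)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same number of frames")
    total = 0.0
    for p_frame, o_frame in zip(predicted, observed):
        total += sum((p - o) ** 2 for p, o in zip(p_frame, o_frame))
    return total


def voe_effect(surprise_possible: float, surprise_impossible: float) -> float:
    """Positive when the impossible video was more surprising than the possible one."""
    return surprise_impossible - surprise_possible


def passes_probe(surprise_possible: float, surprise_impossible: float) -> bool:
    """A probe is passed when the VoE effect is strictly positive."""
    return voe_effect(surprise_possible, surprise_impossible) > 0


# Toy example: a ball's 1-D position per frame.
predicted = [[0.0], [1.0], [2.0], [3.0]]   # model expects smooth motion
possible = [[0.0], [1.0], [2.0], [3.0]]    # ball moves continuously
impossible = [[0.0], [1.0], [5.0], [6.0]]  # ball "teleports": continuity violated

s_possible = surprise(predicted, possible)
s_impossible = surprise(predicted, impossible)
print(passes_probe(s_possible, s_impossible))
```

Comparing matched pairs, rather than looking at raw error alone, keeps the measure focused on the physical violation instead of on how visually complex a given video happens to be.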

With a benchmark for physical knowledge in hand, we turned to the task of building a model capable of learning about the physical world. Again, we looked to developmental psychologists for inspiration. Researchers have not only catalogued what infants know about the physical world, but have also proposed mechanisms that could enable this behavior. Although these accounts vary, they share a central idea: breaking the physical world into a set of objects that evolve over time.

Inspired by this work, we built a system we call PLATO (Physics Learning through Auto-encoding and Tracking Objects). PLATO represents and reasons about the world as a set of objects. It predicts where objects will be in the future based on where they have been in the past and which other objects they interact with.
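The object-based idea can be illustrated with a toy sketch. This is not PLATO's architecture: PLATO learns its predictions from video, whereas the rules below are written by hand and serve only as stand-ins. The sketch assumes a one-dimensional world, constant-velocity motion, and solid objects that stop on contact. The names `ObjectTrack` and `predict_next` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ObjectTrack:
    """History of one object: 1-D centre positions, most recent last."""
    name: str
    positions: List[float]
    width: float = 1.0


def predict_next(track: ObjectTrack, others: List[ObjectTrack]) -> float:
    """Predict the next position from the object's own past and its neighbours."""
    current = track.positions[-1]
    # Own history: extrapolate at constant velocity (zero if only one frame seen).
    velocity = current - track.positions[-2] if len(track.positions) >= 2 else 0.0
    proposed = current + velocity

    # Interactions: a solid object in the path stops the motion at contact.
    for other in others:
        obstacle = other.positions[-1]
        gap = (track.width + other.width) / 2  # centre distance at contact
        if velocity > 0 and current < obstacle and proposed > obstacle - gap:
            proposed = min(proposed, max(current, obstacle - gap))
        elif velocity < 0 and current > obstacle and proposed < obstacle + gap:
            proposed = max(proposed, min(current, obstacle + gap))
    return proposed


ball = ObjectTrack("ball", positions=[0.0, 2.0])
wall = ObjectTrack("wall", positions=[3.0, 3.0])

free_prediction = predict_next(ball, others=[])         # nothing in the way
blocked_prediction = predict_next(ball, others=[wall])  # wall stops the ball
print(free_prediction, blocked_prediction)
```

The point of the sketch is the interface rather than the rules: each prediction is made per object, using that object's history plus the states of the objects around it, instead of predicting the next frame as one undifferentiated image.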

After training PLATO on videos of simple physical interactions, we found that PLATO passed the tests in our Physical Concepts dataset. We also trained “flat” models that were as large as (or even larger than) PLATO but did not use object-based representations. When we tested these models, we found that they did not pass all of our tests. This suggests that objects are helpful for learning intuitive physics, supporting hypotheses from the developmental literature.

We also wanted to determine how much experience was needed to develop this capacity. Evidence of physical knowledge has been shown in infants as young as two and a half months. How does PLATO compare? By varying the amount of training data PLATO used, we found that it could learn our physical concepts with as little as 28 hours of visual experience. Because our dataset is limited and synthetic, we cannot make a like-for-like comparison between the amount of visual experience infants receive and the amount PLATO received. However, this result suggests that intuitive physics can be learned with relatively little experience if supported by an inductive bias for representing the world as objects.

Finally, we wanted to test PLATO’s ability to generalize. In the Physical Concepts dataset, all objects in our test set are also present in the training set. What if we tested PLATO with objects it had never seen before? To do this, we used a subset of another synthetic dataset developed by researchers at MIT. This dataset also probes physical knowledge, albeit with different visual appearances and a set of objects that PLATO had never seen before. PLATO passed without any retraining, even though it was tested on entirely new stimuli.

We hope that this dataset can provide researchers with a more concrete understanding of their model’s abilities to understand the physical world. In the future, this can be extended to test more aspects of intuitive physics by increasing the list of physical concepts tested and using richer visual stimuli, including new object shapes or even real-world videos.
