A New Kind of Computer Vision Can’t Be Tricked by Weird Lighting
Computer vision has come a long way since ImageNet, a large, open-source data set of labeled images, was released in 2009 for researchers to use to train AI. But images with tricky or bad lighting can still confuse algorithms. Researchers have either tried to employ hand-crafted rules about how light interacts with objects or used a data set that covers as many lighting situations as possible. Neither approach scales, because the real world offers a nearly limitless number of combinations of objects and lighting.
A new paper by researchers from MIT and DeepMind details a process that can identify images in different lighting without having to hand-code rules or train on a huge data set. The process, called a rendered intrinsics network (RIN), automatically separates an image into reflectance, shape, and lighting layers. It then recombines the layers into a reconstruction of the original image.
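To make the decompose-and-recombine idea concrete, here is a minimal sketch in plain Python. It is not the researchers' code. It assumes the common intrinsic-image model, in which an image is the per-pixel product of reflectance and shading, and it swaps in a simple Lambertian rule for the shading step, where the real system would use learned networks. All function names and the tiny two-pixel example are hypothetical.

```python
# Illustrative sketch only. A fixed Lambertian rule stands in for the learned
# shading step, and every name here is hypothetical, not taken from the paper.

def shade(normals, light):
    """Shading layer from shape and lighting: per-pixel max(0, normal . light)."""
    return [
        [max(0.0, sum(n_i * l_i for n_i, l_i in zip(normal, light))) for normal in row]
        for row in normals
    ]

def recombine(reflectance, shading):
    """Reconstruction: scale each pixel's RGB reflectance by its scalar shading."""
    return [
        [tuple(channel * s for channel in rgb) for rgb, s in zip(r_row, s_row)]
        for r_row, s_row in zip(reflectance, shading)
    ]

def reconstruction_error(image, reconstruction):
    """Mean squared error over all pixels and channels."""
    total, count = 0.0, 0
    for img_row, rec_row in zip(image, reconstruction):
        for img_px, rec_px in zip(img_row, rec_row):
            for a, b in zip(img_px, rec_px):
                total += (a - b) ** 2
                count += 1
    return total / count

# A 1x2 "image": one pixel faces the light, the other faces sideways.
reflectance = [[(1.0, 0.0, 0.0), (0.0, 0.5, 1.0)]]   # surface color per pixel
normals = [[(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]]       # surface orientation (shape)
light = (0.0, 0.0, 1.0)                              # light direction

shading = shade(normals, light)
reconstruction = recombine(reflectance, shading)
```

In a setup like this, the error between the reconstruction and the original photo can serve as a training signal on its own, which is why no labels for the individual layers are needed.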
To train RIN, the researchers created a data set of five shapes (cubes, spheres, cones, cylinders, and toruses) and rendered each with 10 different orientations and 500 different colors. As a proof of concept, the researchers showed how breaking down an image into the three layers could help a computer identify what an item in an image is, or infer its shape. For example, after being trained on the basic sample shapes, the model learned to handle much more complicated items, such as the classic test models the Stanford bunny, the Utah teapot, and Blender’s Suzanne, without ever seeing labeled examples of them.
Beyond offering a new way to overcome the problem of infinite lighting situations for an image, RIN is also an example of learning with unlabeled data. Most AI still needs labeled data to learn, and preparing it takes hours of repetitive human labor. Finding a way to learn from unlabeled data is one of the next frontiers in artificial intelligence.