How can we teach computers to see? A Q&A with Senior Fellow David Fleet

by CIFAR Jun 3 / 13

Artificial vision systems are enabling computers to interpret images and video with increasing efficacy. They have untold potential to change our everyday lives, from surveillance and autonomous vehicles to navigating vast amounts of information.

Image of an artificial visual system. (Credit: David Fleet)

Senior Fellow David Fleet of CIFAR’s Learning in Machines & Brains program (formerly known as Neural Computation & Adaptive Perception) gave a talk at a Knowledge Circle event in Toronto showing the recent advances in artificial intelligence. He recently spoke with CIFAR Exchange about his area of research.

Q. How do computers recognize images?

Image recognition has been considered an incredibly difficult unsolved problem for decades. In the last decade, researchers have recognized that there are distinctive properties of appearance (sometimes called visual features) that are characteristic of a particular object or type of object. We’ve now figured out how computers can identify these features and, from them, detect many types of objects in images. While progress has been encouraging, we still have a long way to go.

Q. Why is machine learning so important for vision?

One of the key ways that we have been able to discover useful image features has been with the help of new machine learning algorithms that make it possible to sort through vast amounts of data – millions of images. We find that with computer vision, and other areas of computational perception, more data leads to improved performance.

Q. How does teaching a computer to see help us understand how a human brain functions?

Science relies on having hypotheses to test. For vision, we don’t yet have working theories, i.e., competent models that are capable of “seeing” in any way, let alone like humans. Engineering computers to see is one way to understand how to construct a theory of how an organism might see, and such theories can then be explored by psychologists and neuroscientists. For this reason, working on the boundary of engineering and neuroscience is fascinating.

Q. How do you foresee artificial perception will impact the lives of everyday people in the future?

Whenever you find it useful to see, it may be useful to have a computer helping out. This could mean cars that drive by themselves, or smart machines, robots or toys that can sense and understand their environments. Vision is remarkably powerful and general – once we know how to make machines see, they’ll be everywhere. And we’re already seeing this begin to happen – cameras that find faces, smart video games, visual systems to detect drowning people in swimming pools, assisted driving in cars, etc.

Q. How has CIFAR’s program in Learning in Machines & Brains contributed to advances in the field?

The CIFAR LMB program has focused on several key problems in machine learning and perception. Among them has been figuring out how to make computational models called deep neural networks learn effectively. Deep neural networks are, metaphorically, structured in a way that resembles the gross structure of the brain, with layers of processing units (cells) beginning with sensors and ending with sophisticated perception, cognition and action. Understanding how such structures could be trained (i.e., how they learn) has been one of the major questions on which the LMB program has focused. Recent successes on large datasets have been amazing, and are now poised to have a significant impact, not just in research labs but in products like Google’s image-based search.