Event Sensors Bring Just the Right Data to Device Makers

Anatomically, the human eye is like a sophisticated tentacle that reaches out from the brain, with the retina acting as the tentacle's tip and touching everything the person sees. Evolution worked a wonder with this complex nervous structure.
Now, contrast the eye's anatomy to the engineering of the most widely used machine-vision systems today: a charge-coupled device (CCD) or a CMOS imaging chip, each of which consists of a grid of pixels. The eye is orders of magnitude more efficient than these flat-chipped computer-vision kits. Here's why: For any scene it observes, a chip's pixel grid is updated periodically, and in its entirety, as it receives light from the environment. The eye, though, is much more parsimonious, focusing its attention only on a small part of the visual scene at any one time, namely the part of the scene that changes, like the fluttering of a leaf or a golf ball splashing into water.
My company, Prophesee, and our competitors call these changes in a scene "events." And we call the biologically inspired machine-vision systems built to capture these events neuromorphic event sensors. Compared to CCDs and CMOS imaging chips, event sensors respond faster, offer a higher dynamic range, meaning they can capture detail in both dark and bright parts of a scene at the same time, and record quick movements without blur, all while producing new data only when and where an event is sensed, which makes the sensors highly energy- and data-efficient. We and others are using these biologically inspired supersensors to significantly upgrade a wide array of devices and machines, including high-dynamic-range cameras, augmented-reality wearables, drones, and medical robots.
So wherever you look at machines these days, they're starting to look back-and, thanks to event sensors, they're looking back more the way we do.
Event-sensing videos may seem unnatural to humans, but they capture just what computers need to know: motion. Prophesee
Event Sensors vs. CMOS Imaging Chips
Digital sensors inspired by the human eye date back decades. The first attempts to make them were in the 1980s at the California Institute of Technology. Pioneering electrical engineers Carver A. Mead, Misha Mahowald, and their colleagues used analog circuitry to mimic the functions of the excitable cells in the human retina, resulting in their "silicon retina." In the 1990s, Mead cofounded Foveon to develop neurally inspired CMOS image sensors with improved color accuracy, less noise at low light, and sharper images. In 2008, camera maker Sigma purchased Foveon and continues to develop the technology for photography.
A number of research institutions continued to pursue bioinspired imaging technology through the 1990s and 2000s. In 2006, a team at the Institute of Neuroinformatics at the University of Zurich built the first practical temporal-contrast event sensor, which captured changes in light intensity over time. By 2010, researchers at the Seville Institute of Microelectronics had designed sensors that could be tuned to detect changes in either space or time. Also in 2010, my group at the Austrian Institute of Technology, in Vienna, combined temporal-contrast detection with photocurrent integration at the pixel level, so that each individual pixel could both detect relative changes in intensity and acquire absolute light levels. More recently, in 2022, a team at the Institut de la Vision, in Paris, and their spin-off, Pixium Vision, applied neuromorphic sensor technology to a biomedical application: a retinal implant to restore some vision to blind people. (Pixium has since been acquired by Science Corp., the Alameda, Calif.-based maker of brain-computer interfaces.)
RELATED: Bionic Eye Gets a New Lease on Life
Other startups that pioneered event sensors for real-world vision tasks include iniVation in Zurich (which merged with SynSense in China), CelePixel in Singapore (now part of OmniVision), and my company, Prophesee (formerly Chronocam), in Paris.
TABLE 1: Who's Developing Neuromorphic Event Sensors
| Date released | Company | Sensor | Event pixel resolution | Status |
|---|---|---|---|---|
| 2023 | OmniVision | Celex VII | 1,032 x 928 | Prototype |
| 2023 | Prophesee | GenX320 | 320 x 320 | Commercial |
| 2023 | Sony | Gen3 | 1,920 x 1,084 | Prototype |
| 2021 | Prophesee & Sony | IMX636/637/646/647 | 1,280 x 720 | Commercial |
| 2020 | Samsung | Gen4 | 1,280 x 960 | Prototype |
| 2018 | Samsung | Gen3 | 640 x 480 | Commercial |
Among the leading CMOS image sensor companies, Samsung was the first to present its own event-sensor designs. Today other major players, such as Sony and OmniVision, are also exploring and implementing event sensors. Among the wide range of applications that companies are targeting are machine vision in cars, drone detection, blood-cell tracking, and robotic systems used in manufacturing.
How an Event Sensor Works
To grasp the power of the event sensor, consider a conventional video camera recording a tennis ball crossing a court at 150 kilometers per hour. Depending on the camera, it will capture 24 to 60 frames per second, which can undersample the fast motion, because the ball travels a large distance between frames, and can introduce motion blur, because the ball keeps moving during each frame's exposure time. At the same time, the camera essentially oversamples the static background, such as the net and other parts of the court that don't move.
If you then ask a machine-vision system to analyze the dynamics in the scene, it has to rely on this sequence of static images-the video camera's frames-which contain both too little information about the important things and too much redundant information about things that don't matter. It's a fundamentally mismatched approach that's led the builders of machine-vision systems to invest in complex and power-hungry processing infrastructure to make up for the inadequate data. These machine-vision systems are too costly to use in applications that require real-time understanding of the scene, such as autonomous vehicles, and they use too much energy, bandwidth, and computing resources for applications like battery-powered smart glasses, drones, and robots.
Ideally, an image sensor would use high sampling rates for the parts of the scene that contain fast motion and changes, and slow rates for the slow-changing parts, with the sampling rate going to zero if nothing changes. This is exactly what an event sensor does. Each pixel acts independently and determines the timing of its own sampling by reacting to changes in the amount of incident light. The entire sampling process is no longer governed by a fixed clock with no relation to the scene's dynamics, as with conventional cameras, but instead adapts to subtle variations in the scene.

Let's dig deeper into the mechanics. When the light intensity on a given pixel crosses a predefined threshold, the system records the time with microsecond precision. This time stamp and the pixel's coordinates in the sensor array form a message describing the "event," which the sensor transmits as a digital data package. Each pixel can do this without the need for an external intervention such as a clock signal and independently of the other pixels. Not only is this architecture vital for accurately capturing quick movements, but it's also critical for increasing an image's dynamic range. Since each pixel is independent, the lowest light in a scene and the brightest light in a scene are simultaneously recorded; there's no issue of over- or underexposed images.
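To make those mechanics concrete, here is a minimal Python sketch of one independent pixel: it compares the log of its current light level against the level stored at its last event, and emits a time-stamped event message when the difference exceeds a contrast threshold. The class names, field layout, and threshold value are illustrative assumptions, not the circuitry or data format of any particular sensor.

```python
from dataclasses import dataclass
import math

@dataclass
class Event:
    x: int          # pixel column in the sensor array
    y: int          # pixel row in the sensor array
    t_us: int       # time stamp, in microseconds
    polarity: int   # +1 for an increase in light, -1 for a decrease

class EventPixel:
    """Illustrative model of one independent event-sensor pixel.

    The pixel remembers the log intensity at its last event and fires a
    new event whenever the current log intensity moves away from that
    reference by more than a contrast threshold.
    """

    def __init__(self, x, y, threshold=0.25):
        self.x, self.y = x, y
        self.threshold = threshold      # relative (log-intensity) contrast step
        self.ref_log_i = None           # log intensity at the last event

    def sample(self, intensity, t_us):
        """Evaluate this pixel's light level; return an Event or None."""
        log_i = math.log(max(intensity, 1e-9))
        if self.ref_log_i is None:
            self.ref_log_i = log_i      # first sample just sets the reference
            return None
        delta = log_i - self.ref_log_i
        if abs(delta) < self.threshold:
            return None                 # no change worth reporting: no data
        self.ref_log_i = log_i          # update reference and emit an event
        return Event(self.x, self.y, t_us, +1 if delta > 0 else -1)
```

In a real sensor this comparison happens continuously in analog circuitry inside every pixel, so no clock has to poll the array; the sketch only mirrors the logic of when data gets produced.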
The output generated by a video camera equipped with an event sensor is not a sequence of images but rather a continuous stream of individual pixel data, generated and transmitted based on changes happening in the scene. Since most pixels in many scenes do not change very often, event sensors promise to save energy compared to conventional CMOS imaging, especially when you include the energy of data transmission and processing. For many tasks, our sensors consume about a tenth the power of a conventional sensor. Certain tasks, for example eye tracking for smart glasses, require even less energy for sensing and processing. In the case of the tennis ball, where the changes cover only a small fraction of the field of vision, the data to be transmitted and processed is tiny compared with what a conventional sensor produces, and the advantages of an event-sensor approach are enormous: perhaps five or even six orders of magnitude.
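The back-of-envelope arithmetic below shows where a figure like that can come from. Every number in it is an assumption chosen for illustration (the sensor resolution, the frame rate a conventional camera would need to match microsecond time stamps, how many pixels the ball covers, and the event rate and event size), not a measurement.

```python
# Back-of-envelope comparison for the tennis-ball example.
# All numbers below are illustrative assumptions, not measured figures.

width, height = 1280, 720        # assumed sensor resolution
bytes_per_pixel = 1              # 8-bit grayscale frames
bytes_per_event = 8              # coordinates + time stamp + polarity

# To match the microsecond-scale timing of an event sensor, a frame camera
# would have to capture full frames on the order of a million times per second.
frame_rate = 1_000_000
frame_data_per_s = width * height * bytes_per_pixel * frame_rate

# The ball covers only a few hundred pixels, and each of them fires a
# limited number of events per second as the ball passes through.
ball_pixels = 300
events_per_pixel_per_s = 2_000
event_data_per_s = ball_pixels * events_per_pixel_per_s * bytes_per_event

print(f"frame-based: {frame_data_per_s / 1e9:.1f} GB/s")
print(f"event-based: {event_data_per_s / 1e6:.1f} MB/s")
print(f"ratio: {frame_data_per_s / event_data_per_s:.0f}x")
```

With these assumptions, the frame camera would have to move hundreds of gigabytes per second to carry the same timing information that a few megabytes per second of events provide, a gap of roughly five orders of magnitude.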
Event Sensors in Action
To imagine where we will see event sensors in the future, think of any application that requires a fast, energy- and data-efficient camera that can work in both low and high light. For example, they would be ideal for edge devices: Internet-connected gadgets that are often small, have power constraints, are worn close to the body (such as a smart ring), or operate far from high-bandwidth, robust network connections (such as livestock monitors).
Event sensors' low power requirements and ability to detect subtle movement also make them ideal for human-computer interfaces-for example, in systems for eye and gaze tracking, lipreading, and gesture control in smartwatches, augmented-reality glasses, game controllers, and digital kiosks at fast food restaurants.
For the home, engineers are testing wall-mounted event sensors in health monitors for the elderly, to detect when a person falls. Here, event sensors have another advantage-they don't need to capture a full image, just the event of the fall. This means the monitor sends only an alert, and the use of a camera doesn't raise the usual privacy concerns.
Event sensors can also augment traditional digital photography. Such applications are still in the development stage, but researchers have demonstrated that when an event sensor is used alongside a phone's camera, the extra information about the motion within the scene as well as the high and low lighting from the event sensor can be used to remove blur from the original image, add more crispness, or boost the dynamic range.
Event sensors could also compensate for motion in the other direction: the camera's own shake. Currently, cameras rely on electromechanical stabilization technologies to keep the camera steady. Event-sensor data can be used to algorithmically produce a steady image in real time, even as the camera shakes. And because event sensors record data at microsecond intervals, faster than the fastest CCD or CMOS image sensors, it's also possible to fill in the gaps between the frames of traditional video capture. This can effectively boost the frame rate from tens of frames per second to tens of thousands, enabling ultraslow-motion video on demand after the recording has finished. Two obvious applications of this technique are helping referees at sporting events resolve questions right after a play, and helping authorities reconstruct the details of traffic collisions.
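The sketch below shows the basic idea behind filling in those gaps, under heavy simplifying assumptions: the scene is grayscale, the events arrive in time order, and each event is treated as one fixed step in log intensity at its pixel. The function name and the contrast value are hypothetical, and production reconstruction algorithms are considerably more sophisticated.

```python
import numpy as np

def synthesize_frame(last_frame, events, t_target_us, contrast=0.25):
    """Roughly reconstruct the scene at time t_target_us from the most
    recent conventional frame plus the events recorded since it.

    last_frame   -- 2D uint8 array from the frame camera
    events       -- iterable of (x, y, t_us, polarity) tuples, time-ordered
    t_target_us  -- time of the intermediate frame we want to synthesize
    """
    # Work in log intensity, since each event marks one relative contrast step.
    log_img = np.log(last_frame.astype(np.float64) + 1.0)
    for x, y, t_us, polarity in events:
        if t_us > t_target_us:
            break                              # later events don't apply yet
        log_img[y, x] += polarity * contrast   # nudge this pixel up or down
    return np.clip(np.exp(log_img) - 1.0, 0, 255).astype(np.uint8)
```

Calling this function with a series of target times between two conventional frames yields a series of intermediate images, which is the essence of boosting the effective frame rate after the fact.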
An event sensor records and sends data only when light changes more than a user-defined threshold. The size of the arrows in the video at right conveys how fast different parts of the dancer and her dress are moving. Prophesee
Meanwhile, a wide range of early-stage inventors are developing applications of event sensors for situational awareness in space, including satellite and space-debris tracking. They're also investigating the use of event sensors for biological applications, including microfluidics analysis and flow visualization, flow cytometry, and contamination detection for cell therapy.
But right now, industrial applications of event sensors are the most mature. Companies have deployed them in quality control on beverage-carton production lines, in laser-welding robots, and in Internet of Things devices. And developers are working on using event sensors to count objects on fast-moving conveyor belts, provide visual-feedback control for industrial robots, and make touchless vibration measurements of equipment for predictive maintenance.
The Data Challenge for Event Sensors
There is still work to be done to improve the capabilities of the technology. One of the biggest challenges is in the kind of data event sensors produce. Machine-vision systems use algorithms designed to interpret static scenes. Event data is temporal in nature, effectively capturing the swings of a robot arm or the spinning of a gear, but those distinct data signatures aren't easily parsed by current machine-vision systems.
Engineers can calibrate an event sensor to send a signal only when the number of photons changes more than a preset amount. This way, the sensor sends less, but more relevant, data. In this chart, only changes to the intensity [black curve] greater than a certain amount [dotted horizontal lines] set off an event message [blue or red, depending on the direction of the change]. Note that the y-axis is logarithmic and so the detected changes are relative changes. Prophesee
This is where Prophesee comes in. My company offers products and services that help other companies more easily build event-sensor technology into their applications. So we've been working on making it easier to incorporate temporal data into existing systems in three ways: by designing a new generation of event sensors with industry-standard interfaces and data protocols; by formatting the data for efficient use by a computer-vision algorithm or a neural network; and by providing always-on low-power mode capabilities. To this end, last year we partnered with chipmaker AMD to enable our Metavision HD event sensor to be used with AMD's Kria KV260 Vision AI Starter Kit, a collection of hardware and software that lets developers test their event-sensor applications. The Prophesee and AMD development platform manages some of the data challenges so that developers can experiment more freely with this new kind of camera.
One approach that we and others have found promising for managing the data of event sensors is to take a cue from the biologically inspired neural networks used in today's machine-learning architectures. For instance, spiking neural networks, or SNNs, act more like biological neurons than traditional neural networks do. Specifically, SNNs transmit information only as discrete "spikes" of activity, while traditional neural nets process continuous values. SNNs thus offer an event-based computational approach that is well matched to the way that event sensors capture scene dynamics.
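As a rough illustration of why the two fit together, here is a toy leaky integrate-and-fire neuron in Python. It does work only when an event arrives and stays idle otherwise. The time constant, weight, and threshold are arbitrary values for the sketch; real SNN frameworks and neuromorphic processors are far more elaborate.

```python
import math

class LIFNeuron:
    """Toy leaky integrate-and-fire neuron driven by event time stamps."""

    def __init__(self, tau_us=10_000.0, threshold=1.0, weight=0.3):
        self.tau_us = tau_us        # membrane leak time constant, microseconds
        self.threshold = threshold  # potential at which the neuron fires
        self.weight = weight        # contribution of each incoming event
        self.potential = 0.0
        self.last_t_us = None

    def on_event(self, t_us):
        """Process one incoming spike; return True if the neuron fires."""
        if self.last_t_us is not None:
            # The potential decays between events; nothing runs while idle.
            self.potential *= math.exp(-(t_us - self.last_t_us) / self.tau_us)
        self.last_t_us = t_us
        self.potential += self.weight
        if self.potential >= self.threshold:
            self.potential = 0.0    # reset after firing an output spike
            return True
        return False
```

A burst of events from a moving object drives the potential over threshold and produces an output spike, while sparse noise events simply decay away, which mirrors how an event sensor itself stays quiet when nothing changes.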
Another kind of neural network that's attracting attention is called a graph neural network, or GNN. These networks accept graphs as input data, which means they're useful for any kind of data that can be represented as a mesh of nodes and their connections, for example, social networks, recommendation systems, molecular structures, and the behavior of biological and computer viruses. As it happens, the data that event sensors produce can also be represented as a graph in three dimensions: two of space and one of time. A GNN can effectively compress the graph from an event sensor by picking out features such as 2D images, distinct types of objects, estimates of the direction and speed of objects, and even bodily gestures. We think GNNs will be especially useful for event-based edge-computing applications with limited power, connectivity, and processing. We're currently working to put a GNN almost directly into an event sensor and, eventually, to incorporate both the event sensor and the GNN processing into the same millimeter-scale chip.
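To make the graph idea concrete, the sketch below turns a batch of events into nodes and links pairs of events that are close in both space and time. The radii are arbitrary values chosen for illustration, and a real GNN pipeline would add learned node features and message passing on top of a structure like this.

```python
def build_event_graph(events, spatial_radius=3, temporal_radius_us=2_000):
    """Build a spatiotemporal graph from a batch of events.

    events -- list of (x, y, t_us, polarity) tuples, assumed time-ordered
    Returns (nodes, edges): nodes are the events themselves, and an edge
    (i, j) links two events that are close in both space and time.
    """
    nodes = list(events)
    edges = []
    for j, (xj, yj, tj, _) in enumerate(nodes):
        # Only look back at recent events; older ones are too far away in time.
        for i in range(j - 1, -1, -1):
            xi, yi, ti, _ = nodes[i]
            if tj - ti > temporal_radius_us:
                break
            if abs(xj - xi) <= spatial_radius and abs(yj - yi) <= spatial_radius:
                edges.append((i, j))
    return nodes, edges
```

Because edges only ever connect events that actually occurred, the graph grows with the activity in the scene rather than with the number of pixels, which is what makes this representation attractive for low-power edge devices.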
In the future, we expect to see machine-vision systems that follow nature's successful strategy of capturing the right data at just the right time and processing it in the most efficient way. Ultimately, that approach will allow our machines to see the wider world in a new way, which will benefit both us and them.