Leaky Fun For the Whole Family

by snoofle, from The Daily WTF (#3X4QE)

Those of us who had the luxury of learning to program in C or other languages without automatic garbage collection learned early on the habit of writing the allocation and the deallocation of a block of memory at the same time, and only afterward filling in the code in between. This prevented those nasty I-forgot-to-free-it memory leaks.
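As a minimal sketch of that habit (the function `sum_via_scratch` and its purpose are hypothetical, invented only for illustration), the `malloc` and the matching `free` are written first, and the work between them is filled in afterward:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies src into a scratch buffer and returns the
   sum of its bytes, or -1 if the allocation fails. */
long sum_via_scratch(const unsigned char *src, size_t len)
{
    if (len == 0) {
        return 0;                      /* nothing to allocate or sum */
    }

    /* Step 1: write the allocation ... */
    unsigned char *scratch = malloc(len);
    if (scratch == NULL) {
        return -1;
    }

    /* Step 3: ... and only afterward fill in the code in between. */
    memcpy(scratch, src, len);
    long total = 0;
    for (size_t i = 0; i < len; i++) {
        total += scratch[i];
    }

    /* Step 2: ... written at the same time as the allocation above. */
    free(scratch);
    return total;
}
```

Note that the pairing only covers the straight-line path; any early `return` added later between the two calls needs its own `free`.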

Of course, that doesn't guarantee that memory can't ever leak; it just eliminates the more obvious sources of leakage.

Daniel worked on an installation for a theme park attraction some time back. The task was to use a computer vision system to track a number of moving projection screens, and then send the tracking data to a video system, which moved the image around within the projection raster to keep it static on the screen.

He was using a specialised product designed for industrial machine-vision applications. It was essentially a camera with a small Windows XP machine inside, which ran a custom application developed using a commercial machine vision programming toolkit. The software used the camera to find the positions of four infra-red LEDs and then output their coordinates via Ethernet to the video system.

After the installation was complete, Daniel was back in the home office when he got a call from the park. Apparently the camera software was crashing after around 10 days of uptime. He remoted in and saw that the cause was an out-of-memory failure. This was his worst nightmare.

A nice feature of these boxes was that they had a mechanism to essentially 'lock out' changes from the hard drive. After the system was set up and working, an option was enabled which diverted all hard drive writes to RAM, discarding them on a reboot or power down. This had the advantage that the camera system didn't require any software maintenance, as it would be 'fresh' every time you turned it on, which is great for an installation that could be in place for over 10 years. However, he was concerned about what this might mean for uptime, as any process which repeatedly wrote any significant amount of data to the drive would quickly fill up the 1GB of RAM available on the cameras. Around 60% of this was used by the running process anyway, so there wasn't a huge amount of headroom.

This wasn't a large source of concern when developing the installation as there was an understanding that the system would be powered down every night when the park was closed (which would have been the easy solution to this problem). He'd noticed that the RAM was filling up slowly as the camera ran, but not at a rate which would be dangerous for the target uptime of 24 hours.

Unfortunately, the daily power cycle didn't happen, and any kind of system failure caused the on-site techs to get nervous; clearly this was an issue that would have to be fixed the hard way. He was convinced that the drive lock-out feature was the cause of this issue.

From then on, every 10 days, when the park was closed, Daniel remoted in and tried various steps on the cameras to find the process that was writing to "disk". The complete installation had 6 cameras so he would try different steps on each system to try and diagnose the issue, enabling/disabling various system processes and options within the machine vision development toolkit used to write the application. He would leave it for 10 days and then wait to hear from the onsite techs if the changes had been successful. These failures went on for around 2 months.

Finally, in desperation, Daniel sent the custom machine vision application to the company that developed the programming toolkit, so their developers could analyze it and see if they could point to the process causing the hard drive write.

Around a week later, they emailed back saying, "We couldn't find any hard drive write, but we did locate a small memory leak in one of your routines, around 8 bytes per image frame." The routine in question was in the main image analysis path. The cameras ran at 60 fps, so some quick arithmetic yields:

 8 bytes/frame * 60 fps * 86,400 sec/day * 10 days = 414,720,000 bytes ≈ 0.41 gigabytes ≈ the roughly 40% of the 1GB of RAM not already used by the running process
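The same arithmetic can be checked with a short sketch. The function names are hypothetical, and the 400,000,000-byte headroom in the usage note is an assumption (40% of a decimal 1GB), not a figure measured on the cameras:

```c
#include <stdint.h>

#define SECONDS_PER_DAY 86400ULL

/* Total bytes leaked after `days` of continuous uptime. */
uint64_t leaked_bytes(uint64_t bytes_per_frame, uint64_t fps, uint64_t days)
{
    return bytes_per_frame * fps * SECONDS_PER_DAY * days;
}

/* Days of uptime until the leak alone consumes `headroom_bytes`.
   Returns -1.0 if there is no leak (zero bytes per frame or zero fps). */
double days_until_exhausted(uint64_t headroom_bytes,
                            uint64_t bytes_per_frame, uint64_t fps)
{
    uint64_t bytes_per_day = bytes_per_frame * fps * SECONDS_PER_DAY;
    if (bytes_per_day == 0) {
        return -1.0;
    }
    return (double)headroom_bytes / (double)bytes_per_day;
}
```

With the figures from the story, `leaked_bytes(8, 60, 10)` gives 414,720,000 bytes, and `days_until_exhausted(400000000, 8, 60)` comes out a little under 10 days, which lines up with the crashes after around 10 days of uptime.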

Aha; it was an old-fashioned memory leak after all!
