I remember a long time ago seeing these incredibly eerie photos of completely empty city streets – no cars, people, anything. I can’t find the original source for it, but what the photographer did to create these was take time-lapse photographs of the street over a short time, and then take the median pixels from the set of photos to create an image without any people in it. The idea is that objects like cars and people are transient, and the “typical” median pixel represents the static thing in the scene (in this case, the buildings and streets).

I really liked this idea, and I wondered if something like this could be applied to videos. Could you have some filter that could turn people into ghosts?

Median Filters

In image processing, median filters are typically used as a way to get rid of “salt and pepper” noise. For every given pixel, you take the median of the set of pixels around it, with the hopes that you reject the outlier noise while keeping the underlying image intact.

An example of a median filter applied to an image. (Source: Wikipedia)
An example of a median filter applied to an image. (Source:
    Wikipedia)

With videos, the median can be taken along the time dimension to reduce noise as well, the assumption being that most motion in a video doesn’t happen too fast. When you take the median over a large period of time in this case, then you start rejecting transient objects in the scene as noise. This task is also known as background subtraction.

Finding the Median, Quick

To implement this, we’ll create a list for every pixel location in the video to keep track of the last few pixels that we want to pull the median from. As each frame comes into the filter, we’ll replace the oldest pixel with the newest one, and then find the median given our new list.

But because we’re looking for the median, this means we have to know the ordering of elements in it somehow. If you were to figure this out by sorting all of the lists, it would become expensive! For an HD video, you would have lists to sort! Even if you use a sorting algorithm that runs in time, it becomes restrictively expensive to use a filter size of any significant length.

Thankfully, there’s a way to find the th element of a list in linear time using the quickselect algorithm (which is closely related to quicksort!).

It works like the following: Let’s say we have an unsorted list of 7 elements and want to find the median, or the 4th sorted element in the list. Let’s start by taking the first element in the list, 7, and put everything less than 7 to the left of it, and everything greater to the right.

Now, even though we didn’t sort the list completely, we can now tell that 7 is actually the 5th element in our list! Now if we want to look for the 4th element, we need to look in the partition to the left of 7. Let’s repeat the process, this time using 5 as our pivot.

We still haven’t sorted the list, but we now know that 5 is our median! If you look at the steps we had to find this, it was far fewer than if we had to completely sort the list to find it.

Of course if you’re paying attention this means we can’t easily replace the oldest pixel with the newest. Instead, we’ll just alternate replacing the pixels on either end of each list with the new pixel to make it simple. Call it stochastic.

In C++, this algorithm can be called by using the STL container method, nth_element(). We can also make this faster by finding the median in several lists together in parallel, but I did this on my MacBook and the clang here doesn’t yet support OpenMP~

So what does it look like?

Well, let’s go for a walk:

For this video, the filter was given about two seconds of “memory,” meaning that it would consider all of the pixels from the last two seconds to pull the median from. Because the video is not stable, it has a really hazy, druggy sort of effect – you can make out what’s going on, but it’s very dream-like.

However, when the video is more stable, the effect is much more interesting. For example, here’s the filter applied to a few videos of some hectic streets in Saigon:

The people and vehicles in these videos turn into ghosts – people fade in and out, the bikes leave trails, reminiscent somewhat of the bikes in Akira. It’s very ghostly, and kind of mesmerizing to watch.

Finally, we can use this effect to take a neat selfie:

If you want to play more with this, the code is available here.