
Computer Science 252
Neural Networks

Assignment 5: Image Convolution and Optical Flow

Objectives

    1. Understand the value of optical flow, by using an existing Python package to determine direction of motion with a webcam.
    2. Experience image convolution first-hand, by coding up some convolution kernels yourself.

Part 1: Trying out Optical Flow

For this exercise you will work in a team with another student (or more, depending on how many cameras we can get working), in Parmly 413 (Advanced Lab), sharing a webcam and Linux workstation. So plug your webcam into one of the USB ports on the side of your monitor, and let’s begin.

To get started, open a terminal window and issue the following command (copy-paste is best; note that the percent-sign represents the Unix prompt and should not be included in your command):

% git clone https://github.com/simondlevy/OpenCV-Python-Hacks

Normally we’d like to use IDLE3 for these assignments, but in this case we won’t, because IDLE interacts poorly with OpenCV, making it difficult to close your display window when you’re done. So, instead, in your terminal window, do the following:

    % cd OpenCV-Python-Hacks
    % python3 showflow.py

You should see a camera display pop up, along with a grid of green dots that will turn into lines when you move the camera. If the display doesn’t pop up quickly, there’s probably something wrong with the camera; in that case, let me know, and we’ll arrange for you to get one that works.

As you can probably see, the lines are the optical flow arrows from Geof Barrows’ article and similar presentations on the topic. For the most consistent effect, hold the camera facing downward toward a richly-textured scene like the cushion on a chair, and slide it back and forth, left and right. (Optical flow often works poorly over an undifferentiated surface like a linoleum floor.) You will also see that the response to camera motion is pretty sluggish, lagging behind the actual movement of the camera. Hit the Esc key to quit, and you’ll see a little report confirming a pretty weak frames-per-second (FPS) processing rate. (Digital cameras typically display 30 FPS or higher.)

Fortunately, the showflow script provides an option for speeding things up: you can simply scale down the image to reduce the number of pixels that the optical-flow algorithm has to process. For example, scaling down by half reduces the number of pixels from 640×480 = 307,200 to 320×240 = 76,800, a factor-of-four reduction. You can do this with the -s command-line option:

    % python3 showflow.py -s 2

By scaling down the image this way I was able to get the speed to between 8 and 10 FPS: not very impressive, but good enough to try some experiments.

Part 2: Displaying Optical Flow Information

The optical_flow package used by showflow uses a standard Python packaging convention: the actual code is contained in an __init__.py file inside the folder with the package’s name. Using IDLE or your favorite editor, open up optical_flow/__init__.py and look at the OpticalFlowCalculator class; specifically, the processFrame method. As you can see, this method takes a frame (OpenCV’s term for an image) and returns the average flow in the x and y directions, as well as a new frame (image) containing the flow lines for display.

Your job for this part is to modify the processFrame method to display useful information about the optical flow signal. Minimally, you should be able to use the average x and y flow values to print out messages like LEFT, RIGHT, UP, and DOWN, to report the primary direction in which the camera is moving over the textured surface. If there’s no movement to report, you can just skip printing anything.
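To make that concrete, here’s a minimal sketch of a helper you could call from processFrame. The names xflow and yflow and the threshold value are mine, not the package’s; use whatever the actual code computes:

def report_direction(xflow, yflow, threshold=0.5):
    '''Print the dominant direction of camera motion, if any.

    xflow and yflow are the average flow values that processFrame
    already computes; the threshold suppresses jitter and will need
    tuning. Sign conventions depend on the package, so be ready to
    flip LEFT/RIGHT or UP/DOWN if the reports come out backwards.
    '''
    if max(abs(xflow), abs(yflow)) < threshold:
        return                       # no motion worth reporting

    if abs(xflow) > abs(yflow):
        print('RIGHT' if xflow > 0 else 'LEFT')
    else:
        print('DOWN' if yflow > 0 else 'UP')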

Once you get that working, see whether you can use the values computed in the nested for loop to do a more advanced analysis of the image. Dr. Barrows showed us examples of how an animal can use optical flow to determine whether it is moving TOWARD or AWAY from what it is looking at, so getting those two outputs working would be an interesting goal. Real-time image processing, like much of robotics, can be challenging and even frustrating, so don’t feel bad if you can’t get this part to work consistently.
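If you do take a crack at it, the standard trick is to measure the expansion of the flow field: vectors pointing outward from the image center suggest approach, vectors pointing inward suggest retreat. Here’s a minimal sketch of the idea; the function, its argument format, and the threshold are all mine, not the package’s, so you’ll need to adapt it to whatever the nested loop actually gives you:

def toward_or_away(flow_vectors, width, height, threshold=100):
    '''Print TOWARD or AWAY given a list of ((x, y), (fx, fy)) pairs
    collected in the nested loop: each grid position paired with its
    flow components. Argument names and threshold are illustrative.'''

    cx, cy = width / 2, height / 2    # image center

    # Sum the radial component of each flow vector: positive means
    # the vectors point outward (expansion), negative means inward
    divergence = sum((x - cx) * fx + (y - cy) * fy
                     for (x, y), (fx, fy) in flow_vectors)

    if divergence > threshold:
        print('TOWARD')
    elif divergence < -threshold:
        print('AWAY')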

Part 3: Image Convolution

A nice feature of matplotlib that we haven’t explored yet is its ability to load, analyze, and display images. The example below shows how to get started with this. I used an image of a cat, but you should use whatever image you like:

import matplotlib.image as img
import matplotlib.pyplot as plt

# Load the image as a NumPy array; copy() makes it writable,
# which we'll need when we modify the pixels below
image = img.imread('cat.jpg').copy()

# (rows, columns, color channels)
print(image.shape)

plt.imshow(image)
plt.show()

Putting this code in a script convolution.py and running it, you should see your image appear, as well as a tuple of values representing the size of the image. For example, Python reported my cat image as (720, 858, 3), telling me that it is 720 pixels high (rows), 858 pixels wide (columns), and contains three channels, where each channel represents one color component (Red, Green, Blue) of the image. We NumPy programmers can think of this as a stack of three 720×858 matrices (arrays). To see this, insert the following piece of code before plt.imshow(image):

image[:,:,0] = 0

You should now see a blue-green version of the image, because all the red values (in the first image plane) have been set to zero. (You should remove this line once you’ve seen its effect!)

Now that we understand how images work in matplotlib, it’s time to try some convolutions. At the top of your convolution.py script, add a function showconvo. This function should accept the image object you loaded via imread, as well as an array of nine values representing a convolution kernel. The goal is to complete the function so that it displays the original image, generates a new image by running the convolution on the image with the kernel you give it, and displays the new image below the original.

Now, we know that we can display subplots using plt.subplot (see the lecture notes). What about the convolution? Well, if you google numpy image convolution you’ll find some rather complicated examples of people doing it one image plane at a time. Fortunately for us, the same package we used for optical flow (OpenCV, via import cv2) has a powerful built-in function filter2D that will do the convolution for us in a single line of code, for all three image planes at once. This tutorial has a nice little example of how to call the function on your image and kernel. So your job will be to adapt that code to work with your existing showconvo function (hint: all you need is np.asarray to make your kernel). Once you’re ready to test your completed showconvo function, try passing it two or three of the kernels from the Wikipedia page. Finally, allow showconvo to accept a title as well, showing what kind of kernel you used. For example, the following code

    showconvo(image, np.asarray([[0,-1,0],[-1,5,-1],[0,-1,0]]), 'Sharpen')

gave me a plot showing the original image above the sharpened result.
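If the structure isn’t clear, here’s a minimal sketch of one way showconvo might be organized, assuming OpenCV imports cleanly as cv2 (your layout and titles may differ):

import cv2
import matplotlib.pyplot as plt

def showconvo(image, kernel, title):
    '''Show the original image above its convolution with a kernel.'''

    # filter2D convolves all three color planes at once; ddepth=-1
    # keeps the output pixel type the same as the input's
    convolved = cv2.filter2D(image, -1, kernel)

    plt.subplot(2, 1, 1)       # top panel: original
    plt.title('Original')
    plt.imshow(image)

    plt.subplot(2, 1, 2)       # bottom panel: convolved version
    plt.title(title)
    plt.imshow(convolved)

    plt.show()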

If you can’t get OpenCV working on your computer (it’s a finicky package!), you can fall back on the code in the Homebrew Convolution with NumPy slide from our lecture notes:

    1. Pass in the kernel as a flat array; e.g.,

       showconvo(image, np.asarray([0,-1,0,-1,5,-1,0,-1,0]), 'Sharpen')

    2. In your showconvo function, break up the image into its R, G, and B planes, convolve each plane separately, reconstruct them using numpy.dstack, and turn the floating-point image back into integers before displaying:

       r, g, b = image[:,:,0], image[:,:,1], image[:,:,2]
       cr, cg, cb = convolve(r, kernel), convolve(g, kernel), convolve(b, kernel)

       plt.imshow(np.dstack((cr, cg, cb)).astype('uint8'))
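In case the lecture-notes convolve isn’t handy, here is a minimal sketch of what a single-plane version could look like (a plain nested-loop implementation, not necessarily identical to the lecture version):

import numpy as np

def convolve(plane, kernel):
    '''Convolve one image plane with a flat nine-element kernel.'''

    kernel = np.asarray(kernel).reshape(3, 3)

    rows, cols = plane.shape
    result = np.zeros((rows, cols))

    # Slide the 3x3 kernel over each interior pixel, multiplying the
    # neighborhood elementwise and summing. (Strictly speaking this is
    # cross-correlation, i.e. convolution without flipping the kernel,
    # but the two agree for symmetric kernels, and it's what filter2D
    # computes anyway.)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            result[r, c] = np.sum(plane[r-1:r+2, c-1:c+2] * kernel)

    # Keep values in the displayable 0-255 range
    return np.clip(result, 0, 255)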

Experiment a little with the kernel values, modifying them as needed to get two dramatically different effects (like sharpening and blurring). Then have your convolution.py script call showconvo twice, so that I’ll see one plot like the above, close it, then see the other.
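For instance, with kernels borrowed from the Wikipedia page, the end of your script might look something like this (pick whichever two effects you like):

import numpy as np

# Two dramatically different effects: close the first plot window
# to see the second one
showconvo(image, np.asarray([[0,-1,0],[-1,5,-1],[0,-1,0]]), 'Sharpen')
showconvo(image, np.asarray([[1,1,1],[1,1,1],[1,1,1]]) / 9, 'Box blur')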

What to submit to GitHub

For maximum convenience (for you and me), create an assignment5 folder containing everything: your convolution.py script, the image file that goes with it, and the entire OpenCV-Python-Hacks folder containing your modified __init__.py. For the optical-flow part, it’s fine for all members of your team to submit the same code, but for image convolution you should work independently.