Computer Science 252
Neuromorphic Computing

Assignment 3: The Hopfield Net

Objectives

  1. Understand the Hopfield Network by coding it from scratch in Python.
  2. Understand the limitations of the network through empirical testing.

The good news is, despite the apparent complexity of the formulas we discussed, the actual algorithm for training and testing a Hopfield net is a lot simpler than for an SOM. Together with the NumPy skills you’ve accumulated, this means that the current assignment should be less challenging/frustrating than the previous one. In fact, if I don’t mention a new NumPy function, you can assume that you should use one you already know, perhaps in a slightly different way.

As usual, you’ll have a Python class (Hopfield) with some methods (__init__, learn, test), followed by a main section that uses it (I like the if __name__ == '__main__': idiom). So, let’s get started. Again as usual, we’ll focus on our test cases before even starting to implement the class.

Part 1: Generate some training data

In your main section, write a line of code that uses NumPy to generate an array of ones and zeros. Each row will represent an input vector. To keep things simple, make it five rows and 30 columns, so that you have a relatively small number of patterns (5) that are not too long to print out for debugging.
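If you’re stuck, here’s one way to do it (my choice of numpy.random.randint is a suggestion, not a requirement):

    import numpy as np

    # Five rows (patterns), 30 columns (bits): random zeros and ones
    data = np.random.randint(0, 2, (5, 30))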

Part 2: Display a confusion matrix

In machine learning, a confusion matrix is a table that shows how well your classification algorithm has worked on every possible input. For example, if we’re trying to classify images of digits, such a table would have ten rows and ten columns, and would show how often the algorithm classified one digit as another. A perfect classification would have all positive entries on the diagonal, and all zeros elsewhere.

For this assignment we’re not doing classification, but we can still use the confusion matrix as a measure of success. Specifically, we will use the already-familiar vector cosine to see how well our Hopfield net recovers each pattern. So you should now write a function show_confusion that accepts two data arrays like the one you created in Part 1, and shows a matrix of the vector cosines of their respective rows. (For example, the third row, fourth column will show the vector cosine of the third vector in the first array with the fourth vector in the second array). To avoid big ugly floating-point printout, use the formatted print skills you learned in CSCI 111 to constrain the output to two decimal places. (If you don’t remember formatted printing, GIYF!)
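If you get stuck, here’s a minimal sketch of one possibility (the helper name vec_cosine is mine, not a requirement, and numpy is assumed to be imported as np). Note that it prints only the lower triangle, which is all my output below shows:

    def vec_cosine(u, v):
        # Cosine of the angle between two vectors
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    def show_confusion(data1, data2):
        # Row i shows the cosines of data1[i] with data2[0..i]
        for i, u in enumerate(data1):
            for v in data2[:i+1]:
                print('%.2f' % vec_cosine(u, v), end=' ')
            print()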

How to test your confusion matrix function? Well, if you give it your random data array from Part 1, it should show something like this (think about why):

Part 2: Vector-cosine confusion matrix of an array with itself ----------------------

1.00 
0.59 1.00 
0.40 0.38 1.00 
0.49 0.63 0.45 1.00 
0.67 0.63 0.45 0.59 1.00 

Since the vectors are all non-negative, the largest possible cosine value is still 1, but the cosine between two random vectors is around 0.5 instead of 0: two random 0/1 vectors agree on a 1 in about a quarter of their positions, so their dot product is about n/4, while each vector’s squared length is about n/2, giving an expected cosine of roughly (n/4)/(n/2) = 0.5.

Part 3: Noise it up!

To test the ability of our Hopfield net to recover degraded (noisy) patterns, we’ll need a function to add some noise (random bit flips) to our data array. So write a function noisy_copy that accepts an array like the one from Part 1, as well as a probability between 0 and 1, and flips each bit in the array with that probability. (E.g., if the probability is 0.5, then there’s a 50 percent chance that a bit will be flipped.) If you’re good with NumPy, you can do this without an explicit loop; if not, feel free to write a loop. Either way, you’ll want to use numpy.copy to avoid clobbering the values in the original array, and you’ll want to test your noisy_copy function on small arrays at first. (A good test would be no change with a probability of 0, and fully changed with a probability of 1.)
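If the loop-free version proves elusive, here’s one masking approach (a sketch, assuming numpy is imported as np; an explicit loop is just as acceptable):

    def noisy_copy(data, prob):
        # Flip each bit independently with probability prob
        result = np.copy(data)
        flip = np.random.random(data.shape) < prob   # True where a bit flips
        result[flip] = 1 - result[flip]
        return result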

Once you’ve got your noise function working, test it again by using your confusion-matrix function to show the confusion matrix for your original data array and various noisy copies of itself. As more noise is added, the values on the confusion-matrix diagonal should drop from 1 down to 0, with the non-diagonal values pretty much unchanged. For your final output on this part, use a noise value of 0.25. Here’s my output:

Part 3: Confusion matrix with 25 percent noise ------------------------------------

0.85 
0.57 0.91 
0.45 0.39 0.69 
0.59 0.65 0.63 0.79 
0.65 0.53 0.42 0.42 0.79 

Part 4: Code up your Hopfield net

Now that we’ve got a nice little test suite, it’s time to code up our Hopfield net. At the top of your script, create a class Hopfield with three methods:

  • A constructor that accepts the number of units n (which will also be the size of your input vectors), and builds an n×n NumPy array T of zeros, which are the initial network weights. You can use the numpy.zeros function to do this.
  • A learn method that accepts an array of input patterns like the array from Part 1. This method should loop over the rows of the array, modifying the weights T using the training formula on slide #7 of the lecture notes. Be a Pythonista: do for a in data: to loop over the rows, rather than using range. For the products, rather than looping over the elements of each input pattern vector, you should use the numpy.outer function to compute the matrix of pairwise products, and then add this matrix to T using the formula on the slide. After you’re done looping over the patterns, use numpy.fill_diagonal() to zero out the elements on the diagonal of T.
  • A test method that accepts a single pattern (vector) and a number of iterations (defaulting to a small value like five), and iteratively runs line 3 of the third slide of the lecture notes. We’re cheating here by using a for loop, instead of a while loop based on the energy computation, but that’s okay: we care more about restoring the pattern than about the energy it takes. (A skeletal version of the whole class appears after this list.)
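The formulas themselves are on the slides, so treat the following as a shape check rather than an answer key: a skeletal Hopfield class, assuming the classic 0/1 formulation (bipolar outer products for learning, zero-threshold updates for testing):

    import numpy as np

    class Hopfield:

        def __init__(self, n):
            self.n = n
            self.T = np.zeros((n, n))        # initial weights: all zero

        def learn(self, data):
            # Assumed slide-7 rule: sum of outer products of the
            # bipolar (+/-1) version of each 0/1 pattern
            for a in data:
                b = 2*a - 1
                self.T += np.outer(b, b)
            np.fill_diagonal(self.T, 0)      # no unit connects to itself

        def test(self, pattern, iterations=5):
            # Assumed update rule: a unit goes to 1 if its net input
            # exceeds zero, else 0; repeated for a fixed number of passes
            a = np.copy(pattern)
            for _ in range(iterations):
                a = (self.T @ a > 0).astype(int)
            return a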

Once you’ve coded up your Hopfield net, train it on the 5×30 training data you created above. Then test it by giving it a pattern from this set, to see how well it “recognizes” the pattern, using the vector cosine as a success criterion. Then make a noisy copy of your data, and see how well it restores one of the noisy patterns (compare the recovered noisy pattern with the original clean one). Your output should look something like this:

Part 4: Recovering small patterns with a Hopfield net -----------------------------

Recover pattern, no noise:
Input:  [1 0 1 0 1 1 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1]
Output: [1 0 1 0 1 1 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1]
Vector cosine = 1.00

Recover pattern, 25% noise:
Input:    [1 0 1 0 0 1 1 1 0 1 0 0 0 0 1 0 1 0 0 1 1 1 0 1 0 1 1 1 0 1]
Output:   [1 1 1 0 0 1 0 1 1 0 1 0 1 0 1 0 1 1 1 0 0 1 0 1 0 1 0 1 0 1]
Original: [1 0 1 0 1 1 1 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 0 1 0 1 0 1 0 1]
Vector cosine = 0.71

Note the mediocre results I got on the noisy pattern. Sometimes it was recovered perfectly (cosine = 1.00), but often it was barely better than chance (cosine = 0.5). Sometimes it even failed to recover the non-noisy pattern!
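If the wiring of the main section is unclear, mine looked roughly like the sketch below (it reuses the hypothetical vec_cosine and noisy_copy names from the earlier sketches, and omits the Input/Output/Original printouts):

    # Train on the clean patterns, then test recovery
    net = Hopfield(30)
    net.learn(data)

    original = data[0]
    print('Vector cosine = %.2f' % vec_cosine(net.test(original), original))

    noisy = noisy_copy(data, 0.25)[0]
    print('Vector cosine = %.2f' % vec_cosine(net.test(noisy), original))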

Part 5: Improving the capacity

Personally, I was pretty disappointed with the results from Part 4. Even with a measly five patterns, the Hopfield net didn’t seem to live up to its reputation as a “cleanup memory” for noisy data.

As you may have suspected, a vector length of 30 is nice for debugging, but it’s way too small to work as an input to a Hopfield net. If you think about it, the reason is pretty clear: while the vector length grows as O(N), the number of weights grows as O(N²). So for a 30-input network, there are 900 weights, and the ratio of weights to inputs is 30:1. If we increase the pattern size to 1000, however, there are a million weights, so the ratio of weights to inputs goes way up, becoming 1000:1. Hence the bigger network is bringing a lot more resources to bear on representing the data, and should be able to store more patterns and recover them more robustly.

To see this, repeat Parts 3 and 4, but with 10 patterns of length 1000. As the confusion matrix will show, the vector cosines are about the same, but the network is much better at restoring noisy patterns. Indeed, as the following output shows, I was always able to recover all ten patterns perfectly with 25% noise:

Part 5: Recovering big patterns ----------------------------------------------------

Confusion matrix for 1000-element vectors with 25 percent noise:

0.77 
0.48 0.75 
0.49 0.48 0.75 
0.51 0.54 0.53 0.76 
0.52 0.52 0.51 0.51 0.75 
0.50 0.52 0.50 0.53 0.49 0.73 
0.54 0.55 0.51 0.52 0.55 0.54 0.76 
0.52 0.48 0.52 0.51 0.49 0.49 0.49 0.75 
0.54 0.50 0.50 0.49 0.53 0.53 0.51 0.51 0.75 
0.50 0.49 0.53 0.50 0.51 0.51 0.54 0.52 0.53 0.75 

Recovering patterns with 25 percent noise:

Vector cosine on pattern 0 = 1.00
Vector cosine on pattern 1 = 1.00
Vector cosine on pattern 2 = 1.00
Vector cosine on pattern 3 = 1.00
Vector cosine on pattern 4 = 1.00
Vector cosine on pattern 5 = 1.00
Vector cosine on pattern 6 = 1.00
Vector cosine on pattern 7 = 1.00
Vector cosine on pattern 8 = 1.00
Vector cosine on pattern 9 = 1.00

Extra-credit option

If you find Hopfield nets exciting, I feel sorry for you – excuse me, I mean, if you find Hopfield nets exciting, maybe you would like to try applying one to recovering images in the presence of noise. For example, a 32×32-pixel image can be “flattened” into a vector of length 1024 (around the same size as our successful network in Part 5). You can then train a Hopfield net on several such images, test on a noisy version of an image, reshape the test result to 32×32, and display the original, noisy, and restored versions for comparison. This can be done pretty easily with ASCII images (using a space for 0 and asterisk for 1), or with matplotlib if you’re up for it.
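The ASCII display, for instance, is just a reshape plus a loop; a sketch (show_ascii is a made-up name, and numpy is assumed to be imported as np):

    def show_ascii(vec, rows=32, cols=32):
        # Print a flattened 0/1 image as spaces and asterisks
        for row in vec.reshape(rows, cols):
            print(''.join('*' if pixel else ' ' for pixel in row))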

What to turn in to GitHub

As usual, the only file you need to turn in is your final script (hopfield.py). As usual, I will test it thus:

  % python3 hopfield.py

Your output should go all the way from Part 2 through the end (Part 5, or extra credit). Use my output as a formatting guide.