**Computer Science 252**

**Neural Networks**

**Assignment 4: Sparse Distributed Memory**

**Objectives**

- Understand Kanerva’s Sparse Distributed Memory by coding it from scratch in Python.
- Reproduce the image-restoration results in Denning (1989).

As usual, you’ll have a Python class (`SDM`) with some methods (`__init__`, `enter`, `lookup`), followed by a main section that uses it (I like the `if __name__ == '__main__':` idiom). So let’s get started. Again as usual, we’ll focus on our test cases before even starting to implement the class.

**Part 1: Write your display function**

Because SDM works with vectors, we will represent a square image as a flat vector. For example, the 16×16-pixel images in Denning (1989) will be represented as a vector of 256 zeros and ones. To display these vectors as square images, we will need a plotting function that accepts not just the vector, but the number of columns to plot. (Otherwise, it does not know that a 256-element vector represents a 16×16 image, or an 8×32 image, etc.)

So write a function `plot` that takes an array of zeros and ones, and the desired number of columns, and plots the array using asterisks for the ones and blank space for the zeros. As a test case, here is an example of me calling my `plot` function on a random bit vector, followed by the results:

```python
testpat = np.random.randint(0, 2, 256)  # random_integers is deprecated; randint's upper bound is exclusive
plot(testpat, 16)
```

*(followed by a 16×16 grid of randomly scattered asterisks and blanks)*

As in previous assignments, your actual output should be clearly labeled `Part 1`, `Part 2`, etc.

**Part 2: One Ring to Rule Them All**

Now write a function `ring`. This function doesn’t have to take any inputs; it should just return a numpy array of 256 bits that, when plotted with your `plot` function, looks like a ring:

```python
r = ring()
plot(r, 16)
```

*(followed by a 16×16 asterisk image of a ring)*

Rather than trying to devise an algorithm for generating a ring, you can just have your function return “hard-coded” values using `np.asarray`. Laying out the array one row at a time will make this easier. Simple illustration (with a much smaller array):

```python
return np.asarray([0, 1, 0,
                   1, 0, 1,
                   0, 1, 0])
```
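If typing out 256 literals feels error-prone, here is one way to keep the values hard-coded while staying readable: lay out the rows as strings and convert them to bits. (The particular ring shape below is illustrative, not a copy of Denning’s figure.)

```python
import numpy as np

def ring():
    """Return a hard-coded 256-bit ring pattern (16 rows of 16; '*' = 1)."""
    rows = [
        "................",
        "................",
        "......****......",
        "....**....**....",
        "...*........*...",
        "..*..........*..",
        "..*..........*..",
        "..*..........*..",
        "..*..........*..",
        "..*..........*..",
        "...*........*...",
        "....**....**....",
        "......****......",
        "................",
        "................",
        "................",
    ]
    # Flatten row-by-row into a single 256-element 0/1 array.
    return np.asarray([1 if c == '*' else 0 for row in rows for c in row])
```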

**Part 3: Code up your SDM**

Now that we have a nice visualization tool, it’s time to code up our actual network. Because we’re implementing just auto-associators for this assignment, your `__init__` method need only specify the number of physical addresses *p* and the vector size *n*. It should create a *p*×*n* random array of zeros and ones for the addresses, and a *p*×*n* array of zeros for the data counters.

To complete your SDM implementation, you’ll need to code up the `enter` and `lookup` methods. The `enter` method should accept a new address bit vector of length *n* and use it as both the address and the data for the algorithm on page 8 of the lecture notes. Your `lookup` method should accept such a vector, use the algorithm on page 9 of the notes to produce a new data value, and return that data value.

As usual, we’ll first train and test our network on a pattern that we know should be restored perfectly: our clean ring pattern from the previous step. For this pattern the size *n* is of course 256, but what about the neighborhood size (“radius”) and the number of physical addresses? Recalling Kanerva’s advice about using a neighborhood of 451 for vectors of length 1,000, you can just go back to your constructor and have it store the value `0.451*n` as the radius. For the number of physical addresses, Kanerva’s recommended million is probably overkill for this problem. I got good results with just 2,000 (two thousand) physical addresses. As discussed by Denning, the “radius” metric can be implemented as *Hamming distance*: the Hamming distance between two vectors is simply the number of positions in which they differ. As usual, exploit the power of NumPy to implement your Hamming-distance computation without a loop. (*Hint*: you can sum over the results of a binary comparison.)
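Putting these pieces together, here is a minimal sketch of the whole class. It assumes the standard Kanerva write/read rules (activate every hard location within the Hamming radius of the key; write by adding +1 for a 1 bit and −1 for a 0 bit to the activated counters; read by summing the activated counters and thresholding at zero), which is what the lecture notes’ pages 8–9 presumably describe; check the details against your notes.

```python
import numpy as np

class SDM:
    def __init__(self, p, n):
        self.p, self.n = p, n
        self.radius = 0.451 * n                           # Kanerva's neighborhood size
        self.addresses = np.random.randint(0, 2, (p, n))  # p random hard locations
        self.data = np.zeros((p, n))                      # counters, initially zero

    def _near(self, addr):
        # Hamming distance from addr to every hard location, with no explicit
        # loop: summing a binary comparison counts the differing positions.
        dists = np.sum(self.addresses != addr, axis=1)
        return dists <= self.radius                       # mask of activated rows

    def enter(self, vec):
        """Write vec (both address and data): +1 for each 1 bit, -1 for each 0."""
        self.data[self._near(vec)] += 2 * vec - 1

    def lookup(self, vec):
        """Read: sum the counters over the activated rows, threshold at zero."""
        sums = np.sum(self.data[self._near(vec)], axis=0)
        return (sums > 0).astype(int)
```

The Part 3 round-trip test then becomes `m = SDM(2000, 256); m.enter(r); plot(m.lookup(r), 16)`, which should reproduce the ring exactly.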

So now you can test your SDM: Enter the perfect ring vector, then use it as a key to retrieve a new vector. Plotting the new vector, you should see an image identical to the key you used to retrieve it.

**Part 4: Noise It Up!**

As you probably realize, the next step is to see how well our SDM cleans up a noisy version of our pattern. As you’ve done previously, write a little `noisy_copy` function that takes a vector and a probability, and returns a copy of the vector with bits flipped according to that probability. Then plot your noisy ring, use it as a key to your SDM `lookup` method, and plot the resulting vector. Your output should look something like this:
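A minimal sketch of `noisy_copy`, assuming independent per-bit flips:

```python
import numpy as np

def noisy_copy(vec, prob):
    """Return a copy of vec with each bit flipped independently with probability prob."""
    flips = np.random.random(len(vec)) < prob   # True where a bit should flip
    return np.where(flips, 1 - vec, vec)
```

`np.where` does the flipping vector-wide: wherever `flips` is true it takes `1 - vec` (the flipped bit), and elsewhere it keeps the original bit.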

Part 4: Recover pattern after 25% noise added:

*(16×16 asterisk plots of the noisy probe and the recovered ring)*

**Part 5: Solving Plato’s Problem**\*

How is it that we can have an abstract idea of a concept (like a shape), even if we’ve only ever seen imperfect versions of it? As Denning shows, this ancient conundrum can be addressed to a surprising extent by SDM. So for this concluding exercise we will train our SDM not with a single, ideal representation of a ring, but with several degraded, noisy versions, in the hope of seeing it *induce* an ideal form from them. We will likewise probe (`lookup`) with a noisy ring, to see how well the induction works.

SDM is pretty cool, but it’s not magic. So for this part we’ll use a more modest noise value of ten percent (0.1) to create the training patterns and the probe. Here’s what I got:
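The experiment might be wired up as below. To keep the snippet self-contained it repeats condensed versions of the earlier sketches; all the names (`SDM`, `noisy_copy`, the write/read rules) are assumptions carried over from those sketches, not the official solution.

```python
import numpy as np

# Condensed repeats of the earlier sketches, so this snippet runs on its own.
class SDM:
    def __init__(self, p, n):
        self.radius = 0.451 * n
        self.addresses = np.random.randint(0, 2, (p, n))
        self.data = np.zeros((p, n))
    def _near(self, a):
        return np.sum(self.addresses != a, axis=1) <= self.radius
    def enter(self, v):
        self.data[self._near(v)] += 2 * v - 1
    def lookup(self, v):
        return (np.sum(self.data[self._near(v)], axis=0) > 0).astype(int)

def noisy_copy(vec, prob):
    flips = np.random.random(len(vec)) < prob
    return np.where(flips, 1 - vec, vec)

ideal = np.zeros(256, dtype=int)   # stand-in for ring(): any fixed pattern works
ideal[32:64] = 1

m = SDM(2000, 256)
for _ in range(5):                 # train on five degraded examples only --
    m.enter(noisy_copy(ideal, 0.1))  # the memory never sees the ideal pattern

probe = noisy_copy(ideal, 0.1)     # probe with yet another noisy copy
restored = m.lookup(probe)
print(np.sum(restored != ideal), "bits differ from the ideal pattern")
```

In your script, `ideal` would of course be `ring()`, and each training example, the probe, and the result would be shown with `plot`.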

Part 5: Learn with the following five noisy examples

*(five 16×16 plots of noisy rings)*

Test with the following probe:

*(one noisy ring plot)*

Result:

*(the restored ring)*

**What to turn in to Sakai**

As usual, just a single script, **sdm.py**. Your script needs to display only Parts 4 and 5 above.

**Extra-credit options**

- Use matplotlib’s `subplot` function to generate the multi-plot display shown in Denning’s second figure.
- Augment your SDM class for hetero-associative memory; i.e., allow a user to specify the number of physical addresses, the size of the address vector, and the size of the data vector. You can then set up the auto-associator as a subclass, or use optional named arguments. Use your hetero-associator to tackle a problem similar to the image-storage problem above: for example, you could use image thumbnails (smaller copies) as your addresses and full-size images as data, or vice versa.

\* *I borrowed this name from a paper we will read soon.*