#Hopfield Net
 So far, the neural networks we have used for computation have all been feedforward structures
#Loopy network
 Each neuron is a perceptron with a +1/-1 output
 Every neuron receives input from every other neuron
 Every neuron outputs signals to every other neuron
 At each time, each neuron receives a "field" $\sum_{j \neq i} w_{j i} y_{j}+b_{i}$
 If the sign of the field matches its own sign, it does not respond
 If the sign of the field opposes its own sign, it "flips" to match the sign of the field
 Which will change the field at other nodes
 Which may then flip... and so on...
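 A minimal sketch of this asynchronous update rule (illustrative code, not from the notes), assuming a float vector `y` of ±1 outputs, a symmetric weight matrix `W` with zero diagonal, and a bias vector `b`:

```python
import numpy as np

def update_neuron(y, W, b, i):
    """Let neuron i respond to its current field: flip only if the field opposes its sign."""
    field = W[:, i] @ y + b[i]              # sum_{j != i} w_ji * y_j + b_i  (w_ii = 0 assumed)
    if field != 0 and np.sign(field) != y[i]:
        y[i] = np.sign(field)               # flip to match the sign of the field
    return y
```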
#Flip behavior

Let $y_{i}^{-}$ be the output of the $i$th neuron just before it responds to the current field

Let $y_{i}^{+}$ be the output of the $i$th neuron just after it responds to the current field

If $y_{i}^{-}=\operatorname{sign}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)$, then $y_{i}^{+} = y_{i}^{-}$

If the sign of the field matches its own sign, it does not flip

$$ y_{i}^{+}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)-y_{i}^{-}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)=0 $$


If $y_{i}^{-}\neq\operatorname{sign}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)$, then $y_{i}^{+} = -y_{i}^{-}$

$$ y_{i}^{+}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)-y_{i}^{-}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)=2 y_{i}^{+}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right) $$

This term is always positive!


Every flip of a neuron is guaranteed to locally increase $y_{i}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)$
#Globally
 Consider the following sum across all nodes
$$
\begin{array}{c}
D\left(y_{1}, y_{2}, \ldots, y_{N}\right)=\sum_{i} y_{i}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right) \\
=\sum_{i, j \neq i} w_{i j} y_{i} y_{j}+\sum_{i} b_{i} y_{i}
\end{array}
$$
 Assume $w_{ii} = 0$
 For any unit $k$ that "flips" because of the local field
$$ \Delta D\left(y_{k}\right)=D\left(y_{1}, \ldots, y_{k}^{+}, \ldots, y_{N}\right)-D\left(y_{1}, \ldots, y_{k}^{-}, \ldots, y_{N}\right) $$
$$ \Delta D\left(y_{k}\right)=\left(y_{k}^{+}-y_{k}^{-}\right)\left(\sum_{j \neq k} w_{j k} y_{j}+b_{k}\right) $$
 This is always positive!
 Every flip of a unit results in an increase in $D$
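 As a quick numerical check of this claim (a sketch with randomly chosen weights; bias-free for simplicity, as the notes assume below):

```python
import numpy as np

def D(y, W):
    """D(y) = sum_i y_i * sum_{j != i} w_ji y_j, assuming w_ii = 0 and no bias."""
    return y @ (W @ y)

rng = np.random.default_rng(0)
N = 8
W = rng.normal(size=(N, N))
W = (W + W.T) / 2                               # symmetric weights
np.fill_diagonal(W, 0.0)                        # w_ii = 0
y = rng.choice([-1.0, 1.0], size=N)

for k in range(N):
    field = W[:, k] @ y
    if field != 0 and np.sign(field) != y[k]:   # unit k would flip
        before = D(y, W)
        y[k] = np.sign(field)
        assert D(y, W) > before                 # every flip increases D
```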
#Overall
 Flipping a unit will result in an increase (non-decrease) of
$$ D=\sum_{i, j \neq i} w_{i j} y_{i} y_{j}+\sum_{i} b_{i} y_{i} $$
 $D$ is bounded
$$ D_{\max }=\sum_{i, j \neq i}\left|w_{i j}\right|+\sum_{i}\left|b_{i}\right| $$
 The minimum increment of $D$ in a flip is
$$ \Delta D_{\min }=\min _{i,\left\{y_{i}, i=1 \ldots N\right\}} 2\left|\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right| $$
 Any sequence of flips must converge in a finite number of steps
 Think of this as an infinitely deep network where the weights at every layer are identical
 Find the maximum layer!
#The Energy of a Hopfield Net
 Define the Energy of the network as
$$ E=-\sum_{i, j \neq i} w_{i j} y_{i} y_{j}-\sum_{i} b_{i} y_{i} $$
 Just the negative of $D$
 The evolution of a Hopfield network constantly decreases its energy
 This is analogous to the potential energy of a spin glass (magnetic dipoles)
 The system will evolve until the energy hits a local minimum
 We drop the bias term below for simplicity
 The network will evolve until it arrives at a local minimum in the energy contour
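 A sketch of the energy computation under this convention (the quadratic term counts each pair in both orders, matching $D$ above); the function name is mine:

```python
import numpy as np

def hopfield_energy(y, W, b=None):
    """E = -sum_{i, j != i} w_ij y_i y_j - sum_i b_i y_i, assuming w_ii = 0.
    Note: many texts put a factor of 1/2 on the quadratic term; this follows the notes' convention."""
    quad = y @ (W @ y)
    return -quad if b is None else -quad - b @ y
```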
#Content-addressable memory
 Each of the minima is a "stored" pattern
 If the network is initialized close to a stored pattern, it will inevitably evolve to the pattern
 This is a content-addressable memory
 Recall memory content from partial or corrupt values
 Also called associative memory
 Evolve and recall pattern by content, not by location
#Evolution
 The network will evolve until it arrives at a local minimum in the energy contour
 We proved that every change in the network results in a decrease in energy
 So the path to the energy minimum is monotonic
#For a 2-neuron net
 Symmetric
 $\frac{1}{2} \mathbf{y}^{T} \mathbf{W} \mathbf{y}=\frac{1}{2}(-\mathbf{y})^{T} \mathbf{W}(-\mathbf{y})$
 If $\hat{y}$ is a local minimum, so is $-\hat{y}$
#Computational algorithm
 Very simple
 Updates can be done sequentially, or all at once
 Convergence when it does not change significantly any more
$$ E=-\sum_{i} \sum_{j>i} w_{j i} y_{j} y_{i} $$
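 A minimal sketch of this algorithm with sequential updates (one neuron at a time, which is what the flip-by-flip argument above analyzes); the names and stopping rule are illustrative:

```python
import numpy as np

def hopfield_evolve(y, W, b=None, max_sweeps=100):
    """Sweep over the neurons, flipping each to match its field, until a full sweep changes nothing."""
    y = y.copy()
    b = np.zeros(len(y)) if b is None else b
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(y)):
            field = W[:, i] @ y + b[i]               # w_ii = 0 assumed
            if field != 0 and np.sign(field) != y[i]:
                y[i] = np.sign(field)
                changed = True
        if not changed:                              # no flips in a full sweep: a local energy minimum
            break
    return y
```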
#Issues
#Store a specific pattern
 A network can store multiple patterns
 Every stable point is a stored pattern
 So we could design the net to store multiple patterns
 Remember that every stored pattern $P$ is actually two stored patterns, $P$ and $-P$
 How could the quadratic function have multiple minima? (A quadratic function is convex)
 The inputs are constrained: they belong to $\{-1,1\}$
 Hebbian learning: $w_{j i}=y_{j} y_{i}$
 Design a stationary pattern
 $\operatorname{sign}\left(\sum_{j \neq i} w_{j i} y_{j}\right)=y_{i} \quad \forall i$
 So
 $\operatorname{sign}\left(\sum_{j \neq i} w_{j i} y_{j}\right)=\operatorname{sign}\left(\sum_{j \neq i} y_{j} y_{i} y_{j}\right)$
 $\quad=\operatorname{sign}\left(\sum_{j \neq i} y_{j}^{2} y_{i}\right)=\operatorname{sign}\left(y_{i}\right)=y_{i}$
 Energy
 $\begin{aligned} E &=-\sum_{i} \sum_{j<i} w_{j i} y_{j} y_{i}=-\sum_{i} \sum_{j<i} y_{i}^{2} y_{j}^{2} \\ &=-\sum_{i} \sum_{j<i} 1=-0.5 N(N-1) \end{aligned}$
 This is the lowest possible energy value for the network
 Stored pattern has lowest energy
 No matter where it begins, the network will evolve into the stored pattern (the lowest-energy state)
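 A small sketch of the single-pattern case (an arbitrary pattern of my choosing), verifying that the Hebbian weights make it and its negation stationary:

```python
import numpy as np

y = np.array([1.0, -1.0, -1.0, 1.0, -1.0])   # the pattern to store (illustrative)
W = np.outer(y, y)                            # Hebbian rule: w_ji = y_j * y_i
np.fill_diagonal(W, 0.0)                      # no self-connections

assert np.array_equal(np.sign(W @ y), y)      # the stored pattern is stationary
assert np.array_equal(np.sign(W @ -y), -y)    # ...and so is its negation -y
```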
#How many patterns can we store?
 To store more than one pattern
$$ w_{j i}=\sum_{\mathbf{y}_{p} \in\left\{\mathbf{y}_{p}\right\}} y_{i}^{p} y_{j}^{p} $$
 $\left\{\mathbf{y}_{p}\right\}$ is the set of patterns to store
 Super/subscript $p$ represents the specific pattern
 Hopfield: a network of $N$ neurons can store up to ~$0.15N$ patterns through Hebbian learning (derivation in the slides)
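 A sketch combining this multi-pattern Hebbian rule with the content-addressable recall described earlier: store a few patterns (well under the ~$0.15N$ limit), corrupt one, and let the network evolve back to it. All names and values here are illustrative:

```python
import numpy as np

def store(patterns):
    """Hebbian weights for a set of +/-1 patterns: W = sum_p y_p y_p^T, zero diagonal."""
    W = sum(np.outer(p, p) for p in patterns)
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, y, max_sweeps=100):
    """Asynchronous updates until no neuron flips: content-addressable recall."""
    y = y.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(y)):
            field = W[:, i] @ y
            if field != 0 and np.sign(field) != y[i]:
                y[i] = np.sign(field)
                changed = True
        if not changed:
            break
    return y

rng = np.random.default_rng(0)
N, K = 100, 5                                        # 5 patterns in a 100-neuron net
patterns = rng.choice([-1.0, 1.0], size=(K, N))
W = store(patterns)

probe = patterns[0].copy()
flip = rng.choice(N, size=10, replace=False)         # corrupt 10 of the 100 bits
probe[flip] *= -1
print(np.array_equal(recall(W, probe), patterns[0])) # typically True for small K
```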
#Orthogonal / Non-orthogonal patterns
 Orthogonal patterns

Patterns are local minima (stationary and stable)
 No other local minima exist
 But patterns perfectly confusable for recall
 Non-orthogonal patterns
 Patterns are local minima (stationary and stable)
 No other local minima exist
 Actual wells for patterns
 Patterns may be perfectly recalled! (Note K > 0.14 N)
 Two orthogonal 6-bit patterns
 Perfectly stationary and stable
 Several spurious "fake-memory" local minima...
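 One way to examine these claims is to enumerate all $2^6$ states of a 6-neuron net for a given stored pair and list every state that no single flip improves: the stored patterns and their negations always appear, and any additional entries are spurious "fake memories". A sketch with two orthogonal patterns of my choosing:

```python
import itertools
import numpy as np

# Two orthogonal 6-bit patterns (their dot product is zero), stored with the Hebbian rule.
p1 = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
p2 = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
W = np.outer(p1, p1) + np.outer(p2, p2)
np.fill_diagonal(W, 0.0)

def energy(y):
    return -(y @ W @ y)                        # bias-free energy, as above

def flipped(y, i):
    z = y.copy(); z[i] = -z[i]; return z

# A state is a local minimum if no single-bit flip lowers its energy.
for bits in itertools.product([-1.0, 1.0], repeat=6):
    y = np.array(bits)
    if all(energy(flipped(y, i)) >= energy(y) for i in range(6)):
        print(y.astype(int), energy(y))        # prints +/-p1, +/-p2, and any spurious minima
```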
#Observations

Many "parasitic" patterns
 Undesired patterns that also become stable or attractors

Patterns that are non-orthogonal are easier to remember
 I.e. patterns that are closer are easier to remember than patterns that are farther!!

Seems possible to store K > 0.14N patterns
 i.e. obtain a weight matrix W such that K > 0.14N patterns are stationary
 Possible to make more than 0.14N patterns at least 1-bit stable