About language, learning and tomatoes
Last week we explained the Rescorla-Wagner rule and set out to learn which fluid (or perhaps a combination of both fluids) is most likely to yield tomatoes. We will calculate the first few learning weights by hand, to give you a good understanding of how the machinery works.
In the first trial (row 1 in Table 4) the cue pot is present, and the outcome is NO_TOMATO. This means that the second condition (2) from the three learning situations above is satisfied: there is positive evidence for the outcome, and we need to calculate the sum of weights \(\sum{w_{.j}}\) from the present cues to the outcome. In the first step, this sum can only be zero because learning is only just commencing. Next, still as per the formula in (2), we subtract this zero from one, \((1 - 0)\), yielding 1, and multiply that 1 by the learning rate (speed) \(\gamma\), which we fix to a conveniently small value of 0.01. Hence, we calculate \(0.01 \times 1 = 0.01\). This \(0.01\) is the value of the connection weight between the pot and NO_TOMATO after the first update, as reflected in Table 5 below. For the remaining cues and outcomes, the situation is as follows. Because the red and blue liquids are not present, condition (1) from the learning situations listed above applies and nothing happens to their weights. Because the cue pot is present on this trial, but the outcome TOMATO is not, we need to apply rule (3) for negative evidence: \(\gamma(0 - \sum{w_{.j}}) = 0.01(0 - 0) = 0\). In sum, all the remaining cells in the first row of Table 5 contain zeros.
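The three learning situations used above can be sketched as a small Python function. This is only an illustration under stated assumptions: the name `rw_delta` and its arguments are hypothetical, and the maximum outcome strength is taken to be 1, as in the hand calculation.

```python
def rw_delta(cue_present, outcome_present, weight_sum, gamma=0.01):
    """Change in one cue-to-outcome weight on a single trial.

    weight_sum is the summed weight from all cues present on this trial
    to the outcome in question. The function name is illustrative only.
    """
    if not cue_present:
        return 0.0                        # rule (1): absent cue, no change
    if outcome_present:
        return gamma * (1 - weight_sum)   # rule (2): positive evidence
    return gamma * (0 - weight_sum)       # rule (3): negative evidence


# Trial 1: only the pot is present, the outcome is NO_TOMATO,
# and every weight still starts at zero.
pot_no_tomato = rw_delta(True, True, 0.0)    # pot is present, NO_TOMATO occurred
pot_tomato = rw_delta(True, False, 0.0)      # pot is present, TOMATO did not occur
red_no_tomato = rw_delta(False, True, 0.0)   # red is absent
print(pot_no_tomato, pot_tomato, red_no_tomato)
```

Only the first call yields a non-zero change, which is why the pot and NO_TOMATO connection is the only one to move on the first trial.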
In the second trial, the cues pot, red, and blue and the outcome TOMATO are present. For this, we need to apply rule (2): \(\Delta w^{t} = \gamma(1 - \sum{w_{.j}})\), and then update the weights: \(w_{ij}^{t+1} = w_{ij}^{t} + \Delta w^{t}\). As this is the very first "appearance" of the outcome TOMATO, the sum of all weights (\(\sum{w_{.j}}\)) is zero, and thus, in the case of the cue pot, for example, we have: \(\gamma(1 - \sum{w_{.j}}) = 0.01(1 - 0) = 0.01 \times 1 = 0.01\). The same applies to the two remaining cues, red and blue. After the second update, the three respective cells in the second row of Table 5 (pot and TOMATO; red and TOMATO; blue and TOMATO) contain exactly this value.
The outcome NO_TOMATO was not present on the second learning trial, which means we need to apply the negative evidence (or error-correcting) equation to capture the change: \(\Delta w^{t} = \gamma(0 - \sum{w_{.j}})\), and then update the connection weights with: \(w_{ij}^{t+1} = w_{ij}^{t} + \Delta w^{t}\). Now, recall that NO_TOMATO has already been "experienced" on the first learning trial, which means that the sum of weights for the currently present cues (pot, red, blue) is no longer zero. Instead it is \(\sum{w_{.j}} = 0.01\), which is a consequence of the pot having been present in the first trial and having gained a weight of 0.01 there. Our change thus becomes: \(\Delta w^{t} = \gamma(0 - \sum{w_{.j}}) = 0.01(0 - 0.01) = 0.01 \times (-0.01) = -0.0001\). From that we can update the values for all cues as follows. Red and blue have not occurred before, so their current weights (\(w_{ij}^{t}\)) are zero, and we have: \(w_{ij}^{t+1} = w_{ij}^{t} + \Delta w^{t} = 0 + (-0.0001) = -0.0001\); the pot gets: \(w_{ij}^{t+1} = w_{ij}^{t} + \Delta w^{t} = 0.01 + (-0.0001) = 0.0099\). This is shown in the second row of Table 5.
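The negative-evidence step for NO_TOMATO on the second trial can be retraced in a few lines of Python. This is a minimal sketch: the dictionary `w_no_tomato` and the other variable names are assumptions made for illustration, and the starting values are the weights reached after trial 1.

```python
GAMMA = 0.01  # learning rate used in the text

# Weights to NO_TOMATO after trial 1: only the pot has gained anything.
w_no_tomato = {"pot": 0.01, "red": 0.0, "blue": 0.0}

present_cues = ["pot", "red", "blue"]   # cues shown on trial 2

# Summed weight from the present cues to NO_TOMATO.
weight_sum = sum(w_no_tomato[cue] for cue in present_cues)

# NO_TOMATO did not occur on trial 2, so rule (3) applies.
delta = GAMMA * (0 - weight_sum)

# Every present cue receives the same change.
for cue in present_cues:
    w_no_tomato[cue] += delta

print(round(delta, 6))
print({cue: round(w, 4) for cue, w in w_no_tomato.items()})
```

Note that the change is negative: the pot loses a little of its weight to NO_TOMATO, while red and blue, which started from zero, end up slightly below zero.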
We can now proceed and calculate the value of the weights after the third learning trial. The third learning trial combines pot and red, paired with the outcome TOMATO. Let's get the sums of the weights (\(\sum{w_{.j}}\)) for the two possible outcomes, TOMATO and NO_TOMATO. We can get some help from Table 5, which does all the bookkeeping for us by keeping all weights aligned. The connection weight from both pot and red to TOMATO is 0.01 after trial 2, which gives \(\sum{w_{.j}} = 0.01 + 0.01 = 0.02\). Analogously, the sum of weights for the outcome NO_TOMATO is \(\sum{w_{.j}} = 0.0099 + (-0.0001) = 0.0098\). Now, given that in the third learning trial pot, red, and TOMATO are present, we need to apply the positive evidence update rule (2) to pot and TOMATO and to red and TOMATO, the no-change rule (1) to all combinations involving the absent blue liquid, and the negative evidence update rule (3) in the remaining cases (i.e., pot and NO_TOMATO, and red and NO_TOMATO). Let's list all results separately:
Cue and Outcome
 Change: \(\Delta w^{t}=\gamma(1-\sum{w_{.j}})\) for TOMATO (rule 2); \(\Delta w^{t}=\gamma(0-\sum{w_{.j}})\) for NO_TOMATO (rule 3)
 Update: \(w_{ij}^{t+1}=w_{ij}^{t}+\Delta w^{t}\)

pot \(\rightarrow\) TOMATO
 \(0.01(1-0.02)=0.0098\)
 \(0.01+0.0098=0.0198\)

red \(\rightarrow\) TOMATO
 \(0.01(1-0.02)=0.0098\)
 \(0.01+0.0098=0.0198\)

pot \(\rightarrow\) NO_TOMATO
 \(0.01(0-0.0098)=-0.000098\)
 \(0.0099-0.000098=0.009802\)

red \(\rightarrow\) NO_TOMATO
 \(0.01(0-0.0098)=-0.000098\)
 \(-0.0001-0.000098=-0.000198\)

These values are listed in Table 5, together with the values for all other trials.
Table 5
Trial  pot \(\rightarrow\) TOMATO  pot \(\rightarrow\) NO_TOMATO  red \(\rightarrow\) TOMATO  red \(\rightarrow\) NO_TOMATO  blue \(\rightarrow\) TOMATO  blue \(\rightarrow\) NO_TOMATO
1  0.0000  0.0100  0.0000  0.0000  0.0000  0.0000 
2  0.0100  0.0099  0.0100  -0.0001  0.0100  -0.0001 
3  0.0198  0.0098  0.0198  -0.0002  0.0100  -0.0001 
4  0.0194  0.0197  0.0194  0.0097  0.0100  -0.0001 
5  0.0291  0.0195  0.0194  0.0097  0.0197  -0.0003 
6  0.0286  0.0293  0.0194  0.0097  0.0192  0.0095 
7  0.0381  0.0289  0.0289  0.0093  0.0192  0.0095 
8  0.0376  0.0385  0.0289  0.0093  0.0186  0.0191 
9  0.0469  0.0381  0.0383  0.0088  0.0186  0.0191 
10  0.0561  0.0376  0.0474  0.0084  0.0186  0.0191 
11  0.0648  0.0369  0.0562  0.0077  0.0274  0.0185 
12  0.0739  0.0364  0.0562  0.0077  0.0365  0.0179 
13  0.0822  0.0358  0.0645  0.0071  0.0448  0.0173 
14  0.0810  0.0452  0.0645  0.0071  0.0436  0.0268 
15  0.0895  0.0447  0.0731  0.0066  0.0436  0.0268 
16  0.0882  0.0540  0.0731  0.0066  0.0422  0.0361 
17  0.0866  0.0634  0.0715  0.0160  0.0422  0.0361 
18  0.0953  0.0624  0.0715  0.0160  0.0509  0.0351 
19  0.0943  0.0718  0.0715  0.0160  0.0509  0.0351 
20  0.0929  0.0807  0.0715  0.0160  0.0495  0.0440 
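The hand calculations for the first three trials can be checked with a short simulation. The sketch below makes several assumptions: the names `train`, `TRIALS`, `CUES`, and `OUTCOMES` are hypothetical, and only the three trials spelled out in the text are included, since the full trial schedule of Table 4 is not repeated here.

```python
GAMMA = 0.01
CUES = ["pot", "red", "blue"]
OUTCOMES = ["TOMATO", "NO_TOMATO"]

# The first three trials as described in the text: (present cues, outcome).
TRIALS = [
    ({"pot"}, "NO_TOMATO"),
    ({"pot", "red", "blue"}, "TOMATO"),
    ({"pot", "red"}, "TOMATO"),
]


def train(trials, gamma=GAMMA):
    """Return one snapshot of all weights after each trial."""
    weights = {(cue, out): 0.0 for cue in CUES for out in OUTCOMES}
    history = []
    for present, observed in trials:
        for out in OUTCOMES:
            # Sum over present cues, computed before any weight changes.
            total = sum(weights[(cue, out)] for cue in CUES if cue in present)
            # Target is 1 for the observed outcome (rule 2), 0 otherwise (rule 3).
            target = 1.0 if out == observed else 0.0
            delta = gamma * (target - total)
            for cue in present:          # rule (1): absent cues stay untouched
                weights[(cue, out)] += delta
        history.append(dict(weights))
    return history


history = train(TRIALS)
for number, snapshot in enumerate(history, start=1):
    row = [f"{snapshot[(cue, out)]:.4f}" for cue in CUES for out in OUTCOMES]
    print(number, "  ".join(row))
```

Each printed row lists the weights in the column order pot, red, blue, with TOMATO before NO_TOMATO for every cue. Extending `TRIALS` with the remaining trials from Table 4 should continue the sequence in the same way.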
Petar/Dagmar