Multiple Instantiation and Rule Mediation in SHRUTI




Carter Wendelken 1,2 and Lokendra Shastri 1

1 International Computer Science Institute, Berkeley, CA

2 Center for Mind and Brain, UC-Davis

cwendelken@ucdavis.edu shastri@icsi.berkeley.edu

phone: 510-388-0405, fax: 530-792-1489

Abstract

The SHRUTI model demonstrates how a neurally motivated connectionist architecture can support encoding of complex evidential and relational knowledge and perform rapid inference and decision-making with that knowledge. Key to the successful operation of the SHRUTI system is its encoding of causal rules and its capacity to handle multiple instantiation of relational predicates. We present here a detailed description of the implementation of causal rules, and also of the manner in which multiple instantiation is achieved, within the current SHRUTI architecture. The implementation described here represents an improvement over earlier realizations of similar functionality. In particular, it solves the multiple instantiation problem using a significantly simpler connectionist circuit.

1. Introduction

The SHRUTI model demonstrates how a neurally motivated connectionist architecture can support encoding of complex evidential and relational knowledge and perform rapid inference and decision-making with that knowledge (Shastri and Ajjanagadde 1993; Shastri 1999; Wendelken 2003). Key to the successful operation of the SHRUTI system is its encoding of causal rules and its capacity to handle multiple instantiation of relational predicates. In this article, we present a detailed description of the implementation of causal rules, and also of the manner in which multiple instantiation is achieved, within the current SHRUTI architecture. The implementation described here represents an improvement over earlier realizations of similar functionality. In particular, it solves the multiple instantiation problem using a significantly simpler connectionist circuit.

The SHRUTI architecture is a realization of two key ideas: representation via focal clusters and temporal synchrony variable binding. Focal clusters are groups of nodes with varying function subserving a common representation; they are employed in the representation of types, entities, relations, events, actions, goals, and causal rules. Figure 1 depicts focal clusters for the relational predicates buys and owns, for the types Agent and Person, for the entity Tom, and for the rule buys(x,y) ⇒ owns(x,y). Within a relational focal cluster, collector nodes (+ and -) represent belief and disbelief, enabler nodes (?) represent querying of information, and role nodes (x and y) represent predicate roles; activity of the +:buys node would indicate that some instance of the buys relation is believed to be true. Different nodes within a type or entity cluster represent different states of belief (+ or -) and quantification (e = existential, v = universal); for example, activity of ?v:Person in Figure 1 indicates that a query is being posed about all Persons.

The dynamic link between a role node (such as buys:x) and a particular type or entity (such as Tom) is established via synchronous firing of the role node and the type node. Circles depict phasic nodes that can fire periodically (within a target phase) and are capable of synchronization, while pentagons depict nodes that fire in continuous bursts. In the implemented model, temporal synchrony binding is established at the time a query is posed or an assertion is made and is maintained throughout an episode of inference. Each entity occupies its own phase. The instantiation of new temporal phases to represent new (internally generated) entities occurs in some situations and is described in Section 2.1. In Figure 1, synchronous firing of Tom:? and owns:x, along with activity of the owns:? enabler node, represents the query “Does Tom own something?”. Spreading activation within the type hierarchy transforms this into the query “Do all agents own something?”, while spreading activation in the causal rule network transforms it into the query “Did Tom/all Persons/all Agents buy something?”. This query could potentially match a long-term episodic fact (e.g. “Tom bought a car”) or statistical knowledge (e.g. “Most people own clothes”) encoded as simple connectionist circuits (not shown), thus resulting in activation of collector nodes in both relational and type clusters and representation of a particular belief (e.g. “Tom probably owns clothes”).
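The binding-by-synchrony idea can be illustrated with a toy sketch. It assumes phases are represented as small integers and that a role is bound to an entity exactly when both fire in the same phase; the `Episode` class and its method names are hypothetical and are not part of the SHRUTI implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    """Toy record of which node fires in which phase (hypothetical structure)."""
    phase_of: dict = field(default_factory=dict)  # node name -> phase index

    def fire(self, node, phase):
        """Mark a node as firing in the given phase."""
        self.phase_of[node] = phase

    def bound(self, role, entity):
        """A role counts as bound to an entity when both fire in the same phase."""
        phase = self.phase_of.get(role)
        return phase is not None and phase == self.phase_of.get(entity)


# Query "Does Tom own something?": Tom:? and owns:x fire in synchrony.
episode = Episode()
episode.fire("Tom:?", 0)
episode.fire("owns:x", 0)
episode.fire("Mary:?", 1)  # a second, hypothetical entity in its own phase

tom_is_owner = episode.bound("owns:x", "Tom:?")
mary_is_owner = episode.bound("owns:x", "Mary:?")
```

Here `tom_is_owner` is true and `mary_is_owner` is false, because only Tom shares a phase with the owner role.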


Figure 1: Depiction of a simple causal rule and type hierarchy. Only some of the key nodes and links are shown.

The nodes used in the SHRUTI model are computational abstractions; in a more detailed implementation of focal clusters, each node is realized as a small ensemble of cells (Mani and Shastri 1993). The basic focal cluster structure is assumed to represent a recurring pattern of neural connectivity that could be an outcome of genetically driven early development (see Marcus 2001). Learning new causal rules and relations involves recruitment of uncommitted focal clusters to stand for specific relational predicates and rules (Wendelken and Shastri 2003). The learning of new events and situations by recruitment learning has been studied in depth within a biologically detailed model of episodic memory formation via cortico-hippocampal interactions (Shastri 2002; Shastri, submitted).

2. The rule mediator

A central feature of the encoding of a causal rule in SHRUTI is the presence of a rule mediator cluster that controls the propagation of activity between antecedent and consequent predicates.

The basic rule mediator consists of a single collector node, an enabler node, and forward and backward role nodes for each variable involved in the associated rule. Circuits attached to the rule mediator structure are responsible for instantiating unbound variables, enforcing type restrictions, and detecting mismatched role activity. The implementation of these functional circuits largely distinguishes the mediator structure discussed here from the previous version described in (Shastri 1999).


Figure 2: Structure of a rule mediator connecting one antecedent A(y,z) to one consequent C(x,y). Antecedent nodes are shown at the top, consequent nodes are shown along the bottom, and mediator nodes are depicted in the middle. Functional circuits support role instantiation (INS), type restriction (TR), and role mismatch detection (DET).

Figure 2 illustrates the basic structure of a rule mediator connecting one antecedent to one consequent for a rule of the form:

∀ x:X, y:Y, z:Z  A(y,z) ⇒ C(x,y)

To make this more concrete, consider the rule:

∀ y:Animate, z:Location  fall(y,z) ⇒ hurt(y)

which says that when an animate thing falls somewhere, it tends to get hurt. This rule can be mapped to Figure 2 with the exception that the extra variable x in the consequent is excluded. When fall(John,Hallway) is asserted, activity should be propagated through the rule mediator to the focal cluster for hurt, leading to the assertion hurt(John). Specifically, activity from the antecedent collector propagates to the mediator collector and, when not inhibited, from there to the consequent collector. Activity from each antecedent role node propagates to the associated forward mediator role node (e.g. xf in Figure 2) and from there to the appropriate consequent role. If instead the query hurt(John)? is posed to the system, then activity propagating through the mediator should lead to the query fall(John,Location)? In this case, the mediator enabler is involved in place of the mediator collector, and backward mediator role nodes are utilized instead of their forward counterparts.

2.1 Instantiating unbound variables

In our example, the antecedent fall has one variable (labeled z) that is not found in the consequent hurt. This means that whenever the system propagates activity from hurt to fall, it must come up with a new phase to represent the previously unmentioned location variable. In order to accomplish this, a role instantiation circuit is introduced into the mediator structure. The requisite (backward-direction) role instantiation node, shown to the right in Figure 2 as a box with a curved top and labeled zbINS, is triggered whenever the mediator enabler is active over several cycles and the associated (backward) mediator role node is not. When triggered, the instantiator node becomes active in an unoccupied phase and passes along this new phasic activity to both the mediator role node and the associated type. (In particular, the type node Z:e? with existential query quantification is activated by the backward instantiator.) A link to the instantiator node from the role node associated with forward propagation allows the instantiator node to reuse a phase that is already active in the forward direction. When the system queries hurt(John)?, propagation of activity through the mediator to the fall focal cluster and to the type hierarchy leads to the query fall(John,?e:Location)? (“Did John fall in some location?”). Figure 2 also shows a forward role instantiation circuit for the variable x. This circuit is essentially the same as the backward version except that it is activated by the mediator collector instead of the enabler and it sends output to positive existentially quantified nodes of the type hierarchy (e.g. X:e+) and to the forward mediator role node.
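The triggering logic of the backward instantiator can be sketched as a small function. This is an illustrative abstraction only: phases are integers, "active over several cycles" is collapsed into a single boolean, and choosing the lowest unoccupied phase is an assumption (the text says only that an unoccupied phase is used). The function name, its parameters, and the default `n_phases` are hypothetical.

```python
def instantiate_backward_role(enabler_active, backward_role_phase,
                              forward_role_phase, occupied, n_phases=7):
    """Return the phase for the backward mediator role node, or None.

    enabler_active      -- mediator enabler has been active (assumed boolean)
    backward_role_phase -- current phase of the backward role node, or None
    forward_role_phase  -- phase of the matching forward role node, or None
    occupied            -- set of phases already in use by other entities
    n_phases            -- hypothetical limit on distinct phases
    """
    # The instantiator is triggered only when the enabler is active
    # and the backward role node is not.
    if not enabler_active or backward_role_phase is not None:
        return backward_role_phase
    # Reuse a phase that is already active in the forward direction.
    if forward_role_phase is not None:
        return forward_role_phase
    # Otherwise claim an unoccupied phase (lowest-first is an assumption).
    for phase in range(n_phases):
        if phase not in occupied:
            return phase
    return None  # no free phase is available


# Query hurt(John)? with John in phase 0: the location role z gets a new phase.
z_phase = instantiate_backward_role(True, None, None, occupied={0})
```

With John occupying phase 0, `z_phase` comes out as phase 1, which would then also be passed to the existential query node of the Location type.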

2.2 Enforcing type restrictions

One of the most important functions of the rule mediator structure is to ensure that activity is only propagated if roles are active in synchrony with appropriate types. This ensures, for example, that activity at fall leads to activity at hurt only if the faller role is filled by some kind of Animate thing. The forward type-checking circuitry consists of a type restrictor node (triangle labeled yf TR in Figure 2) which receives input from the antecedent role node and from an associated (positive) type node, and inhibits both the forward role node and the mediator collector. The backward type-checking circuit (centered on node yb TR) involves connections with the mediator enabler, the backward role node, and a universal query type node. The type restrictor node becomes active and prevents propagation of activity through the mediator whenever, at any time during a cycle, the role input is active without accompanying type input activity. The type-restriction circuitry effectively prevents any activity that is inconsistent with the called-for types from passing through the mediator. Spreading activation in the type hierarchy ensures that type inheritance is taken into account.
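A rough sketch of the type-restriction check follows. It assumes a tiny hand-written type hierarchy and treats a cycle as a set of integer phases; the restrictor fires when the role is active in any phase in which the required type is not. The hierarchy contents and all function names are hypothetical.

```python
# Hypothetical fragment of a type hierarchy (child -> parent).
TYPE_PARENT = {"John": "Person", "Person": "Animate", "Hallway": "Location"}


def types_reached(entity):
    """Types activated by spreading activation upward from an entity."""
    reached, node = set(), entity
    while node in TYPE_PARENT:
        node = TYPE_PARENT[node]
        reached.add(node)
    return reached


def type_phases(required_type, phase_to_entity):
    """Phases in which the required type node is active."""
    return {phase for phase, entity in phase_to_entity.items()
            if entity == required_type or required_type in types_reached(entity)}


def restrictor_fires(role_phases, allowed_phases):
    """True when the role is active in some phase without matching type input."""
    return any(phase not in allowed_phases for phase in role_phases)


# fall(John, Hallway): John fires in phase 0, Hallway in phase 1.
phase_to_entity = {0: "John", 1: "Hallway"}
animate_phases = type_phases("Animate", phase_to_entity)

faller_is_john = restrictor_fires({0}, animate_phases)     # role y in John's phase
faller_is_hallway = restrictor_fires({1}, animate_phases)  # role y in Hallway's phase
```

In this sketch the restrictor stays silent when the faller role fires with John (an Animate thing by inheritance) and fires when the role is synchronized with Hallway, which would block propagation to hurt.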

2.3 Detecting mismatched roles

Another job of the rule mediator is to make sure that a focal cluster that already represents some relational instance is not corrupted by incongruous activity. In order to do this, it must compare all incoming role activity to any existing downstream role activity and prevent propagation whenever a temporal phase mismatch is found. This task is handled by role detector nodes (shown in Figure 2 as trapezoids yf DET and yb DET) which compare inputs from corresponding antecedent and consequent role nodes and prevent propagation whenever they are active in different phases.
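The comparison performed by a role detector node can be written as a one-line predicate. The sketch assumes integer phases with `None` standing for an inactive role node; the function name is hypothetical.

```python
def role_mismatch(incoming_phase, existing_phase):
    """True when both role nodes are active but fire in different phases."""
    if incoming_phase is None or existing_phase is None:
        return False  # nothing to compare, so nothing to block
    return incoming_phase != existing_phase
```

An empty downstream role never blocks propagation in this sketch; only an occupied role in a different phase does.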

3. Multiple instantiation

Suppose that the system observes both fall(John,Hallway) and fall(Mary,Yard). As described here so far, it could not simultaneously represent both instances of the fall predicate.

This would be a trivial issue for a symbolic system, but multiple instantiation can be a significant problem for a connectionist network like SHRUTI that relies on activation of representational structure in its long-term memory. The basic solution within the SHRUTI framework has been to replicate each focal cluster a small number of times, with each replication termed a bank of the focal cluster. The number of replications can vary depending on the importance of the cluster.

Significant difficulty arises, however, when we actually want to make use of the multiple banks to do inference. The original implementation of SHRUTI included a complex switching mechanism tied to each focal cluster that served to assign incoming activity to an appropriate bank (Mani and Shastri 1993). A significantly improved mechanism, built around the rule mediator, has recently been developed. The current mechanism relies heavily on the filtering properties of the rule mediator to prevent inappropriate bank assignments. Figure 3 illustrates the basic structure involved in propagating activity between multiple-bank focal clusters. For a rule that consists of N antecedents $A_1$ through $A_N$ and M consequents $C_1$ through $C_M$, where each antecedent or consequent cluster X has $X_k$ banks, the rule mediator structure interposed between antecedent and consequent has a number of mediator banks given by

$$\text{number of mediator banks} = \prod_{n=1}^{N} (A_n)_k \times \prod_{m=1}^{M} (C_m)_k$$

where each mediator bank corresponds to one particular combination of antecedent and consequent predicate banks. A clear implication of this equation is that there is a high cost, in terms of structural complexity, associated with having large numbers of predicates, each with multiple banks, tied together in the same rule: $O(K^{N+M})$, where N = number of antecedents, M = number of consequents, and K = banks per predicate. In Figure 3, with two antecedents (A and B) and two consequents (C and D), each with two banks, there are sixteen banks of the mediator. This cost constrains the average number of banks per predicate to be quite low, although it does not prevent some heavily utilized predicates from having a larger number of banks.
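The bank count is easy to check numerically. The sketch below multiplies the per-predicate bank counts, one mediator bank per combination of antecedent and consequent banks; the function name is hypothetical.

```python
from math import prod


def mediator_bank_count(antecedent_banks, consequent_banks):
    """Number of mediator banks for a rule.

    antecedent_banks -- bank counts, one entry per antecedent predicate
    consequent_banks -- bank counts, one entry per consequent predicate
    """
    return prod(antecedent_banks) * prod(consequent_banks)


# Figure 3 configuration: antecedents A, B and consequents C, D, two banks each.
figure3_banks = mediator_bank_count([2, 2], [2, 2])

# With K banks for every predicate the count equals K ** (N + M).
uniform_banks = mediator_bank_count([3, 3], [3])
```

For the Figure 3 configuration this gives sixteen banks, and with three banks on each of three predicates it gives 3 ** 3 = 27, which shows how quickly the structure grows.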

Two key features of this mediator structure enable it to propagate activity correctly.

First, there are differential delays on the input links to the mediator structure, such that lower-index outputs will be selected before higher indices. For a simple rule like fall(y,z) ⇒ hurt(y), this means that if any bank of fall is active with appropriate role fillers, the first bank of the hurt predicate will subsequently become active before any other. In addition, there are inhibitory links within the multiple mediator structure, from lower output indices to higher, such that as soon as an output path is chosen (in this case the first bank of the hurt focal cluster) then alternate higher-index output paths are blocked. Differential delays in combination with unidirectional inhibition result in a sequential search for an available output bank. The maximum time required for this search is proportional to the structural complexity of the mediator (specifically $O(K^M)$ for forward propagation), since potential delay values increase linearly with the number of output banks that match a given input.
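The combined effect of differential delays and lower-to-higher inhibition can be approximated as a first-match scan over candidate output banks. This is a sequential abstraction of what the network does in parallel; it assumes a fixed delay increment per output index, and the function and parameter names are hypothetical.

```python
def select_output_bank(passes_filters, delay_per_index=1):
    """Pick the output bank that wins the delayed race.

    passes_filters  -- list of booleans, one per candidate output bank in
                       index order; True if the mediator's filters would let
                       activity through to that bank
    delay_per_index -- assumed extra delay added for each higher index

    Returns (bank index, time at which it becomes active), or None.
    """
    for index, allowed in enumerate(passes_filters):
        if allowed:
            # Returning here plays the role of inhibition: once this bank is
            # chosen, all higher-index candidates are blocked.
            return index, index * delay_per_index
    return None


# Banks 0 and 1 are filtered out, so bank 2 wins after two delay steps.
winner = select_output_bank([False, False, True, True])
```

The selection time grows linearly with the index of the winning bank, which is the source of the search-time bound discussed above.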


Figure 3: Diagram of the multiple-instantiation switch structure for the rule with antecedents A(x) and B(x) and consequents C(x) and D(x). The colors/shades associated with the first and second roles of predicate A represent entities “Light” and “Dark”, respectively.

In the more complex scenario shown in Figure 3 where there are multiple sources of activity, the filtering properties of the rule mediator come into play. We refer to mediator banks here with a four-digit binary label that indicates which combination of antecedent and consequent banks is being linked. This example shows the first bank of A (designated A0) active with role Light while the second bank of A (A1) is active with role Dark. (Light and Dark here simply represent distinct phases, color/shading-coded in the diagram.) Similarly, we have B0 (Dark) and B1 (Light) such that incoming role links match only for mediator banks with indices 01** (for Light) and those with indices 10** (for Dark). If the inputs from A0 and B1 arrive at the mediator before those from A1 and B0, then they will claim the lowest-index mediator bank available to them (0100) and thus propagate their activity to C0 and D0. The other mediator banks that are compatible with A0 and B1 are inhibited by the activity of bank 0100. Now when activity from A1 and B0 arrives at the mediator structure, it also tries first to grab a bank corresponding to C0 and D0. However, the role mismatch detection circuit within this bank (1000) of the mediator detects the mismatch between Light and Dark and thus prevents it from propagating the activity. Since this bank fails to become active, the other 10** banks remain available. Only the last such mediator bank, 1011, will allow activity to propagate; ultimately, C1 and D1 become active with their role nodes firing in synchrony with Dark.
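The Figure 3 walk-through can be replayed with a short simulation. It is a sequential sketch of behavior the network achieves through delays, inhibition, and mismatch detection: phases are plain strings, each predicate has a single role, and input groups are processed in a given arrival order. The function name and data layout are hypothetical.

```python
from itertools import product


def propagate(antecedent_banks, consequent_banks, arrival_order):
    """Assign each arriving phase to a mediator bank; returns phase -> label.

    antecedent_banks -- one list per antecedent, giving the phase in each bank
    consequent_banks -- one list per consequent; None marks a free bank
                        (this structure is updated in place)
    arrival_order    -- phases in the order their activity reaches the mediator
    """
    claimed = {}
    out_indices = [range(len(banks)) for banks in consequent_banks]
    in_indices = [range(len(banks)) for banks in antecedent_banks]
    for phase in arrival_order:
        # Input combinations whose role links all carry this phase.
        inputs = [combo for combo in product(*in_indices)
                  if all(antecedent_banks[i][k] == phase
                         for i, k in enumerate(combo))]
        for in_combo in inputs:
            # Lower-index output combinations are tried first (shorter delays).
            for out_combo in product(*out_indices):
                # Mismatch detection: every target bank must be free or
                # already firing in the same phase.
                if all(consequent_banks[j][k] in (None, phase)
                       for j, k in enumerate(out_combo)):
                    for j, k in enumerate(out_combo):
                        consequent_banks[j][k] = phase
                    claimed[phase] = "".join(map(str, in_combo + out_combo))
                    break  # the chosen bank inhibits higher-index alternatives
            if phase in claimed:
                break
    return claimed


# A0 = Light, A1 = Dark; B0 = Dark, B1 = Light; C and D start empty.
antecedents = [["Light", "Dark"], ["Dark", "Light"]]
consequents = [[None, None], [None, None]]
labels = propagate(antecedents, consequents, ["Light", "Dark"])
```

With Light arriving first, the sketch assigns Light to mediator bank 0100 and Dark to bank 1011, leaving C0 and D0 in phase with Light and C1 and D1 in phase with Dark, in agreement with the description above.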

4. Discussion and Conclusion

The multiple instantiation switching mechanism described here is an improvement over the switch used previously in SHRUTI (Shastri and Ajjanagadde 1993) because it is significantly less complex (and thus less prone to error) and because its operation is largely a consequence of rule mediation functions that would be necessary even without multiple instantiation. Both the mechanism described here and the earlier SHRUTI switch support the same general solution to the multiple instantiation problem: limited duplication of long-term representations along with a mechanism for routing information to appropriate instances.

Another potential solution to the multiple instantiation problem is to separate distinct representations by time rather than by space. In the INFERNET model, which like SHRUTI employs temporal synchrony variable binding, spiking nodes undergo an oscillation frequency doubling when they need to be involved in two representations rather than one (Sougne 2001). This can also be an effective connectionist solution. However, since period-doubling requires different relational instances to be represented in different temporal phases, the number of bindings per instance is limited to the total number of bindings possible divided by the number of instances. If the total number of dynamic bindings is limited – reasonable assumptions based on 40 Hz synchronization and the ability of neurons to separate inputs in time suggest that this number is less than eight (Shastri and Ajjanagadde 1993) – then reasoning with more than a very small number of relational instances becomes impossible using period-doubling because there simply would not be enough binding capacity to accommodate each relational instance. With SHRUTI, on the other hand, the total number of active entities is limited, but each entity can participate in bindings with many different relational instances, allowing for significantly more complex inference to be performed.

There are a number of problems that need to be solved in order to utilize temporal synchrony variable binding for relational inference. The rule mediator structure implemented in the current version of the SHRUTI model solves many of these problems, providing an effective mechanism for multiple instantiation in the service of complex causal inference.

Acknowledgements

This work was supported by grants to L.S. from the National Science Foundation (SBR-9720398, ECS-9980970), and by subcontracts from Cognitive Technologies Inc., related to the Office of Naval Research contract N00014-95-C-0182 and the Army Research Institute contract DASW01-97-C-0038.

References

Mani, D. and L. Shastri (1993). "Reflexive reasoning with multiple-instantiation in a connectionist reasoning system with a type hierarchy." Connection Science 5: 205-242.

Marcus, G. (2001). Plasticity and nativism: towards a resolution of an apparent paradox. Emergent Neural Computational Architectures, Springer-Verlag: 368-382.

Shastri, L. (1999). "Advances in SHRUTI - a neurally motivated model of relational knowledge representation and rapid inference using temporal synchrony." Applied Intelligence 11.

Shastri, L. (2002). "Episodic memory formation and cortico-hippocampal interactions." Trends in Cognitive Science 6:162-168.

Shastri, L. (Submitted) From transient patterns to persistent structures. In revision, Behavioral and Brain Sciences. Available at icsi.berkeley.edu/~shastri/psfiles/shastri_em.pdf

Shastri, L. and V. Ajjanagadde (1993). "From simple associations to systematic reasoning." Behavioral and Brain Sciences 16: 417-494.

Sougne, J. (2001). "Binding and multiple instantiation in a distributed network of spiking nodes." Connection Science 13: 99-126.

Wendelken, C. (2003). SHRUTI-agent: a structured connectionist architecture for inference and decision-making. Computer Science. Berkeley, University of California at Berkeley.

Wendelken, C. and L. Shastri (2003). Acquisition of concepts and causal rules in SHRUTI. Proceedings of the Twenty-Fifth Annual Conference of the Cognitive Science Society.
