An integration-based approach to dynamic formation of a knowledge-base:

The method and its applications

Adil KABBAJ, ER-REMLI Hamid and MOUSAID Khadda

INSEA, Rabat, Morocco, B.P. 6217

Fax : (212) 7 77 94 57

Abstract. This paper presents the dynamic formation process of a Knowledge Base (KB). In our work, a KB corresponds to a concept type hierarchy augmented by type definitions, individual declarations and/or descriptions, schemas and type synonyms.

These knowledge structures as well as the whole KB are represented by conceptual graphs.

The paper shows how the dynamic formation process integrates knowledge structures into a KB and how the latter is modified and reorganized during the integration.

Examples are given to illustrate different aspects of the approach.

Category: Tools and Applications, Type: Research notes

KeyWords: Software Tools and Systems, Knowledge Representation, Knowledge Engineering and Modeling, Knowledge Bases, Dynamic formation of Knowledge Bases.

1. Introduction

In the past ten years, we have designed, implemented and used a conceptual graph (CG) tool called SYNERGY that is based on the use of a knowledge base (KB) composed of the concept type hierarchy with type definitions, instance descriptions, schemas and type synonyms [Kabbaj, 96, 99]. The SYNERGY tool is basically composed of a CG graphic editor, the CG activation-based programming language Synergy, an engine for the dynamic formation of a KB, an information retrieval process and a browser.

The KB in SYNERGY is currently used according to two scenarios:

✓ Manual KB: The user can use the interface and the CG graphic editor of SYNERGY to construct and edit (add, delete or modify) the KB manually. When the user activates the Synergy interpreter, the latter uses the content of the knowledge base along with the primitive type hierarchy to interpret the user's requests.

✓ Dynamic KB: The user can use the interface and the CG graphic editor of SYNERGY along with the engine for dynamic formation of a KB to construct an incremental KB dynamically. The dynamic formation process integrates into the KB any new information given by the user (type specification or definition, instance description, schema or synonym declaration). The user can then activate the information retrieval process to ask for a concept type (e.g., locate a concept type in the KB) or to ask for a description (e.g., locate an identical description in the KB and/or descriptions that are similar to the given one). From the located information, the user can browse the KB.

This paper presents the second scenario, and especially the dynamic formation process; the first scenario is presented in [Kabbaj, 99]. The principle of the dynamic formation of a KB was introduced briefly in [Kabbaj, 95] and is refined and described more fully in [Kabbaj, 96]. Er-Remli and Mousaid [Er-Remli and Mousaid, 98] have developed some applications that use the dynamic formation process of SYNERGY.

A demonstration is planned for the ICCS’99 conference.

This paper is organized as follows: Section 2 presents an example that gives an immediate idea of the kind of dynamic formation process we propose. Section 3 presents the kinds of information that the formation process is able to integrate into a KB and discusses some features related to this information and to the KB. Section 4 describes the dynamic formation process, Section 5 gives another example, and Section 6 concludes with an outline of future work.

2. Example

This example gives an intuitive idea of the KB dynamic formation process provided by the SYNERGY CG tool. More details on the nature of the information to integrate and on the integration process are given in Sections 3 and 4. Due to space limitations, we paraphrase the interaction between the user and the system (the real interaction uses the SYNERGY menus and dialog windows).

Assume that the KB contains only the concept [Universal]. The user starts by specifying that the type Object is a specialization of Universal :

>> “Object” < “Universal”.

The dynamic formation process creates a concept in the KB for Object and relates it to [Universal] with the relation “sp”. A similar interaction holds for the types “Live” and “Intelligence”. Figure 1.a shows the result of this interaction.

Next, the user specifies the definition of the type Human and the formation process integrates it in the KB (Figure 1.b).

>> [Human: _H = [Object:super]-
                    poss->[Live]
                    poss->[Intelligence]]
   and the concepts [Object:super], [Live] and [Intelligence] are in focus.

In specifying the above definition, the user indicates the concepts that are relevant for this information, that is, the concepts that should be in focus. The definition will be indexed (directly or not) under the types of those concepts (Figure 1.b). As shown in Figure 1.b, the definition is viewed differently by the indexes: Human is a specialization (sp) of Object, due to the presence of the referent “super” in the concept [Object:super], which represents the genus in this case. For the indexes Live and Intelligence, however, the definition of Human is just a situation (sta) in which the types Live and Intelligence are used.
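The indexing rule described above can be illustrated with a minimal Python sketch. It is not SYNERGY's implementation: the function name `index_links` and the encoding of a focus concept as a (type, referent) pair are assumptions made for this illustration only. A focus concept carrying the referent “super” yields an “sp” link; any other focus concept yields a “sta” link.

```python
from typing import Optional

# Hypothetical encoding: a focus concept is a (type, referent) pair,
# with referent None when the concept has no referent.
Focus = tuple[str, Optional[str]]


def index_links(name: str, focus: list[Focus]) -> set[tuple[str, str, str]]:
    """Return (index type, relation, structure name) links, one per focus."""
    return {
        (type_name, "sp" if referent == "super" else "sta", name)
        for type_name, referent in focus
    }


human_links = index_links(
    "Human", [("Object", "super"), ("Live", None), ("Intelligence", None)]
)
print(sorted(human_links))
# [('Intelligence', 'sta', 'Human'), ('Live', 'sta', 'Human'), ('Object', 'sp', 'Human')]
```

With two “super” referents, as in the Living definition discussed later in this section, the same rule would produce two “sp” links.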

Figure 1: The integration process in action (panels a and b)

Now, the user gives the definition of the type Vegetable :

>> [Vegetable: _V = [Object:super]-
                    poss->[Live]
                    lack->[Intelligence]]
   and the concepts [Object:super], [Live] and [Intelligence] are in focus.

Since [Object] is in focus, the integration of the above definition starts by comparing the definition to the structures indexed under [Object] in the KB (Figure 1.b). Thus, the definition of Vegetable is compared to the definition of Human, and the result is that they have a piece of information in common: [Object]-poss->[Live]. The dynamic formation process creates a concept for the definition of Vegetable and another concept for the common information, and it places them as shown in Figure 2.a. A similar integration is done through the foci [Live] and [Intelligence], but the comparison of the two definitions and the creation of the concepts are not repeated (Figure 2.a).

Any generalization created by the dynamic formation process is considered as a schema. This is the case for the generalization ([Object]-poss->[Live]) between the two definitions above (Figure 2.a). As shown in Figure 2.a, the generalization is viewed as a situation from the perspective of [Object] and [Live].

Now, the user gives the description of the following schema/situation :

>> [Artificial_Intelligence]-goalOf->[Study]-obj->[Intelligence] and the concept [Intelligence] is in focus.

The integration of the above schema starts by comparing it to the structures indexed under [Intelligence] (Figure 2.a). The integration process finds that the schema has nothing in common with the definitions of Human and Vegetable, except the concept [Intelligence] ! Thus, a new concept is created for the schema and it is placed under the concept [Intelligence] as shown in Figure 2.b.

Figure 2.b shows also the result of some update done after the integration of the following new definition:

>> [Living: _Lvg = [Object: super]-poss->[Live: super]] and [Object: super] and [Live: super] are in focus.

The above definition specifies that the type Living is a specialization of two types, Object and Live (i.e., a Living is an object that possesses life and also a life embodied in an object). As in the above examples, the new definition is compared with the structures indexed under the first focus [Object] (Figure 2.a); here, the definition is compared with the schema ([Object]-poss->[Live]). The result is that the two descriptions are equivalent: the schema is now re-interpreted, or assimilated, as a definition and the relations are updated (Figure 2.b). A similar treatment is applied to the second focus [Live].

One important modification that isn’t shown in the figure is the contraction of the Living definition from the definitions of Human and Vegetable. Indeed, the definition of Human :

[Object: super]-
      poss->[Live]
      poss->[Intelligence]
is replaced by [Living: super]-poss->[Intelligence].

A similar update is done in the definition of Vegetable.
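The contraction just described can be sketched as follows. This is only an illustration under simplifying assumptions: graphs are sets of (source type, relation, target type) triples, concepts are identified by their type label alone, and only one genus (Object) is handled even though Living has two super-types. The function `contract` is hypothetical, not part of SYNERGY.

```python
Triple = tuple[str, str, str]  # (source type, relation, target type)


def contract(graph: set[Triple], type_name: str, genus: str,
             body: set[Triple]) -> set[Triple]:
    """Replace an occurrence of `body` inside `graph` by the defined type.

    Naive sketch: once `body` is removed, the genus concept is simply
    renamed to `type_name` in the remaining triples.
    """
    if not body <= graph:
        return graph  # the definition does not occur; nothing to contract

    def rename(concept: str) -> str:
        return type_name if concept == genus else concept

    return {(rename(src), rel, rename(dst)) for src, rel, dst in graph - body}


living_body = {("Object", "poss", "Live")}
human = {("Object", "poss", "Live"), ("Object", "poss", "Intelligence")}

print(contract(human, "Living", "Object", living_body))
# {('Living', 'poss', 'Intelligence')}
```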

Figure 2: The integration process in action, continued (panels a and b)

Assume now that the following definition of Animal is given to the system :

>> [Animal: _A = [Object:super]-
                    poss->[Live]
                    hasfew->[Intelligence]]
   and the concepts [Object:super], [Live] and [Intelligence] are in focus.

The dynamic formation process notes that the above definition contains the definition of the type Living. To make explicit the presence of such a type in the description of the Animal definition, the formation process replaces the type of [Object: super] by Living : [Living: super]. After this kind of elicitation, the integration is done as before; starting with the comparison of the new information with the structures indexed under [Living] (recall that the focus [Object: super] becomes [Living: super]).

3. The nature of the information to integrate in the KB

The goal of the dynamic formation process is to integrate into the KB any new information given by the user (or by any other source). New information can be a specification of a type (e.g., “Man” is a “Person”), a definition of a type, a specification of an individual (e.g., “Jack” is a “Man”), a description of an individual, a description of a schema, or a specification of a type synonym (e.g., “Car” is a synonym of “Auto”). Figure 3 shows the general structure of a KB. Even though the KB changes continually, it always keeps the same structure.

[Type]: represents either the type alone or the type and its definition.
[Individual]: represents either the type and the individual identifier alone, or these together with the individual's description.
[Schema]: a context that contains the description of the schema.
[Synonyms]: contains the list of the synonyms for a type Type.

[Type] can be specialized to other [Type].
[Type] can be illustrated by several [Schema].
[Type] can have several instances [Individual].
[Individual] can be illustrated by several [Schema].
[Schema] can be specialized to other [Schema].
[Schema] can be specialized by several [Type].

Figure 3: The structure of a KB

Let’s consider some important features related to the information considered and to the structure of the KB. As described in the next section and illustrated in sections 2 and 5, the dynamic formation process takes into account those features.

✓ A type can be defined as a subtype of one or several types. So, type definition isn’t limited to the definition by genus (only one supertype) and differentia [Sowa, 84].

✓ A schema represents a specific or a general situation concerning an object, an action, an event, a plan or any other knowledge. A schema can be a situation for several types and/or individuals used in the description, unlike the definition of schema given in [Sowa, 84], where a schema concerns one type.

✓ Type synonym: in general, information can come from different sources, and to be flexible enough, we allow different identifiers to denote the same type. Hence, one user (or source) can describe a schema that contains the concept [Car] and another user (or even the same user) can describe another schema that contains the concept [Auto], while “Car” and “Auto” are synonyms. That is why, in our definition of a KB, a list of synonyms can be associated with a type. As described later, besides the possibility of declaring a synonym for a type explicitly, the dynamic formation process of SYNERGY is able to discover that a new type is in fact a synonym of another type.

✓ New information (e.g., a type definition or a schema) can contain the definition of one or more concept types. For instance, new information can correspond to the following schema:

[Boy: Karl]<-agnt-[Observe]-obj->[Bird]-
        <-agnt-[Eater]-obj->[Fish]
        poss->[Beak]-sizeOf->[Measure: High].

Assume that the KB already contains the following definition:

[Pelican: _P = [Bird: super]-
        <-agnt-[Eater]-obj->[Fish]
        poss->[Beak]-sizeOf->[Measure: High] ].

In this case, the integration process should note that the above schema is equivalent to this one :

[Boy: Karl]<-agnt-[Observe]-obj->[Pelican].

Thus, the initial description of the situation concerns Pelican even if this type isn’t used explicitly in the description, perhaps because the source of the information doesn’t know the definition of the Pelican or it doesn’t know that the KB contains such a definition.

To take this possibility into account, the dynamic formation process performs a kind of elicitation before integrating new information into the KB.

✓ Definition and schema. In our definition of a KB, a schema can be more general than another schema and/or a type definition. For instance, the following information can be integrated as a schema:

[Bird]-poss->[Beak]-sizeOf->[Measure: High].

If the definition of Pelican is given next, then the integration process should establish that such a definition is a specialization of the above schema.

Another point is that a schema can be re-interpreted as (or converted to) a definition. For instance, after the integration of some information concerning Person, the dynamic formation process can make the following generalization: [Person]<-agnt-[Give]-obj->[Course]. As described later, any generalization made by the formation process is considered as a schema. Now, if the definition of “Teacher” ([Teacher: _T = [Person:super]<-agnt-[Give]-obj->[Course]]) is to be integrated, the formation process should identify the relation between this definition and the above generalization and update the content of the KB accordingly.

4. The dynamic formation process

The treatment done by the dynamic formation process depends on the kind of information to integrate :

✓ Integration of a synonym declaration, “Type1 is_synonym_of Type2”: The synonym Type1 is added to the list of the synonyms of Type2 (if it isn’t already there). The list of synonyms for Type2 is related to the concept type declaration [Type2] with the relation “syn” (Figure 3).

✓ Integration of an individual declaration, “Inst1 is_an_individual_of Type”: The concept [Type: Inst1] is created and it is related to the concept type declaration [Type] with the relation “inst” (Figure 3).

✓ Integration of an individual description, “Inst1 is_an_individual_of Type with description CG”: The concept [Type: Inst1 = CG] is created and it is related to [Type] with the relation “inst” (Figure 3).

✓ Integration of a type declaration, “Type1 is_subtype_of Type2, …, TypeN”: The concept [Type1] is created and it is related to [Type2], …, [TypeN] with relations “sp” (Figure 3).

✓ Integration of a type definition or a schema. This is the main case in our approach to dynamic formation of a KB.
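The first four cases are simple bookkeeping operations. The sketch below illustrates them with a toy class; `KnowledgeBase` and its method names are hypothetical, links are stored as (parent, relation, child) triples, and raising `KeyError` for an unknown type is an assumption of this sketch (Section 5 describes the system asking the user to integrate unknown types instead).

```python
class KnowledgeBase:
    """Toy KB holding only the declaration-level links of Figure 3."""

    def __init__(self) -> None:
        self.types: set[str] = {"Universal"}
        self.links: set[tuple[str, str, str]] = set()  # (parent, relation, child)
        self.synonyms: dict[str, list[str]] = {}

    def _require(self, type_name: str) -> None:
        if type_name not in self.types:
            raise KeyError(f"unknown type: {type_name}")

    def add_type(self, name: str, *supertypes: str) -> None:
        """'name is_subtype_of supertypes': one "sp" link per supertype."""
        for sup in supertypes:
            self._require(sup)
        self.types.add(name)
        for sup in supertypes:
            self.links.add((sup, "sp", name))

    def add_individual(self, ident: str, type_name: str) -> None:
        """'ident is_an_individual_of type_name': an "inst" link."""
        self._require(type_name)
        self.links.add((type_name, "inst", f"{type_name}: {ident}"))

    def add_synonym(self, synonym: str, type_name: str) -> None:
        """'synonym is_synonym_of type_name': extend the list once."""
        self._require(type_name)
        known = self.synonyms.setdefault(type_name, [])
        if synonym not in known:
            known.append(synonym)


kb = KnowledgeBase()
kb.add_type("Object", "Universal")
kb.add_type("Person", "Object")
kb.add_type("Man", "Person")
kb.add_type("Car", "Object")
kb.add_individual("Jack", "Man")
kb.add_synonym("Auto", "Car")
kb.add_synonym("Auto", "Car")  # repeated declaration: no duplicate entry
print(kb.synonyms)  # {'Car': ['Auto']}
```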

Integration of a type definition or a schema

The integration process in this case is composed of three steps (examples of sections 2 and 5 illustrate them) :

1) Elicitation of the definition or the schema. The goal of this operation is to provide the most precise formulation of the structure (definition or schema), according to the current content of the KB.

Elicitation here is an iterative operation that checks, at each iteration, whether the structure to integrate contains the definition of a type T that already exists in the KB. If so, the integration process replaces the corresponding super-type by T, colors the parts of the structure that correspond to the definition of T, and asks the user whether those parts should be eliminated from the structure. The process asks this question because the user may judge that some (or all) parts of the definition should remain (perhaps to detect more similarity with structures in the KB).

The elicitation operation is iterative in order to enable the identification of definitions that require the identification of other definitions. For instance, in the situation “scientists that study birds with a big beak that eat fish work hard”, the elicitation operation is able to note that the situation contains the definition of Pelican. Thus, a more precise formulation of the situation is: “scientists that study Pelicans work hard”. Now the elicitation is able to note that the new formulation itself contains a definition, that of PelicanScientist (“Scientist that studies Pelican”). Thus, the elicitation operation refines the situation further: “PelicanScientists work hard”.

A special case occurs if the structure to integrate corresponds in fact to the body of a definition that already exists in the KB. If the structure is also a definition, then the integration process informs the user that the new type is a synonym of another type. In this case, the type synonym isn’t given explicitly by the user; it is deduced by the integration process. For instance, suppose the user gives this new definition to integrate: “Associative network is a graph of concepts used to represent conceptual knowledge”, and the KB contains this definition: “Semantic net is a graph of concepts used to represent conceptual knowledge”. The elicitation operation is able to note that the bodies of the two definitions are equivalent (the definition of Semantic net is located in the KB because it is also a specialization of the concept graph), and the dynamic formation process concludes that Associative network is a synonym of Semantic net.
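The elicitation step, including the synonym special case, can be sketched as follows. Several assumptions are made here and none of them comes from SYNERGY itself: graphs are sets of (source type, relation, target type) triples, concepts are identified by their type label, matched parts are always eliminated (the real process asks the user), and the relation names used to encode the Semantic net definition are invented for the example.

```python
from typing import Optional

Triple = tuple[str, str, str]         # (source type, relation, target type)
Definition = tuple[str, set[Triple]]  # (genus type, body of the definition)


def elicit(graph: set[Triple], definitions: dict[str, Definition]) -> set[Triple]:
    """Contract known definitions repeatedly until none applies."""
    changed = True
    while changed:
        changed = False
        for name, (genus, body) in definitions.items():
            # Proper subset only: a graph equal to a body is the synonym case.
            if body and body < graph:
                graph = {
                    (name if src == genus else src, rel,
                     name if dst == genus else dst)
                    for src, rel, dst in graph - body
                }
                changed = True
    return graph


def find_synonym(genus: str, body: set[Triple],
                 definitions: dict[str, Definition]) -> Optional[str]:
    """Return a known type whose definition has the same genus and body."""
    for name, known in definitions.items():
        if known == (genus, body):
            return name
    return None


definitions: dict[str, Definition] = {
    "Pelican": ("Bird", {
        ("Eater", "agnt", "Bird"), ("Eater", "obj", "Fish"),
        ("Bird", "poss", "Beak"), ("Beak", "sizeOf", "Measure: High"),
    }),
    "PelicanScientist": ("Scientist", {
        ("Study", "agnt", "Scientist"), ("Study", "obj", "Pelican"),
    }),
    # Invented encoding of "a graph of concepts used to represent
    # conceptual knowledge".
    "SemanticNet": ("Graph", {
        ("Graph", "part", "Concept"), ("Represent", "inst", "Graph"),
        ("Represent", "obj", "ConceptualKnowledge"),
    }),
}

situation = {
    ("Study", "agnt", "Scientist"), ("Study", "obj", "Bird"),
    ("Eater", "agnt", "Bird"), ("Eater", "obj", "Fish"),
    ("Bird", "poss", "Beak"), ("Beak", "sizeOf", "Measure: High"),
    ("Work", "agnt", "Scientist"), ("Work", "manr", "Hard"),
}
print(sorted(elicit(situation, definitions)))
# [('Work', 'agnt', 'PelicanScientist'), ('Work', 'manr', 'Hard')]

associative_body = set(definitions["SemanticNet"][1])
print(find_synonym("Graph", associative_body, definitions))  # SemanticNet
```

The loop terminates because each contraction removes at least one triple from the graph.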

2) Specification of the focus : The user should specify the concepts that are relevant for his structure; that should be in focus during the integration. The dynamic formation process uses these concepts_in_focus to initiate the integration of the structure in the KB.

3) Integration of the structure in the KB through the concepts in focus: for each concept judged as a focus, the integration process attempts to locate in the KB its corresponding type declaration and its individual declaration (if the concept is of the form [Type: Individual_Identifier]), and then the structure is propagated to all the children of the located concept C. If the structure S is propagated for the first time to a child N, then the first step is to compare the structure S with the content of the child N, which could be a definition or a schema. The next steps depend on the result of the comparison:

▪ If S is more general than N, then S is placed between N and its father C as indicated in Figure 5.a. In this case, if S is a definition of a type, then this definition is contracted from N and possibly from some of its descendants (if N is a schema). For instance, assume that the KB contains the concept types Pelican and Scientist and the user gives this schema to integrate:

>> [Scientist]-
        <-agnt-[Study]-obj->[Pelican],
        <-agnt-[Work]-manr->[Hard]
   with the concepts [Scientist] and [Pelican] in focus.

This schema is integrated as shown in Figure 4.a. Assume now that the user gives this definition (note that PelicanScientist is defined as a specialization of two types : Scientist and Pelican) :

>> [PelicanScientist: _P = [Scientist: super]<-agnt-[Study]-obj->[Pelican: super] ]
   with the concepts [Scientist: super] and [Pelican: super] in focus.

Figure 4: S > N and S is a definition

The integration of the above definition through the focus [Scientist] will lead to the comparison of the definition to the schema indexed under [Scientist]. Since the definition is more general than the schema, it is placed as shown in Figure 4.b and it is contracted from the schema, which becomes: [PelicanScientist]<-agnt-[Work]-manr->[Hard].

The integration of the PelicanScientist definition through the focus [Pelican] will not repeat the comparison and the integration described above; it will only adjust the relations as shown in Figure 4.c.

▪ If S is more specific than N, then S is propagated to all the children of N (Figure 5.b).

▪ If S is equal to N, then the two concepts are “joined”: S and N will be represented by the same concept, and the latter will be connected to the relations that were connected to S and N. Some special cases are considered. For instance, if N was a schema and S is a definition, then the schema is re-interpreted as a definition and the latter is contracted from the descendants of N. The example of Section 2 illustrates this situation, when the definition of Living was joined with the schema ([Object]-poss->[Live]) and then contracted from the definitions of Human and Vegetable.

▪ If S and N have a piece of information (a subgraph) in common, other than the common father C, then a concept of type Schema is created to contain the information and it is placed as indicated in Figure 5.c.

▪ If S and N have only the father C in common, then nothing is done.

As specified above, the propagation of the structure S to a child N is done only once; S could arrive to N through different paths but one propagation of S to N suffices.
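The five comparison outcomes and the propagate-once rule can be summarized with a small sketch. It only classifies and records outcomes; it does not restructure the KB. The triple-set encoding (in which a graph with fewer triples counts as more general), the names `Outcome`, `compare` and `propagate`, and the node identifier `schema1` are assumptions made for this illustration.

```python
from enum import Enum

Triple = tuple[str, str, str]  # (source type, relation, target type)


class Outcome(Enum):
    MORE_GENERAL = "place S between N and its father"
    MORE_SPECIFIC = "propagate S to the children of N"
    EQUAL = "join S and N"
    COMMON = "create a schema for the shared part"
    DISJOINT = "do nothing"


def compare(s: set[Triple], n: set[Triple]) -> Outcome:
    """Classify S against N using set inclusion on triples."""
    if s == n:
        return Outcome.EQUAL
    if s < n:
        return Outcome.MORE_GENERAL
    if s > n:
        return Outcome.MORE_SPECIFIC
    return Outcome.COMMON if s & n else Outcome.DISJOINT


def propagate(s, node, children, content, seen=None):
    """Compare S with each reachable child once; return {child: Outcome}."""
    seen = {} if seen is None else seen
    for child in children.get(node, []):
        if child in seen:
            continue  # S already reached this child through another path
        seen[child] = compare(s, content[child])
        if seen[child] is Outcome.MORE_SPECIFIC:
            propagate(s, child, children, content, seen)
    return seen


schema = {
    ("Study", "agnt", "Scientist"), ("Study", "obj", "Pelican"),
    ("Work", "agnt", "Scientist"), ("Work", "manr", "Hard"),
}
definition = {("Study", "agnt", "Scientist"), ("Study", "obj", "Pelican")}
children = {"Scientist": ["schema1"], "Pelican": ["schema1"]}
content = {"schema1": schema}

seen = propagate(definition, "Scientist", children, content)
propagate(definition, "Pelican", children, content, seen)  # no new comparison
print(seen["schema1"].name)  # MORE_GENERAL
```

Passing the same `seen` dictionary to the second call mirrors the PelicanScientist example: the second focus reaches the same schema, but the comparison is not repeated.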

Figure 5: The result of the comparison (panels a, b and c)

The details of the integration process (including the creation, elimination and modification of the relations considered during a propagation) are given in [Kabbaj, 96] and [Er-Remli and Mousaid, 98].

5. Example

This example is an excerpt from an application about Painters and Scenes [Er-Remli and Mousaid, 98]; it illustrates the integration of individuals and the association of schemas to individuals.

The concept type Painter is a specialization of Artist and it is a situation (a schema) for WorkOfArt (Figure 6.a). The user then gives this individual declaration (Figure 6.a shows the result of its integration):

>> [Painter : Renoir = <Description of the individual Renoir> ]

The user gives now a situation that concerns Renoir :

>> [Disorder]-
        agnt->[Painter: Renoir.]
        dateOf->[Year = 1881]
        obj->[Crisis]-
              car->[Deep]
              obj->[Artistic]
   with [Painter: Renoir.] in focus.

The above situation is integrated as a schema of [Painter : Renoir] (Figure 6.b).

Figure 6: Integration of Individuals and schemas

The user gives another situation about Renoir :

>> [Painter: Renoir.]-
        <-agnt-[Paint]-loc->[Nature],
        <-agnt-[Use]-obj->[Color]-char->[Different],
        <-agnt-[Represent]-
              obj->[Nature]
              obj->[Woman]
              obj->[Child]
   with [Painter: Renoir.], [Represent] and [Color] in focus.

The dynamic formation process recognizes that the types Represent and Color are unknown, and the user is requested to integrate them (by specifying their super-types) before the integration of the above situation resumes. Figure 6.c shows the result of the integration of the types Represent and Color and of the situation (the latter has been compared with the first situation, but they have nothing in common).

The user gives next the description of the individual “Vangogh” which is integrated as an instance of [Painter] and then he gives the following situation about Vangogh :

>> [Painter: Vangogh.]-
        <-agnt-[Use]-obj->[Color]-char->[Impressive],
        <-agnt-[Represent]-
              obj->[Situation]-char->[Religious]
              obj->[Sunflower]
   with [Painter: Vangogh.], [Represent] and [Color] in focus.

As shown in Figure 7, the dynamic formation process recognizes that this situation shares with a similar situation concerning Renoir this information :

[Painter]-
        <-agnt-[Use]-obj->[Color]
        <-agnt-[Represent]

Hence, the generalization is done leading to the modification depicted in Figure 7.
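This generalization over two individuals can be reproduced with a naive sketch: individual referents are stripped so that [Painter: Renoir] and [Painter: Vangogh] both become [Painter], and the shared triples are kept. The triple encoding and the helper names `strip_referents` and `generalize` are assumptions of this example, not SYNERGY's matching algorithm.

```python
Triple = tuple[str, str, str]  # (source concept, relation, target concept)


def strip_referents(graph: set[Triple]) -> set[Triple]:
    """Reduce each concept label 'Type: Referent' to its type alone."""
    def type_of(label: str) -> str:
        return label.split(":")[0].strip()
    return {(type_of(src), rel, type_of(dst)) for src, rel, dst in graph}


def generalize(a: set[Triple], b: set[Triple]) -> set[Triple]:
    """Keep the type-level triples that both situations share."""
    return strip_referents(a) & strip_referents(b)


renoir = {
    ("Paint", "agnt", "Painter: Renoir"), ("Paint", "loc", "Nature"),
    ("Use", "agnt", "Painter: Renoir"), ("Use", "obj", "Color"),
    ("Color", "char", "Different"),
    ("Represent", "agnt", "Painter: Renoir"), ("Represent", "obj", "Nature"),
    ("Represent", "obj", "Woman"), ("Represent", "obj", "Child"),
}
vangogh = {
    ("Use", "agnt", "Painter: Vangogh"), ("Use", "obj", "Color"),
    ("Color", "char", "Impressive"),
    ("Represent", "agnt", "Painter: Vangogh"),
    ("Represent", "obj", "Situation"), ("Situation", "char", "Religious"),
    ("Represent", "obj", "Sunflower"),
}

print(sorted(generalize(renoir, vangogh)))
# [('Represent', 'agnt', 'Painter'), ('Use', 'agnt', 'Painter'), ('Use', 'obj', 'Color')]
```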

Figure 7: After the integration of Vangogh and of a related situation

6. Conclusion and future work

We have presented and illustrated with several examples the dynamic formation process of the SYNERGY CG tool [Kabbaj, 96, 99]. This process performs an incremental construction of a knowledge base (KB) as the user gives knowledge structures to integrate into the KB (type declaration and/or definition, individual declaration and/or description, schema and type synonym).

The dynamic formation process as well as the SYNERGY CG tool have been implemented with Microsoft Visual C++. A Java re-implementation is underway.

We plan to use the dynamic formation process in different domains: information retrieval, advanced databases, case-based reasoning, learning systems, natural language processing and multi-agent systems (especially if we consider the memory of an agent as a dynamic KB).

References

1. Er-Remli H. and Mousaid K., Formation dynamique et incrémentale d’une base de connaissances, Mémoire de troisième cycle, INSEA, Juillet 98, Rabat, Maroc, 1998.

2. Kabbaj A., Self-Organizing Knowledge Bases: The Integration Based Approach, in Proc. of the Intern. KRUSE Symposium: Knowledge Retrieval, Use, and Storage for Efficiency, Santa Cruz, CA, USA, 1995.

3. Kabbaj A., Un système multi-paradigme pour la manipulation des connaissances utilisant la théorie des graphes conceptuels, Ph.D Thesis, DIRO, Université de Montréal, June, 1996.

4. Kabbaj A., A conceptual graph activation-based language : Synergy and its environment, submitted to Seventh International Conference on Conceptual Structures, ICCS’99.

5. Sowa J. F., Conceptual Structures : Information Processing in Mind and Machine, Addison-Wesley, 1984.