Interface SMTaxonomy

  • All Superinterfaces:
    All Known Subinterfaces:
    SMTaxonomyClassic, SMTaxonomyClassicUserWeights, SMTaxonomyNodeHeight, SMTaxonomyPath
    All Known Implementing Classes:
    SMTaxonomyClassicImpl, SMTaxonomyClassicUserWeightsImpl, SMTaxonomyImpl, SMTaxonomyNodeHeightImpl, SMTaxonomyPathImpl, SMTaxonomyWeightedNodes

    public interface SMTaxonomy
    extends SimilarityMeasure
    Abstract interface that collects all similarity measures for AtomicClasses with taxonomical orders, see InstanceTaxonomyOrderPredicate.

    A special variant of symbolic types are taxonomies. A taxonomy is an n-ary tree in which the nodes represent symbolic values. The symbol at any node of the tree can be used as an attribute value. Unlike a plain symbol type, which only lists the possible attribute values, a taxonomy represents an additional relationship between the symbols through their position within the taxonomy tree. This relationship expresses knowledge about the similarity of the symbols in the taxonomy.
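
    As a rough illustration of how position in the tree can carry similarity knowledge, the following minimal sketch stores a taxonomy as a child-to-parent map and counts the edges between two symbols via their lowest common ancestor. The class TaxonomyTreeSketch and all of its methods are hypothetical and are not part of this API. The tree shape is an assumption reconstructed from the graphics card examples further down this page, not the original taxonomy figure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of the documented library. */
public class TaxonomyTreeSketch {
    // Maps each symbol to its parent symbol; the root has no entry.
    private final Map<String, String> parentOf = new HashMap<>();

    public void addChild(String parent, String child) {
        parentOf.put(child, parent);
    }

    /** Symbols from the given node up to the root, starting with the node itself. */
    public List<String> pathToRoot(String node) {
        List<String> path = new ArrayList<>();
        for (String n = node; n != null; n = parentOf.get(n)) {
            path.add(n);
        }
        return path;
    }

    /** Lowest node that lies on both paths to the root, or null if there is none. */
    public String commonAncestor(String a, String b) {
        List<String> pathA = pathToRoot(a);
        for (String n : pathToRoot(b)) {
            if (pathA.contains(n)) {
                return n;
            }
        }
        return null;
    }

    /** Number of edges between a and b when walking via their common ancestor. */
    public int edgeDistance(String a, String b) {
        String ancestor = commonAncestor(a, b);
        if (ancestor == null) {
            throw new IllegalArgumentException("no common ancestor");
        }
        return pathToRoot(a).indexOf(ancestor) + pathToRoot(b).indexOf(ancestor);
    }

    /** Assumed tree shape, reconstructed from the examples on this page. */
    public static TaxonomyTreeSketch graphicsCards() {
        TaxonomyTreeSketch t = new TaxonomyTreeSketch();
        t.addChild("Graphics Card", "S3 Graphics Card");
        t.addChild("Graphics Card", "Matrox Mystique");
        t.addChild("S3 Graphics Card", "Elsa 2000");
        t.addChild("S3 Graphics Card", "S3 Trio");
        t.addChild("S3 Trio", "Miro Video");
        t.addChild("S3 Trio", "VGA V64");
        return t;
    }

    public static void main(String[] args) {
        TaxonomyTreeSketch t = graphicsCards();
        System.out.println(t.edgeDistance("Miro Video", "Elsa 2000"));
        System.out.println(t.edgeDistance("Miro Video", "Matrox Mystique"));
    }
}
```

    Under the assumed tree, Miro Video is fewer edges away from Elsa 2000 than from Matrox Mystique. Edge counting is only one possible way to turn tree position into a similarity value.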

    Using Taxonomies

    We now describe four different example scenarios.

    • 1a) Consider a sales support application for personal computers. A case represents a PC available from stock. The case representation contains an attribute graphics card, and the taxonomy from above represents the set of possible values. Consider a case $c_1$ with the Elsa 2000 card and a case $c_2$ with the Matrox Mystique card. If we assume that a customer requires a Miro Video graphics card, then $c_1$ is certainly more useful than $c_2$, because Miro Video and Elsa 2000 have more in common (e.g. the S3 chip) than Miro Video and Matrox Mystique. In general, we could use a similarity measure that assesses similarity based on the distance between the case value and the query value in the taxonomy tree.
    • 1b) Imagine that the customer's query asks for an S3 Graphics Card. Then any of the graphics cards in the S3 sub-tree is perfectly suited. Hence, we expect the local similarity value between this query and case $c_1$ (from example 1a) to be $1$. From this consideration we can conclude that whenever the query value is located above the case value, the similarity measure should be $1$.
    • 2a) Consider a diagnosis experience management system for PCs in which cases encode diagnostic situations and faults that have occurred previously. The domain expert describes a fault that can occur with any S3 card. Therefore, the respective case contains the attribute value S3 Graphics Card. Assume now that a PC user has a problem and states that there is an Elsa 2000 card in the PC. Then the local similarity for the graphics card attribute should be equal to $1$, because the case matches exactly with respect to this attribute. From this consideration we can conclude that whenever the case value is located above the query value, the similarity measure should be $1$.
    • 2b) For the same diagnostics example, imagine now a different query in which the user does not know exactly what kind of graphics card is installed in the PC, but does know that it is an S3 Trio card. The user therefore enters S3 Trio as the attribute value in the query. Again, the case about S3 cards mentioned in example 2a matches exactly because, whatever graphics card the user has, we know it is an S3 card and the situation described in the case applies. However, if we consider a different case that describes a problem with the Miro Video card, then this case does not match exactly. Since we don't know which graphics card the user has (it can be a Miro Video, but it can also be a VGA V64), we expect a local similarity value less than $1$. Therefore, in this setting we cannot conclude that whenever the query value is located above the case value, the similarity measure should be $1$.

    Although we have used the same taxonomy in all four examples, they clearly have to be treated differently in the similarity computation. The query and cases in example 1a use only leaf nodes of the taxonomy. Examples 1b to 2b make use of inner nodes of the taxonomy, but in each example the semantics of the inner nodes is different, which leads to different similarity measures.
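
    The four scenarios can be told apart by where the query value and the case value sit relative to each other. The sketch below only classifies that relative position and does not compute a similarity value. TaxonomyRelationSketch, its Relation enum and the tree shape are hypothetical assumptions made for illustration; they are not the classes or the data of this API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of the documented library. */
public class TaxonomyRelationSketch {
    /** Relative position of query value and case value in the tree. */
    public enum Relation { EQUAL, QUERY_ABOVE_CASE, CASE_ABOVE_QUERY, DIFFERENT_BRANCHES }

    // Maps each symbol to its parent symbol; the root has no entry.
    private final Map<String, String> parentOf = new HashMap<>();

    public void addChild(String parent, String child) {
        parentOf.put(child, parent);
    }

    /** True if 'upper' is a proper ancestor of 'lower'. */
    public boolean isAbove(String upper, String lower) {
        for (String n = parentOf.get(lower); n != null; n = parentOf.get(n)) {
            if (n.equals(upper)) {
                return true;
            }
        }
        return false;
    }

    public Relation relate(String query, String caseValue) {
        if (query.equals(caseValue)) {
            return Relation.EQUAL;
        }
        if (isAbove(query, caseValue)) {
            return Relation.QUERY_ABOVE_CASE;
        }
        if (isAbove(caseValue, query)) {
            return Relation.CASE_ABOVE_QUERY;
        }
        return Relation.DIFFERENT_BRANCHES;
    }

    /** Assumed tree shape, reconstructed from the examples on this page. */
    public static TaxonomyRelationSketch graphicsCards() {
        TaxonomyRelationSketch t = new TaxonomyRelationSketch();
        t.addChild("Graphics Card", "S3 Graphics Card");
        t.addChild("Graphics Card", "Matrox Mystique");
        t.addChild("S3 Graphics Card", "Elsa 2000");
        t.addChild("S3 Graphics Card", "S3 Trio");
        t.addChild("S3 Trio", "Miro Video");
        t.addChild("S3 Trio", "VGA V64");
        return t;
    }

    public static void main(String[] args) {
        TaxonomyRelationSketch t = graphicsCards();
        System.out.println("1a: " + t.relate("Miro Video", "Elsa 2000"));
        System.out.println("1b: " + t.relate("S3 Graphics Card", "Elsa 2000"));
        System.out.println("2a: " + t.relate("Elsa 2000", "S3 Graphics Card"));
        System.out.println("2b: " + t.relate("S3 Trio", "Miro Video"));
    }
}
```

    In this sketch, examples 1b and 2b both fall into QUERY_ABOVE_CASE, yet the text expects a similarity of $1$ in 1b and less than $1$ in 2b. The relative position alone therefore does not determine the value; the intended semantics of the inner nodes has to be chosen in addition.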

    Rainer Maximini