Interface SMAggregate
- All Superinterfaces:
SimilarityMeasure
- All Known Subinterfaces:
SMAggregateAverage, SMAggregateEuclidian, SMAggregateKMaximum, SMAggregateKMinimum, SMAggregateMaximum, SMAggregateMinimum, SMAggregateMinkowski, SMAggregateWeighted
- All Known Implementing Classes:
SMAggregateAverageImpl, SMAggregateEuclidianImpl, SMAggregateImpl, SMAggregateKMaximumImpl, SMAggregateKMinimumImpl, SMAggregateMaximumImpl, SMAggregateMinimumImpl, SMAggregateMinkowskiImpl, SMAggregateWeightedImpl
AggregateClasses.
Global similarity measures are defined by applying an aggregation function Φ to the local similarity values. The simple similarity measures for numeric attributes can be generalized easily to aggregation functions. Such aggregation functions are defined by determining
- a basic aggregation function and
- a weight model that determines weights $\omega = (\omega_1, \ldots, \omega_n)$ such that $0 \le \omega_i \le 1$ and $\sum_{i=1}^{n} \omega_i = 1$
The default weight is 1.0 for all attributes. To ensure that $\sum_{i=1}^{n} \omega_i = 1$, all weights are normalized automatically at runtime.
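The normalization and aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's implementation: the class `WeightedAverageSketch` and its methods are hypothetical, and a weighted average is assumed as the basic aggregation function.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the documented API.
final class WeightedAverageSketch {

    // Scales non-negative raw weights so that they sum to 1.
    static double[] normalize(double[] rawWeights) {
        double sum = Arrays.stream(rawWeights).sum();
        if (sum <= 0.0) {
            throw new IllegalArgumentException("at least one weight must be positive");
        }
        return Arrays.stream(rawWeights).map(w -> w / sum).toArray();
    }

    // Aggregates local similarity values as a weighted average,
    // using the normalized weights (assumed basic aggregation function).
    static double aggregate(double[] localSims, double[] rawWeights) {
        if (localSims.length != rawWeights.length) {
            throw new IllegalArgumentException("one weight per local similarity is required");
        }
        double[] omega = normalize(rawWeights);
        double result = 0.0;
        for (int i = 0; i < localSims.length; i++) {
            result += omega[i] * localSims[i];
        }
        return result;
    }

    public static void main(String[] args) {
        double[] localSims = {1.0, 0.5, 0.0};
        double[] rawWeights = {1.0, 1.0, 2.0}; // normalized to 0.25, 0.25, 0.5
        System.out.println(aggregate(localSims, rawWeights)); // prints 0.375
    }
}
```

Because the weights are normalized before use, only their ratios matter: the raw weights 1, 1, 2 and 0.25, 0.25, 0.5 give the same result.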
Aggregate measures can be defined in the XML file sim.xml. This requires that an aggregate class has been created in the XML file model.xml; the definition of the measure references that class. The measure also needs a name, which can be chosen freely. Weights for individual attributes can be defined in the inner tags. The measures Average, Euclidian and Minkowski require explicit weights; otherwise the similarity will always be 1.0. The other measures use the same weight for every attribute if no weights are defined.
For example, an aggregate measure can look like:
<AggregateMinimum name="AggregateMinimumDataflowWeighted" class="DataflowElement" default="false">
  <AggWeight att="name" weight="0.5"/>
</AggregateMinimum>
- Author:
- Rainer Maximini
-
Field Summary
Modifier and Type, Field, Description:
- static final boolean DEFAULT_IGNORE_NULL_ATTRIBUTES_IN_QUERY: The default for ignoring null attribute values as void is true.
- static final String LOG_ATTRIBUTE_NAME_NOT_FOUND
- static final String LOG_ATTRIBUTE_NOT_FOUND
- static final String PROPERTY_USER_WEIGHT: The query case can contain user weights $w_u$ that are stored in the properties, accessible with this key.
Fields inherited from interface de.uni_trier.wi2.procake.similarity.SimilarityMeasure
LOG_ORDER_NAME_NOT_FOUND
-
Method Summary
Modifier and Type, Method, Description:
- getSimilaritiesToUse(String attributeName)
- getSimilarityToUse(String attributeName)
- boolean isIgnoreNullAttributesInQuery()
- void setIgnoreNullAttributesInQuery(boolean ignoreNullAttributesInQuery)
- void setSimilarityToUse(String attName, String similarityMeasure): In general, the element objects of the collection are compared with their default similarity measure.
Methods inherited from interface de.uni_trier.wi2.procake.similarity.SimilarityMeasure
compute, getDataClass, getName, getSystemName, isForceOverride, isReusable, setForceOverride
-
Field Details
-
DEFAULT_IGNORE_NULL_ATTRIBUTES_IN_QUERY
static final boolean DEFAULT_IGNORE_NULL_ATTRIBUTES_IN_QUERY
The default for ignoring null attribute values as void is true.
-
LOG_ATTRIBUTE_NAME_NOT_FOUND
-
LOG_ATTRIBUTE_NOT_FOUND
-
PROPERTY_USER_WEIGHT
The query case can contain user weights $w_u$ that are stored in the properties, accessible with this key. The weight $w$ for an attribute is the product of $w_u$ and $w_c$, the weight defined for the class.
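The combination of user weights and class weights can be illustrated with a short sketch. The class `UserWeightSketch` and its method are hypothetical, plain maps stand in for however the weights are actually stored in the properties, and treating a missing user weight as 1.0 is an assumption made only for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the documented API.
final class UserWeightSketch {

    // Computes w = w_u * w_c for every attribute that has a class weight.
    // Assumption: an attribute without a user weight uses w_u = 1.0.
    static Map<String, Double> effectiveWeights(Map<String, Double> classWeights,
                                                Map<String, Double> userWeights) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : classWeights.entrySet()) {
            double userWeight = userWeights.getOrDefault(entry.getKey(), 1.0);
            result.put(entry.getKey(), userWeight * entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> classWeights = new LinkedHashMap<>();
        classWeights.put("name", 0.5);
        classWeights.put("type", 1.0);

        Map<String, Double> userWeights = new LinkedHashMap<>();
        userWeights.put("name", 2.0); // the user emphasizes the "name" attribute

        System.out.println(effectiveWeights(classWeights, userWeights));
        // prints {name=1.0, type=1.0}
    }
}
```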
-
-
Method Details
-
isIgnoreNullAttributesInQuery
boolean isIgnoreNullAttributesInQuery()
- Returns:
-
setIgnoreNullAttributesInQuery
void setIgnoreNullAttributesInQuery(boolean ignoreNullAttributesInQuery)
- Parameters:
ignoreNullAttributesInQuery
-
-
getSimilaritiesToUse
HashMap getSimilaritiesToUse()
- Returns:
- The defined names of the SimilarityMeasures that should be used for the elements.
-
getSimilarityToUse
- Parameters:
attributeName - The name of the attribute for which the specific similarity measure should be returned.
- Returns:
- The name of the similarity measure that should be used for this attribute. Can be null.
-
setSimilarityToUse
In general, the element objects of the collection are compared with their default similarity measure. In some situations, however, it can be necessary to use a different similarity measure for the elements of a collection. For this purpose, the name of a similarity measure to use instead can be specified. A similarity measure with that name should exist for each DataObject; otherwise, the comparison of the objects is ignored.
Summarizing:
- If the newValue is null, the default measures of the objects are used. This is the default behaviour.
- If the newValue is the name of a similarity measure, a similarity measure with this name must exist for each data class whose objects can occur in the collection. Note that this also includes the common super classes of the objects.
- Parameters:
attName - The name of the element object.
similarityMeasure - The name of the similarity measure that should be used for the elements.
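The override behaviour described above can be sketched with a small stand-in class. `SimilarityOverrideSketch` is hypothetical and only mirrors the documented method names; the `resolve` helper, the use of a plain map, and the measure names `DefaultMeasure` and `MyCustomMeasure` are assumptions for illustration, not the actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not an implementation of SMAggregate.
final class SimilarityOverrideSketch {

    // Maps an attribute name to the name of the measure that overrides the default.
    private final Map<String, String> similaritiesToUse = new HashMap<>();

    // A null measure name removes the override, so the default measure applies again.
    void setSimilarityToUse(String attName, String similarityMeasure) {
        if (similarityMeasure == null) {
            similaritiesToUse.remove(attName);
        } else {
            similaritiesToUse.put(attName, similarityMeasure);
        }
    }

    // Returns the override for the attribute, or null if none is defined.
    String getSimilarityToUse(String attributeName) {
        return similaritiesToUse.get(attributeName);
    }

    // Hypothetical helper: picks the override if present, otherwise the default measure.
    String resolve(String attributeName, String defaultMeasure) {
        String override = getSimilarityToUse(attributeName);
        return override != null ? override : defaultMeasure;
    }

    public static void main(String[] args) {
        SimilarityOverrideSketch sketch = new SimilarityOverrideSketch();
        System.out.println(sketch.resolve("name", "DefaultMeasure")); // prints DefaultMeasure

        sketch.setSimilarityToUse("name", "MyCustomMeasure");
        System.out.println(sketch.resolve("name", "DefaultMeasure")); // prints MyCustomMeasure

        sketch.setSimilarityToUse("name", null);
        System.out.println(sketch.resolve("name", "DefaultMeasure")); // prints DefaultMeasure
    }
}
```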
-
getSimilaritiesToUse
- Parameters:
attributeName - The name of the element object.
- Returns:
- The defined name of the SimilarityMeasure that should be used for the element.
-