Interface SMStringTermCount
- All Superinterfaces:
SimilarityMeasure, SMString
- All Known Implementing Classes:
SMStringTermCountImpl
Compares two strings using the Term Count algorithm. The result depends on the delimiter in use.
The Term Count is a measure of the similarity between two strings, referred to here as query and case. Each string is split at the delimiter, and the number of terms in the query and in the case is counted. Only these two counts are compared, not the terms themselves. For example,
- If the delimiter is "-", the query is "String-Term-Count" and the case is "An-other-example", both strings contain three terms, so the term counts are identical and the similarity is 1.
- If the delimiter is "-", the query is "String-Term-Count" and the case is "Another-example", the query contains three terms and the case contains two, a gap of one term. The similarity is 1 minus the gap divided by the larger of the two term counts: 1 - 1/3, or about 0.667.
Similarity
The similarity between query q and case c is defined as sim(q,c) = 1 - abs(length(q) - length(c)) / max(length(q), length(c)), where length(x) is the number of terms obtained by splitting x at the delimiter.
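The formula above can be sketched in plain Java as follows. This is a minimal illustration, not the ProCAKE implementation: the class `TermCountSketch` and its methods `countTerms` and `similarity` are hypothetical names, and the handling of empty strings (zero terms, similarity 1 when both are empty) is an assumption that the documentation does not specify.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; not part of ProCAKE.
public final class TermCountSketch {

    private TermCountSketch() {}

    // Counts the delimiter-separated terms in text.
    // Assumption: an empty string contains zero terms.
    static int countTerms(String text, String delimiter) {
        if (text.isEmpty()) {
            return 0;
        }
        // Pattern.quote makes split treat the delimiter literally, not as a regex.
        // The limit -1 keeps trailing empty terms instead of discarding them.
        return text.split(Pattern.quote(delimiter), -1).length;
    }

    // sim(q,c) = 1 - abs(length(q) - length(c)) / max(length(q), length(c))
    static double similarity(String query, String caseValue, String delimiter) {
        int q = countTerms(query, delimiter);
        int c = countTerms(caseValue, delimiter);
        int max = Math.max(q, c);
        if (max == 0) {
            // Assumption: two empty strings are treated as identical.
            return 1.0;
        }
        return 1.0 - (double) Math.abs(q - c) / max;
    }

    public static void main(String[] args) {
        // Three terms versus three terms: identical counts.
        System.out.println(similarity("String-Term-Count", "An-other-example", "-"));
        // Three terms versus two terms: a gap of one term out of three.
        System.out.println(similarity("String-Term-Count", "Another-example", "-"));
    }
}
```

The two calls in `main` correspond to the two examples given earlier, yielding 1 and roughly 0.667 respectively. Note that `case` is a reserved word in Java, so the sketch uses `caseValue` for the case string.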
- Author:
- Rainer Maximini
-
Field Summary
Modifier and Type      Field                Description
static final String    DEFAULT_DELIMITER    The default delimiter value is " ".
static final String    NAME                 Name of similarity measure is "StringTermCount".
Fields inherited from interface de.uni_trier.wi2.procake.similarity.SimilarityMeasure
LOG_ORDER_NAME_NOT_FOUND
-
Method Summary
Methods inherited from interface de.uni_trier.wi2.procake.similarity.SimilarityMeasure
compute, getDataClass, getName, getSystemName, isForceOverride, isReusable, setForceOverride
-
Field Details
-
NAME
Name of similarity measure is "StringTermCount".
-
DEFAULT_DELIMITER
The default delimiter value is " ".
-
-
Method Details
-
getDelimiter
String getDelimiter()
setDelimiter
-