Class LongestCommonSubsequenceDistance
- All Implemented Interfaces:
SequenceDistanceMeasurer
,SequenceDistanceMeasurerDouble
This class supports computing distance for Java String objects or arrays of any of the primitive types, or arrays of objects. It makes no assumptions about the contents of the Strings or arrays, and they can contain duplicates, or can be such that some elements only appear in one or the other of the sequences, or can be of different lengths.
Runtime: O(n*m), where n and m are the lengths of the two sequences (i.e., Strings or arrays).
If you need to compute this distance for permutations (i.e., same length, same set of
elements, unique elements), then it is recommended to instead use the ReinsertionDistance
class. That class exploits the common
length, common set of elements, and unique elements properties of permutations to more
efficiently (in O(n lg n) time) compute the longest common subpermutation (i.e., that class does
not delegate the work to the edit distance algorithm). However, the result of ReinsertionDistance
is half of what LongestCommonSubsequenceDistance would compute. This is because for permutations
the elements that would be inserted are exactly the same as those that would be deleted by the
edit operations, and ReinsertionDistance is defined as an edit distance with one edit operation,
removal/reinsertion (i.e., a deletion is only half the operation, and the insertion is the other
half of the operation).
Wagner and Fischer's String Edit Distance was introduced in:
R. A. Wagner and M. J. Fischer, "The string-to-string correction problem," Journal of the ACM,
vol. 21, no. 1, pp. 168–173, January 1974.
-
Constructor Details
-
LongestCommonSubsequenceDistance
public LongestCommonSubsequenceDistance()Constructs a longest common subsequence distance.
-