METRICS, SIMILARITY, AND PROXIMITY
Approaches
1. Traditional metrics (distances, by Freshe's axioms)
2. "Minimal cost" transformation:
one object into another object
(edit disctance)
3. Common part of the initial
(compared) objects/structures
(e.g., subgraph, substructure, subsequence)
4. Vectro-like proximity
Applied Topics
Classification Theory/Clustering
Data Envelopment Analysis (DEA)
Comparison of Complex Technical Systems
Comparison of genoms, classification hierarchies (Mathematical Biology)
Comparison of chemical structures
Matching of information fragments (Information Engineering)
Image processing
Comparison of architectural decisions (Architecture)
Comparison of companies (economical/financial analysis)
Comparison of personnel (e.g., in personnel management)
Aggregation of modular solutions
Journals
J. of Classification
J. of Algorithms
SIAM J. on Computing
Theoretical Computer Science
Eur. J. of Operational Research
Information Processing Letters
Pattern Recongnition
Pattern Recongnition Letters
IEEE Trans. SMC, Part A
IEEE Trans. PAMI
Algorithmica
J. of the ACM
Discrete Applied Mathematics
Foundations of Computing and Decision Science
Bibliography
Books
1. J.-P. Barthelemy, and A. Guenoche, Trees and Proximity Representation, Chichester: J.Wiley & Sons, 1991.
2. D. Gusfield, Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge Univ. Press, New York, 1997.
3. M.Sh. Levin, Combinatorial Enineering of Decomposable Systems,
Dordrecht: Kluwer, 1998 (Chapter 4)
4. B.G. Mirkin, Mathematical Classification and Clustering, Boston: Kluwer, 1996.
5. P.A. Pevzner, Computational Molecular Biology: An Algorithmic Approach. Bradford Books, 2000.
6. G. Valiente,
Algorithms on Trees and Graphs.
Springer, Berlin, 2002.
Surveys
1. L. Bergroth, H. Hakonen, T. Raita,
A survey of longest common subsequence algorithms.
In:
Proc. of the Seventh Int. Symp. on String Processing and
Information Retrieval SPIRE'00,
39-48, 2000.
2. P. Bille,
A survey on edit distance and related problems.
Theoretical Computer Science,
337(1-3), 217-239, 2005.
3. X. Jiang, H. Bunke, J. Csirik,
Median strings: A review.
In: M. Last, A.A. Kandel, H. Bunke, (Eds.),
Data Mining in Time Series Databases.
World Scientific, 173-192, 2004.
Basic Papers
1. T. Akatsu, M.M. Halldorsson,
On the approximation of largest common subtrees
and largest common point sets.
Theoretical Computer Science, 233(1-2), 33-50, 2000.
2. A. Amir, D. Keselman,
Maximum agreement subtree in a set of evolutionary
trees: metrics and efficient algorithms.
SIAM J. Comput., 26(6), 1656-1669, 1997.
3. M. Arnold, E. Ohlebusch, Linear time algorithms for generalizations
of the longest common substring problem.
Algorithmica, 60(4), 806-818, 2011.
4. K.P. Bogart, Preference structures I:
Distance between Transitive Preference Relations.
J. Math. Soc., 3, 49-67, 1973.
5. H. Bunke, On relation between graph edit disctance and maximum
common subgraph.
Pattern Recong. Lett., 18(8), 689-694, 1997.
6. H. Bunke, K. Shearer, A graph disctance metric based on
the maximal common subgraph. Patern Recogn. Lett.,
19(3-4), 255-259, 1998.
7. W. Chen,
New algorithm for ordered tree-to-tree correction problem.
J. of Algorithms, 40(2), 135-158, 2001.
8. L.P. Chew, K. Kedem, D.P. Hutenlocher, J. Kleinberg,
Fast detection of geometric substructure in protein.
J. of Comput. Biology, 6(3-4), 313-325, 1999.
9. R. Connor, F. Simeoni, M. Iakovos, R. Moss,
A bounded distance metric for comparing tree structure.
Information Systems,
36(4), 748-764, 2011.
10. W.D. Cook, L.M. Seiford,
Priority ranking and consensus formation.
Management Science,
24(16) 1721-1732, 1978.
11. W.D. Cook, L.M. Seiford, M.Kress,
A General Framework for Distance-Based Consensus in
Ordinal Ranking Models.
European Journal of Operational Research,
96(2) 392-397, 1996.
12. L. Eghe, C. Michel,
String similarity measures for ordered sets
of documents in information retrieval.
Information Processing and Management,
38(6), 823-848, 2003.
13. M. Farach, T. Przytycka, M. Thorup,
On the agreement of many trees.
Inf. Process. Lett., 55(6), 297-301, 1995.
14. M. Ferrer, E. Valveny, F. Serratosa,
K. Riesen, H. Bunke,
Generalized median graph computation for means of graph embedding
in vector space.
Pattern Recognition, 43(4), 1642-1655, 2010.
15. C.R. Finden, A.D. Gordon,
Obtaining common pruned trees.
J. Classif., 2(1) 255-276, 1985.
16. A.D. Gordon, Constructing dissimilarity measures.
J. of Classification, 3(2), 335-348, 1986.
17. J.C. Gower, Measures of similarity, dissimilarity, and distance.
In: Encyclopedia of Stat. Sci., 5, S. Kotz, N.L. Johnson (Eds.),
Wiley, New York, 397-405, 1985.
18. J.C. Gower,
Measures of Similarity, Dissimilarity, and Distance.
In: S. Kotz, and N.L. Johnson, (eds.),
Encyclopedia of Statistical Sciences,
Vol. 5, New York: J. Wiley & Sons,
pp. 397-405, 1985.
19. S. Hannenhalli, P. Pevzner,
Transforming men into mice (polynomial algorithm for genomic distance problem).
36th Ann. Symp. on Foundations of Computer Science FOCS
1995,
Milwaukee, Wisconsin, IEEE Computer Society, 581-592, 1995.
20. E. Herrera-Viedma, F. Herrera, F. Chiclana,
A consensus model for multiperson decision making with different
preference structures.
IEEE Trans. SMC, Part A, 32(3), 394-402, 2002.
21. L. Hubert, and P. Arabie, Comparing Partitions.
J. of Classification, Vol. 2, No. 3, pp. 193-218, 1985.
22. X. Jiang, A. Munger, H. Bunke,
On median graphs: properties, algorithms, and applications.
IEEE Trans. PAMI, 23(10), 1144-1151, 2001.
23. T. Kohonen,
Median strings.
Pattern Recognition Letters,
3(5), 309-313, 1985.
24. V.I. Levenshtein,
Binary codes capable of correcting deletions, insertions and reversals.
Cybernetics and Control Theory, 10(8), 707-710, 1966
25. M.Sh. Levin,
Towards comparison of decomposable systems.
In: Data Science,
Classification, and Related Methods,
Springer-Verlag, Tokyo, pp. 154-161, 1998.
26. D. Maier,
The complexity of some problems on subsequences and
supersequences.
J. of the ACM, 25(2), 322-336, 1978.
27. B.G. Mirkin, and L.B. Chernyi,
On a Distance Measure Between Partitions of a Finite Sets,
Automation and Remote Control,
31/5, 786-792, 1970.
28. A.M. Rappoport,
Measurement of Distance Between Weighted Graphs of Expert Judgment.
In: Multicriteria Choice for Solving of Ill-structured Problems.
Inst. for System Analysis, Moscow, No. 5, pp. 97-108, 1978
(in Russian)
29. S.M. Selkow,
The tree-to-tree editing problem.
Information Processing Letters,
6(6), 184-186, 1977.
30. K.-C. Tai,
The tree-to-tree correction problem.
J. of the ACM, 26(3), 422-433, 1979.
31. V.G. Timkovsky,
Complexity of common subsequence and supersequence problems.
Cybernetics and Systems Analysis, 25(5), 565-580, 1989.
32. A. Tversky, Features of Similarity,
Psychological Review, 84(4) 327-352, 1977.
33. R.A. Wagner, M.J. Fisher,
The string-to-string correction problem.
J. of the ACM, 21(1), 168-173, 1974.
34. W.D. Wallis, P. Shoubridge, M. Kraetz, D. Ray,
Graph distances using graph union.
Pattern Recognition Letters,
22(6-7), 701-704, 2001.
35. K. Zhang,
A constrained edit-distance between unordered labeled trees.
Algorithmica, 15(3), 205-222, 1996.
36. K. Zhang, J.T.L. Wang, D. Shasha,
On the editing distance between undirected acyclic graphs.
Int. J. Foundations of Computer Science,
7(1), 43-57, 1996.
Papers
1. M.S. Bansal, D. Fernandez-Baca,
Computing distances between partial rankings.
Information Processing Letters,
109(4), 238-241, 2009.
2. M.P. Beck, B.W. Lin,
Some heuristics for the consensus ranking problem.
Computers and Operations Research,
10(1), 183, 1-7, 1983.
3. V. Berry, F. Nicolas,
Maximum agreement and compatible supertrees.
In: Proc. of the 15th Annual Symp. on
Combinatorial Pattern Matching CPM-2004,
LNCS 3109, Springer, Berlin, pp. 205-219, 2004.
4. C. Elzinga, H. Wang, Z. Lin, Y. Kumar,
Concordance and consensus.
Information Sciences, 181(12), 2529-2549, 2011.
5. S. Guillernot, F. Nocolas,
Solving the maximum agreement subtree and the maximum compatible
tree problems on many bounded degree trees.
In: M. Lewenstein, G. Valiente (Eds.),
Proc. of the Int. Conf. on Combinatorial Pattern Matching CPM 2006,
LNCS 4009, Springer, 165-176, 2006.
6. M.T. Hallett, C. McCartin,
A faster FPT algorithm for the maximum agreement forest problem.
Theory of Computing Systems}, 41(3), 539-550, 2007.
7. T. Jiang, M. Li,
On the approximation of shortest common supersequences and
longest common subsequences.
SIAM J. on Computing, 24(5), 1122-1139, 1995.
8. T. Jiang, V.G. Timkovsky,
Shortest consistent superstring computable in polynomial time.
Theor. Comput. Sci., 143(1), 113-122, 1995.
9. T. Jiang, L. Wang, K. Zhang,
Alignment of trees - an alternative to tree edit.
Theoretical Computer Science, 143(1), 137-148, 1995.
10. M.Sh. Levin,
Vector-like proximity estimates for structures.
In:
All-Russian Conf. "Problems and Methods of Decision Making
in Organizational Management Systems",
Moscow, Inst. for System Analysis, 116-117, 1988 (in Russian).
11. M.Sh. Levin,
Common part of preference relations,
Foundations of Computing and Decision Sciences,
28(4), 223-246, 2003.
12. A. Marzal, E. Vidal,
Computation of normalized edit distance and applications.
IEEE Trans. PAMI, 15(9), 926-932, 1993.
13. A. Robles-Kelly, E.R. Hancock,
string edit distance, random walks and graph matching.
Int. J. of Pattern Recognition and Artificial
Intelligence, 18(3), 315-327, 2004.
14. E.M. Rodrigues, M.F. Sagot, Y. Wakabayashi,
The maximum agreement forest problem:
Approximation algorithms and computational experiments.
Theoretical Computer Science,
374(1-3), 91-110, 2007.
15. D. Sasha, K. Zhang,
Simple fast algorithms for
the editing distance between trees and related problems.
SIAM J. on Computing,
18(6), 1245-1262, 1989.
16. C. Solnon, J.-M. Jolion,
Generalized vs set median string for histogram based distances:
Algorithms and classification results in image domain.
In: F. Escolano, M. Vento (Eds.),
Proc. of the 6th Workshop on Graph Based Representation
in Pattern Recognition GbPrP 2007, LNCS 4538, Springer,
404-414, 2007.
17. E. Tanaka, K. Tanaka,
The tree-to-tree editing problem.
Int. J. Pattern Recogn. and Art. Intell.,
2(2), 221-240, 1988.
18. J. Tarhio, E. Ukkonen,
A greedy approximation algorithm for constructing shortest common
superstrings.
Theor. Comput. Sci.,
57(1), 131-145, 1988.
19. Gabriel Valiente
An efficient bottom-up distance between trees.
In: Proc. of the 8th Int. Symp. String Processing Information Retrieval
SPIRE 2001, IEEE Computer Science Press,
Laguna de San Rafael, Chile 212-219, 2001.
20. W.D. Wallis, P. Shoubridge, M. Kraetz,
D. Ray,
Graph distances using graph union.
Pattern Recognition Letters,
22(6-7), 701-704, 2001.
21. J.T.-L. Wang, K. Zhang,
Finding similar consensus between trees: An algorithm and
distance hierarchy.
Pattern Recognition, 34(1), 127-137, 2001.
22. C. Whidden, R.G. Beiko, N. Zeh,
Fast FPT algorithms for computing rooted agreement forest:
Theory and experiments.
In:
Proc. of the 9th Int. Symp. on Experimental Algorithms
SEA 2010,
LNCS 6049, Springer, 141-153, 2010.
Electronic Preprints
1. T. Kelsey, L. Kotthoff,
The exact closest string problem as a constraint satisfaction
problem. Electronic preprint,
CoRR. abs./1005.0089, 2010.
2. M.Sh. Levin,
Aggregation of composite solutions:
strategies, models, examples.
Electronic preprint. 72 pp., Nov. 29, 2011.
http://arxiv.org/abs/1111.6983 [cs.SE]
3. M.Sh. Levin,
Multiset estimates and combinatorial synthesis.
Electronic preprint. 30 pp., May 9, 2012.
http://arxiv.org/abs/1205.2046 [cs.SY]
4. C. Whidden, R.G. Beiko, N. Zeh,
Fixed parameter and approximation
algorithms for maximum agreement forests.
Electronic preprint, 12 Aug. 2011,
arXiv: 1108.2664v1 [q-bio.PE]
PhD Theses
1. Bruno T. Messmer,
Efficient Graph Matching Algorithms for Preprocessed
Graphs.
PhD Thesis, Inst. of Comp. Science and Applied Mathematics,
University of Bern, 1995.