JavaPermutationTools
Overview
JavaPermutationTools (JPT) is a Java library for representing and generating permutations and sequences, as well as performing computation on permutations and sequences. JPT's Permutation class is an efficient object-oriented implementation of a permutation, and includes a variety of methods for manipulating the permutation (e.g., swapping elements, reversing, scrambling, among many others), generating random permutations, and it also supports iterating over permutations. The JPT library also includes implementations of a variety of permutation distance metrics as well as distance metrics on sequences (i.e., Strings, arrays, and other ordered data types). The distance metrics implement a Java interface, provided in the library, that enables easily implementing your own custom distance metrics and interfacing with JPT's other functionality. Additionally, JPT includes implementations of several algorithms for random sampling from arrays.
The JPT source code repository is hosted on GitHub; and is licensed under the GNU General Public License Version 3 (GPLv3). The API documentation is found on this site. The library is regularly published to Maven Central, from which it is easily imported into software projects using Maven and other commonly used build tools. Additionally, there is a GitHub repository of example programs that show basic usage of the JPT library, as well as replication programs that reproduce results found in published papers.
How to Cite
The JPT library originated as part of the research of Vincent A. Cicirello. If you use JPT in your research, please cite the following article which describes the library:
- Vincent A. Cicirello. JavaPermutationTools: A Java Library of Permutation Distance Metrics. Journal of Open Source Software, 3(31):950, November 2018. [PDF] [BIB] [DOI]
Permutation Distance Measures
The permutation distance measures include the following:
Edit distances: Edit distance measures, where an edit distance is the minimum number of edit operations necessary to transform one permutation into the other:
- block interchange distance: minimum number of block (i.e., contiguous sequence of elements) swaps necessary to transform one permutation into the other
- cycle edit distance: minimum number of non-singleton permutation cycles necessary to transform one permutation into the other
- edit distance: Wagner and Fischer's (1974) string edit distance adapted to permutations, including edit operations of element deletions, element insertions, and element changes, and with configurable costs for the three edit operations
- interchange distance: minimum number of swaps necessary to transform one permutation into the other
- Kendall tau distance: minimum number of adjacent swaps necessary to transform one permutation into the other, also known as bubble sort distance
- reinsertion distance: minimum number of removal/reinsertion operations necessary to transform one permutation into the other
Edge-based distances: Distance measures when the permutations represent a sequence of edges (e.g., when adjacent permutation elements correspond to edges):
- acyclic edge distance: the number of edges that are not in common
- cyclic edge distance: the number of edges that are not in common, treating the permutation as a cycle such that there is an edge from the last element to the first
- r-type distance: the number of directed edges that are not in common
- cyclic r-type distance: the number of directed edges that are not in common, treating the permutation as a cycle such that there is an edge from the last element to the first
Other distances: Distance measures based on some permutation concept or property:
- exact match distance: Hamming distance applied to permutations (e.g., the number of positions containing different elements)
- cycle distance: count of the number of non-singleton permutation cycles between a pair of permutations
- k-cycle distance: count of the number of non-singleton permutation cycles of length at most k
- deviation distance: sum of the positional deviations of the permutation elements
- normalized deviation distance (Ronald, 1998): deviation distance normalized by dividing by 1 (proposed by S Ronald at IEEE CEC in 1998)
- normalized deviation distance (Sevaux and Sorensen, 2005): deviation distance normalized, such that the distance is between 0 and 1 (proposed by Sevaux and Sorensen at MIC-2005)
- Lee distance: Lee distance (1958) is the sum of the minimum of the left-positional and right-positional deviations of the elements (e.g., treats permutation as a cycle, such that an element's deviation is the minimum of its displacement to the left or right)
- squared deviation distance: sum of the squares of the positional deviations of the permutation elements
- weighted Kendall tau distance: Liu and Han's (2006) weighted version of Kendall tau distance
Sequence Distance Measures
JPT also includes a variety of string distance measures that can be used to measure the distance between a pair of strings or between a pair of arrays of various types. The sequence distance measures all support computing the distance between pairs of arrays of any one primitive type, the distance between pairs of arrays of objects, the distance between pairs of lists, and the distance between pairs of String objects. The distance measures for sequences include the following:
- edit distance: Wagner and Fischer's (1974) string edit distance, including edit operations of element deletions, element insertions, and element changes, and with configurable costs
- exact match distance: an adaptation of Hamming distance to non-binary strings, such that the distance is the number of sequence positions containing different elements
- Kendall tau sequence distance: implementation of Kendall tau sequence distance (Cicirello, 2020)
- longest common subsequence distance: an edit distance where the edit operations are deletions and insertions (i.e., no replacements or changes), and where the cost of an edit operation is simply 1
Dependencies
The JavaPermutationTools has minimal dependencies, all of which are our own libraries, and these include the following.
- ρμ: The ρμ library is focused on efficient randomization. It provides highly optimized random sampling algorithm implementations, and replaces some of the functionality of Java's built-in random number generators with faster options.
- org.cicirello.core: The org.cicirello.core library provides implementations of core data structures and algorithms that are used across multiple of our projects.
Dependent Libraries
The following libraries depend upon JavaPermutationTools:
- Chips-n-Salsa: The Chips-n-Salsa Java library is a library for evolutionary computation, stochastic search, and related metaheuristics. Among its features is support for optimization problems over the space of permutations. Thus, it uses JPT's Permutation class as an efficient representation of permutations.
Importing from Maven Central
To import JPT from the Maven Central repository (if your build tool is Maven), add the following to the dependencies section of your “pom.xml” (for other build tools, see the tool's documentation for the equivalent):
<dependency>
<groupId>org.cicirello</groupId>
<artifactId>jpt</artifactId>
<version>x.y.z</version>
</dependency>
Just replace x.y.z
with your desired version of the library.
Java Modules
The library provides a Java module, org.cicirello.jpt. To use in your project, add the following to your module-info.java:
module your.module.name.here {
requires org.cicirello.jpt;
}
This module includes the org.cicirello.permutations and org.cicirello.sequences packages as well as their subpackages. See the API documentation for details.
Beginning with version 3.0.0, randomization and other math utilities, and some generic utilities, have been moved to a pair of new libraries ρμ and org.cicirello.core, which are now dependencies of JavaPermutationTools. Your dependency manager (see previous section) will handle downloading these for you.
To ease the transition of users of the library who may have been relying on those utilities, we have configured the module-info.java for the org.cicirello.jpt module to require these transitively so that your application should only need to require org.cicirello.jpt to access the functionality of those new modules.
Semantic Versioning
The JPT uses Semantic Versioning with version numbers of the form: MAJOR.MINOR.PATCH, where differences in MAJOR correspond to incompatible API changes, differences in MINOR correspond to introduction of backwards compatible new functionality, and PATCH corresponds to backwards compatible bug fixes.
Development Process
To maintain a high quality library, our development process utilizes the following:- Test Coverage: We require unit tests for all functionality. We generate JaCoCo test coverage reports on all push and pull-request events to the source code repository. Although we don't have a strict minimum coverage for acceptance, coverage is quite often at or near 100%. Current coverage is reported in the readme of the GitHub repository (updated automatically).
- Regression Testing: All new and existing tests are executed as part of all pull requests, and all must pass for acceptance.
- Static Analysis: We use multiple static analysis tools to scan for potential vulnerabilities, error prone code, and other potential code improvements. At the present time, this includes running:
Related Publications
The following papers, many of which are related to fitness landscape analysis, have either used this library, or contributed to it in some way, such as papers about algorithms that are included in the library.
- Vincent A. Cicirello. A Survey and Analysis of Evolutionary Operators for Permutations. In Proceedings of the 15th International Joint Conference on Computational Intelligence, pages 288-299. November 2023. doi:10.5220/0012204900003595. [PDF] [BIB] [DOI] [CODE]
- Vincent A. Cicirello. On Fitness Landscape Analysis of Permutation Problems: From Distance Metrics to Mutation Operator Selection. Mobile Networks and Applications, 28(2): 507-517, April 2023. doi:10.1007/s11036-022-02060-z. [PDF] [BIB] [DOI] [PUB] [arXiv] [CODE]
- Vincent A. Cicirello. Cycle Mutation: Evolving Permutations via Cycle Induction. Applied Sciences, 12(11), Article 5506, June 2022. doi:10.3390/app12115506. [PDF] [BIB] [DOI] [arXiv]
- Vincent A. Cicirello. Kendall Tau Sequence Distance: Extending Kendall Tau from Ranks to Sequences. Industrial Networks and Intelligent Systems, 7(23), e1, April 2020. [PDF] [BIB] [DOI]
- Vincent A. Cicirello. Kendall Tau Sequence Distance: Extending Kendall Tau from Ranks to Sequences. arXiv preprint arXiv:1905.02752 [cs.DM], May 2019. [PDF] [BIB] [arXiv]
- Vincent A. Cicirello. Classification of Permutation Distance Metrics for Fitness Landscape Analysis. In Proceedings of the 11th International Conference on Bio-inspired Information and Communications Technologies. ICST, March 2019. [PDF] [BIB]
- Vincent A. Cicirello. The Permutation in a Haystack Problem and the Calculus of Search Landscapes. IEEE Transactions on Evolutionary Computation, 20(3):434-446, June 2016. [PDF] [BIB] [DOI]
- Vincent A. Cicirello. On the Effects of Window-Limits on the Distance Profiles of Permutation Neighborhood Operators. In Proceedings of the 8th International Conference on Bio-inspired Information and Communications Technologies, pages 28-35. ICST, December 2014. [PDF] [BIB] [PUB]
- Vincent A. Cicirello and Robert Cernera. Profiling the Distance Characteristics of Mutation Operators for Permutation-Based Genetic Algorithms. In Proceedings of the Twenty-Sixth International Florida Artificial Intelligence Research Society Conference, pages 46-51. AAAI Press, May 2013. [PDF] [BIB] [PUB]