Binary search is useful, but hard to implement correctly. So, helpful library support should be provided.
JDK provides some support for binary search with overloaded Collections.binarySearch and Arrays.binarySearch methods. While these methods are more useful than typical analogues elsewhere, JDK can do better still. There are two aspects that can be improved:
- binarySearch assumes that a client has a list or an array. This is needlessly restrictive. It's enough to know the bounds of an integer range in which to perform binary search.
- binarySearch returns the index of a found element. This falls short if that element is non-unique and a particular occurrence is required.
Both aspects can be addressed by adding a method with the following specification:
/**
* Searches a range for the smallest number that matches a predicate.
*
* <p> The predicate must be such that if there exists a number {@code i},
* {@code from <= i < to}, that matches the predicate, then all numbers
* {@code j}, {@code i < j < to}, must also match the predicate. If the
* predicate doesn't meet this criteria, then the result is unspecified.
*
* <p> If there doesn't exist such a number {@code i}, {@code from <= i < to},
* that matches the predicate, then this method returns {@code to}.
* Otherwise, this method returns {@code i} such that there exists no
* {@code j}, {@code from <= j < i}, that also matches the predicate.
*
* @param from inclusive
* @param to exclusive
* @param p predicate
* @throws IllegalArgumentException if {@code from > to}
* @throws NullPointerException if {@code p == null}
* @return the smallest number {@code i}, {@code from <= i < to}, such that
* {@code p.test(i) == true}; otherwise {@code to},
* if there's no such number
*/
public static int binarySearch(int from, int to, IntPredicate p)
Usage examples of the proposed method:
- Find the leftmost key (analogous to std::lower_bound)
public static int lowerBound(int[] a, int fromIndex, int toIndex, int key) {
return binarySearch(fromIndex, toIndex, i -> key <= a[i]);
}
- Find the rightmost key (analogous to std::upper_bound)
public static int upperBound(int[] a, int fromIndex, int toIndex, int key) {
return binarySearch(fromIndex, toIndex, i -> key < a[i]);
}
- Find the range within which all elements are key (analogous to std::equal_range)
public record Range(int from, int to) { }
public static Range equalRange(int[] a, int fromIndex, int toIndex, int key) {
int lo = lowerBound(a, fromIndex, toIndex, key);
int hi = upperBound(a, fromIndex, toIndex, key);
return new Range(lo, hi);
}
- Find the key or the insertion point (mimics existing Arrays.binarySearch, with the difference of finding the leftmost key, if any)
public static int binarySearch(int[] a, int fromIndex, int toIndex, int key) {
int insertion_point = binarySearch(fromIndex, toIndex, i -> key <= a[i]);
return insertion_point != a.length && a[insertion_point] == key
? insertion_point : -insertion_point - 1;
}
Here are some issues with the proposed method:
- It's not clear how to name and where to put it. There's nothing in the signature that binds the method to Collections or Arrays.
- It's not clear whether the method should accept negative from and to. Again, no lists or arrays are involved, so no 0-based indexing is required.
- Which overloads, if any, should the method have? For example, this overload is probably a good idea:
public static long binarySearch(long from, long to, LongPredicate p)
Are there any other useful overloads?
JDK provides some support for binary search with overloaded Collections.binarySearch and Arrays.binarySearch methods. While these methods are more useful than typical analogues elsewhere, JDK can do better still. There are two aspects that can be improved:
- binarySearch assumes that a client has a list or an array. This is needlessly restrictive. It's enough to know the bounds of an integer range in which to perform binary search.
- binarySearch returns the index of a found element. This falls short if that element is non-unique and a particular occurrence is required.
Both aspects can be addressed by adding a method with the following specification:
/**
* Searches a range for the smallest number that matches a predicate.
*
* <p> The predicate must be such that if there exists a number {@code i},
* {@code from <= i < to}, that matches the predicate, then all numbers
* {@code j}, {@code i < j < to}, must also match the predicate. If the
* predicate doesn't meet this criteria, then the result is unspecified.
*
* <p> If there doesn't exist such a number {@code i}, {@code from <= i < to},
* that matches the predicate, then this method returns {@code to}.
* Otherwise, this method returns {@code i} such that there exists no
* {@code j}, {@code from <= j < i}, that also matches the predicate.
*
* @param from inclusive
* @param to exclusive
* @param p predicate
* @throws IllegalArgumentException if {@code from > to}
* @throws NullPointerException if {@code p == null}
* @return the smallest number {@code i}, {@code from <= i < to}, such that
* {@code p.test(i) == true}; otherwise {@code to},
* if there's no such number
*/
public static int binarySearch(int from, int to, IntPredicate p)
Usage examples of the proposed method:
- Find the leftmost key (analogous to std::lower_bound)
public static int lowerBound(int[] a, int fromIndex, int toIndex, int key) {
return binarySearch(fromIndex, toIndex, i -> key <= a[i]);
}
- Find the rightmost key (analogous to std::upper_bound)
public static int upperBound(int[] a, int fromIndex, int toIndex, int key) {
return binarySearch(fromIndex, toIndex, i -> key < a[i]);
}
- Find the range within which all elements are key (analogous to std::equal_range)
public record Range(int from, int to) { }
public static Range equalRange(int[] a, int fromIndex, int toIndex, int key) {
int lo = lowerBound(a, fromIndex, toIndex, key);
int hi = upperBound(a, fromIndex, toIndex, key);
return new Range(lo, hi);
}
- Find the key or the insertion point (mimics existing Arrays.binarySearch, with the difference of finding the leftmost key, if any)
public static int binarySearch(int[] a, int fromIndex, int toIndex, int key) {
int insertion_point = binarySearch(fromIndex, toIndex, i -> key <= a[i]);
return insertion_point != a.length && a[insertion_point] == key
? insertion_point : -insertion_point - 1;
}
Here are some issues with the proposed method:
- It's not clear how to name and where to put it. There's nothing in the signature that binds the method to Collections or Arrays.
- It's not clear whether the method should accept negative from and to. Again, no lists or arrays are involved, so no 0-based indexing is required.
- Which overloads, if any, should the method have? For example, this overload is probably a good idea:
public static long binarySearch(long from, long to, LongPredicate p)
Are there any other useful overloads?
- relates to
-
JDK-8271168 Add canned midpoint methods for use by core libs code
- New
- links to