Details
Description
Summary
Add methods to Integer
and Long
to compress bits and expand bits, see Hacker's Delight (2nd edition), section 7.4.
Problem
Compressing or expanding bits of an int
or long
value can be composed to enable general permutations, and the "sheep and goats" operation (SAG) see Hacker's Delight (2nd edition), section 7.7. SAG can be used to perform a stable binary radix sort.
The compress and expand functionality maps efficiently to hardware instructions, such as PEXT
and PDEP
on x86 hardware. Thus the implementations can be very efficient on supporting hardware.
This paper investigates the beneficial performance impact of the PDEP
instruction, and by extension the expand
method, when applied to the implementation of a bitvector select
operation in succinct data structures (for example select(r)
returns the position of the r
th 1).
Such fundamental and useful operations, sedimented into modern hardware, are excellent candidates to be supported by the Java platform.
Solution
Add methods to compress bits and expand bits, given a bit mask.
Specification
On Integer
:
/**
* Returns the value obtained by compressing the bits of the
* specified {@code int} value, {@code i}, in accordance with
* the specified bit mask.
* <p>
* For each onebit value {@code mb} of the mask, from least
* significant to most significant, the bit value of {@code i} at
* the same bit location as {@code mb} is assigned to the compressed
* value contiguously starting from the least significant bit location.
* All the upper remaining bits of the compressed value are set
* to zero.
*
* @apiNote
* Consider the simple case of compressing the digits of a hexadecimal
* value:
* {@snippet lang="java" :
* // Compressing drink to food
* compress(0xCAFEBABE, 0xFF00FFF0) == 0xCABAB
* }
* Starting from the least significant hexadecimal digit at position 0
* from the right, the mask {@code 0xFF00FFF0} selects hexadecimal digits
* at positions 1, 2, 3, 6 and 7 of {@code 0xCAFEBABE}. The selected digits
* occur in the resulting compressed value contiguously from digit position
* 0 in the same order.
* <p>
* The following identities all return {@code true} and are helpful to
* understand the behaviour of {@code compress}:
* {@snippet lang="java" :
* // Returns 1 if the bit at position n is one
* compress(x, 1 << n) == (x >> n & 1)
*
* // Logical shift right
* compress(x, 1 << n) == x >>> n
*
* // Any bits not covered by the mask are ignored
* compress(x, m) == compress(x & m, m)
*
* // Compressing a value by itself
* compress(m, m) == (m == 1  m == 0) ? m : (1 << bitCount(m))  1
*
* // Expanding then compressing with the same mask
* compress(expand(x, m), m) == x & compress(m, m)
* }
* <p>
* The Sheep And Goats (SAG) operation (see Hacker's Delight, section 7.7)
* can be implemented as follows:
* {@snippet lang="java" :
* int compressLeft(int i, int mask) {
* // This implementation follows the description in Hacker's Delight which
* // is informative. A more optimal implementation is:
* // Integer.compress(i, mask) << Integer.bitCount(mask)
* return Integer.reverse(
* Integer.compress(Integer.reverse(i), Integer.reverse(mask)));
* }
*
* int sag(int i, int mask) {
* return compressLeft(i, mask)  Integer.compress(i, ~mask);
* }
*
* // Separate the sheep from the goats
* sag(0xCAFEBABE, 0xFF00FFF0) == 0xCABABFEE
* }
*
* @param i the value whose bits are to be compressed
* @param mask the bit mask
* @return the compressed value
* @see #expand
* @since 19
*/
public static int compress(int i, int mask)
/**
* Returns the value obtained by expanding the bits of the
* specified {@code int} value, {@code i}, in accordance with
* the specified bit mask.
* <p>
* For each onebit value {@code mb} of the mask, from least
* significant to most significant, the next contiguous bit value
* of {@code i} starting at the least significant bit is assigned
* to the expanded value at the same bit location as {@code mb}.
* All other remaining bits of the expanded value are set to zero.
*
* @apiNote
* Consider the simple case of expanding the digits of a hexadecimal
* value:
* {@snippet lang="java" :
* expand(0x0000CABAB, 0xFF00FFF0) == 0xCA00BAB0
* }
* Starting from the least significant hexadecimal digit at position 0
* from the right, the mask {@code 0xFF00FFF0} selects the first five
* hexadecimal digits of {@code 0x0000CABAB}. The selected digits occur
* in the resulting expanded value in order at positions 1, 2, 3, 6, and 7.
* <p>
* The following identities all return {@code true} and are helpful to
* understand the behaviour of {@code expand}:
* {@snippet lang="java" :
* // Logically shift right the bit at position 0
* expand(x, 1 << n) == (x & 1) << n
*
* // Logically shift right
* expand(x, 1 << n) == x << n
*
* // Expanding all bits returns the mask
* expand(1, m) == m
*
* // Any bits not covered by the mask are ignored
* expand(x, m) == expand(x, m) & m
*
* // Compressing then expanding with the same mask
* expand(compress(x, m), m) == x & m
* }
* <p>
* The select operation for determining the position of the onebit with
* index {@code n} in a {@code int} value can be implemented as follows:
* {@snippet lang="java" :
* int select(int i, int n) {
* // the onebit in i (the mask) with index n
* int nthBit = Integer.expand(1 << n, i);
* // the bit position of the onebit with index n
* return Integer.numberOfTrailingZeros(nthBit);
* }
*
* // The onebit with index 0 is at bit position 1
* select(0b10101010_10101010, 0) == 1
* // The onebit with index 3 is at bit position 7
* select(0b10101010_10101010, 3) == 7
* }
*
* @param i the value whose bits are to be expanded
* @param mask the bit mask
* @return the expanded value
* @see #compress
* @since 19
*/
public static int expand(int i, int mask)
On Long
:
/**
* Returns the value obtained by compressing the bits of the
* specified {@code long} value, {@code i}, in accordance with
* the specified bit mask.
* <p>
* For each onebit value {@code mb} of the mask, from least
* significant to most significant, the bit value of {@code i} at
* the same bit location as {@code mb} is assigned to the compressed
* value contiguously starting from the least significant bit location.
* All the upper remaining bits of the compressed value are set
* to zero.
*
* @apiNote
* Consider the simple case of compressing the digits of a hexadecimal
* value:
* {@snippet lang="java" :
* // Compressing drink to food
* compress(0xCAFEBABE, 0xFF00FFF0) == 0xCABAB
* }
* Starting from the least significant hexadecimal digit at position 0
* from the right, the mask {@code 0xFF00FFF0} selects hexadecimal digits
* at positions 1, 2, 3, 6 and 7 of {@code 0xCAFEBABE}. The selected digits
* occur in the resulting compressed value contiguously from digit position
* 0 in the same order.
* <p>
* The following identities all return {@code true} and are helpful to
* understand the behaviour of {@code compress}:
* {@snippet lang="java" :
* // Returns 1 if the bit at position n is one
* compress(x, 1 << n) == (x >> n & 1)
*
* // Logical shift right
* compress(x, 1 << n) == x >>> n
*
* // Any bits not covered by the mask are ignored
* compress(x, m) == compress(x & m, m)
*
* // Compressing a value by itself
* compress(m, m) == (m == 1  m == 0) ? m : (1 << bitCount(m))  1
*
* // Expanding then compressing with the same mask
* compress(expand(x, m), m) == x & compress(m, m)
* }
* <p>
* The Sheep And Goats (SAG) operation (see Hacker's Delight, section 7.7)
* can be implemented as follows:
* {@snippet lang="java" :
* long compressLeft(long i, long mask) {
* // This implementation follows the description in Hacker's Delight which
* // is informative. A more optimal implementation is:
* // Long.compress(i, mask) << Long.bitCount(mask)
* return Long.reverse(
* Long.compress(Long.reverse(i), Long.reverse(mask)));
* }
*
* long sag(long i, long mask) {
* return compressLeft(i, mask)  Long.compress(i, ~mask);
* }
*
* // Separate the sheep from the goats
* sag(0xCAFEBABE, 0xFF00FFF0) == 0xCABABFEE
* }
*
* @param i the value whose bits are to be compressed
* @param mask the bit mask
* @return the compressed value
* @see #expand
* @since 19
*/
public static long compress(long i, long mask)
/**
* Returns the value obtained by expanding the bits of the
* specified {@code long} value, {@code i}, in accordance with
* the specified bit mask.
* <p>
* For each onebit value {@code mb} of the mask, from least
* significant to most significant, the next contiguous bit value
* of {@code i} starting at the least significant bit is assigned
* to the expanded value at the same bit location as {@code mb}.
* All other remaining bits of the expanded value are set to zero.
*
* @apiNote
* Consider the simple case of expanding the digits of a hexadecimal
* value:
* {@snippet lang="java" :
* expand(0x0000CABAB, 0xFF00FFF0) == 0xCA00BAB0
* }
* Starting from the least significant hexadecimal digit at position 0
* from the right, the mask {@code 0xFF00FFF0} selects the first five
* hexadecimal digits of {@code 0x0000CABAB}. The selected digits occur
* in the resulting expanded value in order at positions 1, 2, 3, 6, and 7.
* <p>
* The following identities all return {@code true} and are helpful to
* understand the behaviour of {@code expand}:
* {@snippet lang="java" :
* // Logically shift right the bit at position 0
* expand(x, 1 << n) == (x & 1) << n
*
* // Logically shift right
* expand(x, 1 << n) == x << n
*
* // Expanding all bits returns the mask
* expand(1, m) == m
*
* // Any bits not covered by the mask are ignored
* expand(x, m) == expand(x, m) & m
*
* // Compressing then expanding with the same mask
* expand(compress(x, m), m) == x & m
* }
* <p>
* The select operation for determining the position of the onebit with
* index {@code n} in a {@code long} value can be implemented as follows:
* {@snippet lang="java" :
* long select(long i, long n) {
* // the onebit in i (the mask) with index n
* long nthBit = Long.expand(1 << n, i);
* // the bit position of the onebit with index n
* return Long.numberOfTrailingZeros(nthBit);
* }
*
* // The onebit with index 0 is at bit position 1
* select(0b10101010_10101010, 0) == 1
* // The onebit with index 3 is at bit position 7
* select(0b10101010_10101010, 3) == 7
* }
*
* @param i the value whose bits are to be expanded
* @param mask the bit mask
* @return the expanded value
* @see #compress
* @since 19
*/
public static long expand(long i, long mask)
Attachments
Issue Links
 csr of

JDK8283892 Compress and expand bits
 Resolved