Summary
Provide a simple, customizable hexdump facility for displaying binary data
Problem
For many years, Java applications have used the unsupported JDK class sun.misc.HexDumpEncoder to transform a byte array into a human-readable representation. The format generated by HexDumpEncoder also includes the ASCII representation of the bytes, which is helpful when analyzing binary data. Although HexDumpEncoder was in an unsupported sun.* package, applications could still access it because access restrictions were not enforced.
Solution
Provide a human-readable representation of binary data via the new java.util.HexFormat class.
The HexFormat class reinstates the previous hexadecimal dump capability. It provides static methods for converting between binary data and hexadecimal strings, and several dump methods that generate the classic Unix *hexdump* format. The output format is customizable via the HexFormat.Formatter interface.
Specification
jdk/src/java.base/share/classes/java/util/HexFormat.java
/**
* Converts binary data to and from its hexadecimal (base 16) string
* representation. It can also generate the classic Unix {@code hexdump(1)}
* format.
* <p>
* <b>Example usages:</b>
* <pre>{@code // Initialize a 15-byte array from a hexadecimal string
* byte[] bytes = HexFormat.fromString("a1a2a3a4a5a6a7a8a9aaabacadaeaf");
*
* // Display the hexadecimal representation of a file's 256-bit hash code
* MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
* System.out.println(
* HexFormat.toString(sha256.digest(Files.readAllBytes(Paths.get("mydata")))));
*
* // Write the printable representation of a file to the standard output stream
* // in 64-byte chunks formatted according to the supplied Formatter function
* try (InputStream is = Files.newInputStream(Paths.get("mydata"))) {
* HexFormat.dumpAsStream(is, 64,
* (offset, chunk, fromIndex, toIndex) ->
* String.format("%d %s",
* offset / 64 + 1,
* HexFormat.toPrintableString(chunk, fromIndex, toIndex)))
* .forEachOrdered(System.out::println);
* } catch (IOException ioe) {
* ...
* }
*
* // Write the standard input stream to the standard output stream in hexdump format
* HexFormat.dump(System.in, System.out);
* }</pre>
*
* @since 12
*/
public final class HexFormat {
/**
* A formatter that generates the classic Unix {@code hexdump(1)} format.
* It behaves <i>as if</i>:
* <pre>{@code
* String.format("%08x %s |%s|",
* offset,
* HexFormat.toFormattedString(chunk, from, to),
* HexFormat.toPrintableString(chunk, from, to));
* }</pre>
*/
public static final Formatter HEXDUMP_FORMATTER = new Formatter() { };
/**
* Returns a hexadecimal string representation of the contents of the
* provided byte array, with no additional formatting.
* <p>
* The binary value is converted to a string comprising pairs of
* hexadecimal digits that use only the following ASCII characters:
* <blockquote>
* {@code 0123456789abcdef}
* </blockquote>
*
* @param bytes a byte buffer
* @return a hexadecimal string representation of the byte buffer.
* The string length is twice the buffer length.
* @throws NullPointerException if {@code bytes} is {@code null}
*/
public static String toString(byte[] bytes) { }
/**
* Returns a hexadecimal string representation of a <i>range</i> within the
* provided byte array, with no additional formatting.
* <p>
* The binary value is converted to a string comprising pairs of
* hexadecimal digits that use only the following ASCII characters:
* <blockquote>
* {@code 0123456789abcdef}
* </blockquote>
* The range to be converted extends from index {@code fromIndex},
* inclusive, to index {@code toIndex}, exclusive.
* (If {@code fromIndex==toIndex}, the range to be converted is empty.)
*
* @param bytes a byte buffer
* @param fromIndex the index of the first byte (inclusive) to be converted
* @param toIndex the index of the last byte (exclusive) to be converted
* @return a hexadecimal string representation of the byte buffer.
* The string length is twice the number of bytes converted.
* @throws NullPointerException if {@code bytes} is {@code null}
* @throws IllegalArgumentException if {@code fromIndex > toIndex}
* @throws ArrayIndexOutOfBoundsException
* if {@code fromIndex < 0} or {@code toIndex > bytes.length}
*/
public static String toString(byte[] bytes, int fromIndex, int toIndex) { }
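The conversion described for the two toString overloads can be illustrated with a minimal sketch. The class name HexStringSketch and the method name toHex are hypothetical and exist only for this illustration; the range checks mirror the exceptions listed in the specification above, but this is not the proposed implementation.

```java
import java.util.Objects;

// Hypothetical illustration class; not part of the proposed API.
final class HexStringSketch {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    /** Encodes bytes[fromIndex, toIndex) as pairs of lowercase hex digits. */
    static String toHex(byte[] bytes, int fromIndex, int toIndex) {
        Objects.requireNonNull(bytes, "bytes");
        if (fromIndex > toIndex) {
            throw new IllegalArgumentException("fromIndex > toIndex");
        }
        if (fromIndex < 0 || toIndex > bytes.length) {
            throw new ArrayIndexOutOfBoundsException("range out of bounds");
        }
        char[] out = new char[(toIndex - fromIndex) * 2];
        int pos = 0;
        for (int i = fromIndex; i < toIndex; i++) {
            int v = bytes[i] & 0xff;        // treat the byte as unsigned
            out[pos++] = DIGITS[v >>> 4];   // high nibble
            out[pos++] = DIGITS[v & 0x0f];  // low nibble
        }
        return new String(out);
    }

    /** Encodes the whole array; the result is twice the array length. */
    static String toHex(byte[] bytes) {
        Objects.requireNonNull(bytes, "bytes");
        return toHex(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xa1, 0x0f, 0x7f};
        System.out.println(toHex(data));        // prints a10f7f
        System.out.println(toHex(data, 1, 2));  // prints 0f
    }
}
```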
/**
* Returns a formatted hexadecimal string representation of the contents of
* the provided byte array.
* <p>
* The binary value is converted to a string in the canonical hexdump
* format of two columns of eight space-separated pairs of hexadecimal
* digits that use only the following ASCII characters:
* <blockquote>
* {@code 0123456789abcdef}
* </blockquote>
* <p>
* If the number of bytes to be converted is greater than 16 then
* {@link System#lineSeparator()} characters are inserted after each 16-byte chunk.
* If the final chunk is less than 16 bytes then the result is padded with spaces
* to match the length of the preceding chunks.
* The general output format is as follows:
* <pre>
* 00 11 22 33 44 55 66 77  88 99 aa bb cc dd ee ff
* </pre>
*
* @param bytes a byte buffer
* @return a formatted hexadecimal string representation of the byte buffer
* @throws NullPointerException if {@code bytes} is {@code null}
*/
public static String toFormattedString(byte[] bytes) { }
/**
* Returns a formatted hexadecimal string representation of the contents of
* a <i>range</i> within the provided byte array.
* <p>
* The binary value is converted to a string in the canonical hexdump
* format of two columns of eight space-separated pairs of hexadecimal
* digits that use only the following ASCII characters:
* <blockquote>
* {@code 0123456789abcdef}
* </blockquote>
* <p>
* The range to be converted extends from index {@code fromIndex},
* inclusive, to index {@code toIndex}, exclusive.
* (If {@code fromIndex==toIndex}, the range to be converted is empty.)
* <p>
* If the number of bytes to be converted is greater than 16 then
* {@link System#lineSeparator()} characters are inserted after each 16-byte chunk.
* If the final chunk is less than 16 bytes then the result is padded with spaces
* to match the length of the preceding chunks.
* The general output format is as follows:
* <pre>
* 00 11 22 33 44 55 66 77  88 99 aa bb cc dd ee ff
* </pre>
*
* @param bytes a byte buffer
* @param fromIndex the index of the first byte (inclusive) to be converted
* @param toIndex the index of the last byte (exclusive) to be converted
* @return a formatted hexadecimal string representation of the byte buffer
* @throws NullPointerException if {@code bytes} is {@code null}
* @throws IllegalArgumentException if {@code fromIndex > toIndex}
* @throws ArrayIndexOutOfBoundsException
* if {@code fromIndex < 0} or {@code toIndex > bytes.length}
*/
public static String toFormattedString(byte[] bytes, int fromIndex,
int toIndex) { }
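As a rough illustration of the row layout described for toFormattedString, the sketch below formats a single chunk of at most 16 bytes and pads a short chunk with spaces. The names HexRowSketch and formatRow are hypothetical, and the two-space gap between the two columns of eight is an assumption borrowed from the canonical hexdump layout, not something the specification states explicitly.

```java
// Hypothetical illustration class; not part of the proposed API.
final class HexRowSketch {
    private static final int ROW = 16;

    /**
     * Formats bytes[from, to) (at most 16 bytes) as two columns of eight
     * space-separated hex pairs. Missing bytes are rendered as spaces so
     * that every row has the same width (48 characters).
     * Assumption: the two columns are separated by two spaces.
     */
    static String formatRow(byte[] bytes, int from, int to) {
        if (from < 0 || to > bytes.length || from > to || to - from > ROW) {
            throw new IllegalArgumentException("invalid row range");
        }
        StringBuilder sb = new StringBuilder(48);
        for (int i = 0; i < ROW; i++) {
            if (i == 8) {
                sb.append("  ");            // gap between the two columns
            } else if (i > 0) {
                sb.append(' ');             // separator within a column
            }
            int idx = from + i;
            sb.append(idx < to ? String.format("%02x", bytes[idx] & 0xff) : "  ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = new byte[16];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 0x11);    // 00, 11, 22, ... ff
        }
        System.out.println(formatRow(data, 0, 16));
        System.out.println("[" + formatRow(data, 0, 3) + "]"); // padded row
    }
}
```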
/**
* Returns a printable representation of the contents of the
* provided byte array.
* <p>
* The binary value is converted to a string comprising printable
* {@link StandardCharsets#ISO_8859_1}
* characters, or {@code '.'} if the byte maps to a non-printable character.
* A non-printable character is one outside of the ranges
* {@code '\u005Cu0020'} through {@code '\u005Cu007E'} and
* {@code '\u005Cu00A0'} through {@code '\u005Cu00FF'}.
*
* @param bytes a byte buffer
* @return a printable representation of the byte buffer
* @throws NullPointerException if {@code bytes} is {@code null}
*/
public static String toPrintableString(byte[] bytes) { }
/**
* Returns a printable representation of the contents of a
* <i>range</i> within the provided byte array.
* <p>
* The binary value is converted to a string comprising printable
* {@link StandardCharsets#ISO_8859_1}
* characters, or {@code '.'} if the byte maps to a non-printable character.
* A non-printable character is one outside of the ranges
* {@code '\u005Cu0020'} through {@code '\u005Cu007E'} and
* {@code '\u005Cu00A0'} through {@code '\u005Cu00FF'}.
*
* @param bytes a byte buffer
* @param fromIndex the index of the first byte (inclusive) to be converted
* @param toIndex the index of the last byte (exclusive) to be converted
* @return a printable representation of the byte buffer
* @throws NullPointerException if {@code bytes} is {@code null}
* @throws IllegalArgumentException if {@code fromIndex > toIndex}
* @throws ArrayIndexOutOfBoundsException
* if {@code fromIndex < 0} or {@code toIndex > bytes.length}
*/
public static String toPrintableString(byte[] bytes, int fromIndex,
int toIndex) { }
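The printable mapping described for toPrintableString can be sketched as follows. The names PrintableSketch and toPrintable are hypothetical; the sketch relies only on the fact that an ISO 8859-1 byte value equals the code point of its character.

```java
// Hypothetical illustration class; not part of the proposed API.
final class PrintableSketch {

    /**
     * Maps each byte to its ISO 8859-1 character when it lies in
     * 0x20..0x7E or 0xA0..0xFF, and to '.' otherwise.
     */
    static String toPrintable(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            int v = b & 0xff;  // unsigned value, equal to the ISO 8859-1 code point
            boolean printable = (v >= 0x20 && v <= 0x7e) || v >= 0xa0;
            sb.append(printable ? (char) v : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'H', 'i', line feed, DEL, and a C1 control byte
        byte[] data = {0x48, 0x69, 0x0a, 0x7f, (byte) 0x80};
        System.out.println(toPrintable(data));  // prints Hi...
    }
}
```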
/**
* Returns a byte array containing the provided sequence of hexadecimal
* digits. The sequence may be prefixed with the hexadecimal indicator
* {@code "0x"}.
* <p>
* The binary value is generated from pairs of hexadecimal digits that use
* only the following ASCII characters:
* <blockquote>
* {@code 0123456789abcdefABCDEF}
* </blockquote>
*
* @param hexString an even-length sequence of hexadecimal digits
* @return a byte buffer
* @throws IllegalArgumentException if {@code hexString} has an odd number
* of digits or contains an illegal hexadecimal character
* @throws NullPointerException if {@code hexString} is {@code null}
*/
public static byte[] fromString(CharSequence hexString) { }
/**
* Returns a byte array containing a <i>range</i> within the provided
* sequence of hexadecimal digits. The sequence may be prefixed with the
* hexadecimal indicator {@code "0x"}.
* <p>
* The binary value is generated from pairs of hexadecimal digits that use
* only the following ASCII characters:
* <blockquote>
* {@code 0123456789abcdefABCDEF}
* </blockquote>
*
* @param hexString an even-length sequence of hexadecimal digits
* @param fromIndex the index of the first digit (inclusive) to be converted
* @param toIndex the index of the last digit (exclusive) to be converted
* @return a byte buffer
* @throws IllegalArgumentException if {@code hexString} has an odd number
* of digits or contains an illegal hexadecimal character,
* or if {@code fromIndex > toIndex}
* @throws NullPointerException if {@code hexString} is {@code null}
* @throws ArrayIndexOutOfBoundsException
* if {@code fromIndex < 0} or {@code toIndex > hexString.length()}
*/
public static byte[] fromString(CharSequence hexString, int fromIndex,
int toIndex) { }
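The parsing rules described for fromString (optional "0x" prefix, an even number of ASCII hexadecimal digits, either letter case) might look like the sketch below. The names HexParseSketch and parseHex are hypothetical, and treating only a lowercase "0x" as the prefix is an assumption, since the specification does not say whether "0X" is accepted.

```java
import java.util.Objects;

// Hypothetical illustration class; not part of the proposed API.
final class HexParseSketch {

    /** Returns the value of one ASCII hex digit, rejecting anything else. */
    private static int digit(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        throw new IllegalArgumentException("Illegal hexadecimal character: " + c);
    }

    /** Decodes pairs of hex digits, skipping an optional leading "0x". */
    static byte[] parseHex(CharSequence hex) {
        Objects.requireNonNull(hex, "hex");
        // Assumption: only the lowercase prefix "0x" is recognized.
        boolean prefixed = hex.length() >= 2
                && hex.charAt(0) == '0' && hex.charAt(1) == 'x';
        int start = prefixed ? 2 : 0;
        int digits = hex.length() - start;
        if ((digits & 1) != 0) {
            throw new IllegalArgumentException("Odd number of hexadecimal digits");
        }
        byte[] out = new byte[digits / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = digit(hex.charAt(start + 2 * i));
            int lo = digit(hex.charAt(start + 2 * i + 1));
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // 30 hexadecimal digits decode to 15 bytes
        System.out.println(parseHex("a1a2a3a4a5a6a7a8a9aaabacadaeaf").length); // prints 15
        System.out.println(parseHex("0xCAFE").length);                         // prints 2
    }
}
```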
/**
* Generates a dump of the contents of the provided input stream, as a
* stream of hexadecimal strings in hexdump format.
* This method outputs the same format as
* {@link #dump(byte[],OutputStream)},
* without the {@link System#lineSeparator()} characters.
* <p>
* If the input is not a multiple of 16 bytes then the final chunk will
* be shorter than the preceding chunks. The result will be padded with
* spaces to match the length of the preceding chunks.
* <p>
* On return, the input stream will be at end-of-stream.
* This method does not close the input stream and may block indefinitely
* reading from it. The behavior for the case where it is
* <i>asynchronously closed</i>, or the thread interrupted,
* is highly input stream specific, and therefore not specified.
* <p>
* If an I/O error occurs reading from the input stream then it may not be
* at end-of-stream and may be in an inconsistent state. It is strongly
* recommended that the input stream be promptly closed if an I/O error
* occurs.
*
* @param in the input stream, non-null
* @return a new infinite sequential ordered stream of hexadecimal strings
*/
public static Stream<String> dumpAsStream(InputStream in) { }
/**
* Generates a dump of the contents of the provided input stream, as a
* stream of formatted hexadecimal strings. Each string is formatted
* according to the {@code formatter} function, if present. Otherwise,
* this method outputs the same format as
* {@link #dump(byte[],OutputStream)},
* without the {@link System#lineSeparator()} characters.
* <p>
* On return, the input stream will be at end-of-stream.
* This method does not close the input stream and may block indefinitely
* reading from it. The behavior for the case where it is
* <i>asynchronously closed</i>, or the thread interrupted,
* is highly input stream specific, and therefore not specified.
* <p>
* If an I/O error occurs reading from the input stream then it may not be
* at end-of-stream and may be in an inconsistent state. It is strongly
* recommended that the input stream be promptly closed if an I/O error
* occurs.
* <p>
* If an error occurs in the {@code formatter} then an unchecked exception
* will be thrown from the underlying {@code Stream} method.
*
* @param in the input stream, non-null
* @param chunkSize the number of bytes-per-chunk (typically 16)
* @param formatter a hexdump formatting function, or {@code null}
* @return a new infinite sequential ordered stream of hexadecimal strings
* @throws NullPointerException if {@code in} is {@code null}
*/
public static Stream<String> dumpAsStream(InputStream in, int chunkSize,
Formatter formatter) { }
/**
* Generates a dump of the contents of the provided byte array, as a stream
* of hexadecimal strings in hexdump format.
* This method outputs the same format as
* {@link #dump(byte[],OutputStream)},
* without the {@link System#lineSeparator()} characters.
* <p>
* If the input is not a multiple of 16 bytes then the final chunk will
* be shorter than the preceding chunks. The result will be padded with
* spaces to match the length of the preceding chunks.
*
* @param bytes a byte buffer, assumed to be unmodified during use
* @return a new sequential ordered stream of hexadecimal strings
* @throws NullPointerException if {@code bytes} is {@code null}
*/
public static Stream<String> dumpAsStream(byte[] bytes) { }
/**
* Generates a dump of the contents of a <i>range</i> within the provided
* byte array, as a stream of formatted hexadecimal strings. Each string is
* formatted according to the {@code formatter} function, if present.
* Otherwise, this method outputs the same format as
* {@link #dump(byte[],OutputStream)},
* without the {@link System#lineSeparator()} characters.
* <p>
* The range to be converted extends from index {@code fromIndex},
* inclusive, to index {@code toIndex}, exclusive.
* (If {@code fromIndex==toIndex}, the range to be converted is empty.)
* If the input is not a multiple of {@code chunkSize} then the final chunk
* will be shorter than the preceding chunks. The result may be padded with
* spaces to match the length of the preceding chunks.
* <p>
* If an error occurs in the {@code formatter} then an unchecked exception
* will be thrown from the underlying {@code Stream} method.
*
* @param bytes a byte buffer, assumed to be unmodified during use
* @param fromIndex the index of the first byte (inclusive) to be converted
* @param toIndex the index of the last byte (exclusive) to be converted
* @param chunkSize the number of bytes-per-chunk (typically 16)
* @param formatter a hexdump formatting function, or {@code null}
* @return a new sequential ordered stream of hexadecimal strings
* @throws NullPointerException if {@code bytes} is {@code null}
* @throws IllegalArgumentException if {@code fromIndex > toIndex}
* @throws ArrayIndexOutOfBoundsException
* if {@code fromIndex < 0} or {@code toIndex > bytes.length}
*/
public static Stream<String> dumpAsStream(byte[] bytes, int fromIndex,
int toIndex, int chunkSize, Formatter formatter) { }
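The chunking behavior described for the range-based dumpAsStream can be sketched with a lazily evaluated stream. Everything named here is hypothetical: ChunkStreamSketch, the ChunkFormatter interface (a stand-in for HexFormat.Formatter, whose exact signature is not shown in this specification), and the choice to report offsets relative to fromIndex.

```java
import java.util.Objects;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical illustration class; not part of the proposed API.
final class ChunkStreamSketch {

    /** Stand-in for the formatter callback; parameter types are assumed. */
    @FunctionalInterface
    interface ChunkFormatter {
        String format(long offset, byte[] chunk, int fromIndex, int toIndex);
    }

    /** Splits bytes[fromIndex, toIndex) into chunks and formats each lazily. */
    static Stream<String> chunks(byte[] bytes, int fromIndex, int toIndex,
                                 int chunkSize, ChunkFormatter formatter) {
        Objects.requireNonNull(bytes, "bytes");
        Objects.requireNonNull(formatter, "formatter");
        if (fromIndex > toIndex || chunkSize <= 0) {
            throw new IllegalArgumentException("invalid range or chunk size");
        }
        if (fromIndex < 0 || toIndex > bytes.length) {
            throw new ArrayIndexOutOfBoundsException("range out of bounds");
        }
        int length = toIndex - fromIndex;
        // Round up so that a short final chunk is included.
        int count = length / chunkSize + (length % chunkSize == 0 ? 0 : 1);
        return IntStream.range(0, count).mapToObj(i -> {
            int start = fromIndex + i * chunkSize;
            int end = Math.min(start + chunkSize, toIndex);
            // Assumption: the offset is relative to fromIndex.
            return formatter.format((long) i * chunkSize, bytes, start, end);
        });
    }

    public static void main(String[] args) {
        byte[] data = new byte[10];
        chunks(data, 0, data.length, 4,
                (offset, chunk, from, to) -> offset + ":" + (to - from))
                .forEachOrdered(System.out::println); // prints 0:4, 4:4, 8:2 on separate lines
    }
}
```

Because the formatter runs inside the stream pipeline, any unchecked exception it throws surfaces from the terminal operation, which matches the behavior the specification describes.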
/**
* Generates a dump of the contents of a <i>range</i> within the provided
* ByteBuffer, as a stream of formatted hexadecimal strings. Each string is
* formatted according to the {@code formatter} function, if present.
* Otherwise, this method outputs the same format as
* {@link #dump(byte[],OutputStream)},
* without the {@link System#lineSeparator()} characters.
* <p>
* The range to be converted extends from index {@code fromIndex},
* inclusive, to index {@code toIndex}, exclusive.
* (If {@code fromIndex==toIndex}, the range to be converted is empty.)
* If the input is not a multiple of {@code chunkSize} then the final chunk
* will be shorter than the preceding chunks. The result may be padded with
* spaces to match the length of the preceding chunks.
* <p>
* If an error occurs in the {@code formatter} then an unchecked exception
* will be thrown from the underlying {@code Stream} method.
*
* @param buffer a byte buffer, assumed to be unmodified during use
* @param fromIndex the index of the first byte (inclusive) to be converted
* @param toIndex the index of the last byte (exclusive) to be converted
* @param chunkSize the number of bytes-per-chunk (typically 16)
* @param formatter a hexdump formatting function, or {@code null}
* @return a new sequential ordered stream of hexadecimal strings
* @throws NullPointerException if {@code buffer} is {@code null}
* @throws IllegalArgumentException if {@code fromIndex > toIndex}
* @throws ArrayIndexOutOfBoundsException
* if {@code fromIndex < 0} or {@code toIndex > buffer.remaining()}
*/
public static Stream<String> dumpAsStream(ByteBuffer buffer, int fromIndex,
int toIndex, int chunkSize, Formatter formatter) { }
/**
* Generates a hexadecimal dump of the contents of the provided byte array
* and writes it to the provided output stream.
* This method behaves <i>as if</i>:
* <pre>{@code
* HexFormat.dumpAsStream(bytes, 0, bytes.length, 16,
* (offset, chunk, from, to) ->
* String.format("%08x %s |%s|",
* offset,
* HexFormat.toFormattedString(chunk, from, to),
* HexFormat.toPrintableString(chunk, from, to)))
* .forEachOrdered(new PrintStream(out)::println);
* }</pre>
* <p>
* This method does not close the output stream and may block indefinitely
* writing to it. The behavior for the case where it is
* <i>asynchronously closed</i>, or the thread interrupted,
* is highly output stream specific, and therefore not specified.
* <p>
* If an I/O error occurs writing to the output stream, then it may be
* in an inconsistent state. It is strongly recommended that the output
* stream be promptly closed if an I/O error occurs.
*
* @param bytes the byte buffer, assumed to be unmodified during use
* @param out the output stream, non-null
* @throws IOException if an I/O error occurs when writing
* @throws NullPointerException if {@code bytes} or {@code out} is
* {@code null}
*/
public static void dump(byte[] bytes, OutputStream out) throws IOException { }
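To show how the offset, hexadecimal columns and printable column combine into one dump line, here is a self-contained sketch that does not depend on the proposed class. The names HexDumpSketch, hexColumns and printable are hypothetical. The line format string is copied from the specification text above; the two-space gap between the hexadecimal columns and the ISO 8859-1 encoding of the output are assumptions.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of the proposed API.
final class HexDumpSketch {
    private static final int CHUNK = 16;

    /** Two columns of eight hex pairs, padded with spaces to a fixed width. */
    static String hexColumns(byte[] b, int from, int to) {
        StringBuilder sb = new StringBuilder(48);
        for (int i = 0; i < CHUNK; i++) {
            if (i == 8) {
                sb.append("  ");   // assumed two-space gap between columns
            } else if (i > 0) {
                sb.append(' ');
            }
            int idx = from + i;
            sb.append(idx < to ? String.format("%02x", b[idx] & 0xff) : "  ");
        }
        return sb.toString();
    }

    /** Printable ISO 8859-1 characters, with '.' for everything else. */
    static String printable(byte[] b, int from, int to) {
        StringBuilder sb = new StringBuilder(to - from);
        for (int i = from; i < to; i++) {
            int v = b[i] & 0xff;
            boolean ok = (v >= 0x20 && v <= 0x7e) || v >= 0xa0;
            sb.append(ok ? (char) v : '.');
        }
        return sb.toString();
    }

    /** Writes one line per 16-byte chunk; does not close the stream. */
    static void dump(byte[] bytes, OutputStream out) throws IOException {
        for (int off = 0; off < bytes.length; off += CHUNK) {
            int end = Math.min(off + CHUNK, bytes.length);
            String line = String.format("%08x %s |%s|",
                    off, hexColumns(bytes, off, end), printable(bytes, off, end));
            out.write((line + System.lineSeparator())
                    .getBytes(StandardCharsets.ISO_8859_1));
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        dump("Hello, hexdump!\n".getBytes(StandardCharsets.ISO_8859_1), System.out);
    }
}
```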
/**
* Generates a hexadecimal dump of the contents of a <i>range</i> within the
* provided byte array and writes it to the provided output stream.
* This method outputs the same format as
* {@link #dump(byte[],OutputStream)}.
* <p>
* The range to be converted extends from index {@code fromIndex},
* inclusive, to index {@code toIndex}, exclusive.
* (If {@code fromIndex==toIndex}, the range to be converted is empty.)
* <p>
* This method does not close the output stream and may block indefinitely
* writing to it. The behavior for the case where it is
* <i>asynchronously closed</i>, or the thread interrupted,
* is highly output stream specific, and therefore not specified.
* <p>
* If an I/O error occurs writing to the output stream, then it may be
* in an inconsistent state. It is strongly recommended that the output
* stream be promptly closed if an I/O error occurs.
*
* @param bytes the byte buffer, assumed to be unmodified during use
* @param fromIndex the index of the first byte (inclusive) to be converted
* @param toIndex the index of the last byte (exclusive) to be converted
* @param out the output stream, non-null
* @throws IOException if an I/O error occurs when writing
* @throws NullPointerException if {@code bytes} or {@code out} is
* {@code null}
* @throws IllegalArgumentException if {@code fromIndex > toIndex}
* @throws ArrayIndexOutOfBoundsException
* if {@code fromIndex < 0} or {@code toIndex > bytes.length}
*/
public static void dump(byte[] bytes, int fromIndex, int toIndex,
OutputStream out) throws IOException { }
/**
* Generates a hexadecimal dump of the contents of the provided input stream
* and writes it to the provided output stream.
* This method outputs the same format as
* {@link #dump(byte[],OutputStream)}.
* <p>
* Reads all bytes from the input stream.
* On return, the input stream will be at end-of-stream. This method does
* not close either stream and may block indefinitely reading from the
* input stream, or writing to the output stream. The behavior for the case
* where the input and/or output stream is <i>asynchronously closed</i>,
* or the thread interrupted, is highly input stream and output stream
* specific, and therefore not specified.
* <p>
* If an I/O error occurs reading from the input stream or writing to the
* output stream, then it may do so after some bytes have been read or
* written. Consequently the input stream may not be at end-of-stream and
* one, or both, streams may be in an inconsistent state. It is strongly
* recommended that both streams be promptly closed if an I/O error occurs.
*
* @param in the input stream, non-null
* @param out the output stream, non-null
* @throws IOException if an I/O error occurs when reading or writing
* @throws NullPointerException if {@code in} or {@code out} is {@code null}
*/
public static void dump(InputStream in, OutputStream out)
throws IOException { }
References
webrev: http://cr.openjdk.java.net/~vinnie/8170769/webrev.07/
javadoc: http://cr.openjdk.java.net/~vinnie/8170769/javadoc.07/api/java.base/java/util/HexFormat.html