JDK / JDK-8279325

java.util.Scanner readMore() expandBuffer() OutOfMemoryError


      ADDITIONAL SYSTEM INFORMATION :

      A DESCRIPTION OF THE PROBLEM :
      The problem occurs when using
      import java.util.Scanner;
      to scan large files with
      public String findWithinHorizon(Pattern pattern, int horizon)
      The call fails with an OutOfMemoryError:

      --------- beginning of crash
      E/AndroidRuntime: FATAL EXCEPTION: Thread-310
          Process: rx200.map_bd, PID: 6343
          java.lang.OutOfMemoryError: Failed to allocate a 134217740 byte allocation with 4194304 free bytes and 127MB until OOM
              ...Scanner.expandBuffer(Scanner.java:...)
              ...Scanner.readMore(Scanner.java:...)
              ...Scanner.findWithinHorizon(Scanner.java:...)
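
      For reference, the call has the following shape. This is only a minimal sketch: a short in-memory string stands in for the large file, and the class name FindWithinHorizonDemo and the helper search() are hypothetical. The report does not say which horizon was used. The Java SE documentation for findWithinHorizon notes that a horizon of 0 means the search is unbounded and may buffer all of the input.

```java
import java.io.StringReader;
import java.util.Scanner;
import java.util.regex.Pattern;

public class FindWithinHorizonDemo {

    // Hypothetical helper: runs one findWithinHorizon() call over the given text.
    static String search(String text, String regex, int horizon) {
        try (Scanner scanner = new Scanner(new StringReader(text))) {
            return scanner.findWithinHorizon(Pattern.compile(regex), horizon);
        }
    }

    public static void main(String[] args) {
        String text = "header id=123 trailer";

        // horizon 0: search the whole input, however long it is.
        System.out.println(search(text, "id=\\d+", 0));

        // horizon 5: only the first 5 characters are searched, so nothing is found.
        System.out.println(search(text, "id=\\d+", 5));
    }
}
```

      With an unbounded horizon and a pattern that matches late or never, the scanner has to keep everything it has read so far, which is where the buffer growth comes from.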

      Searching the web, I found that an OutOfMemoryError associated with Scanner is quite common. Here are some examples:
      https://stackoverflow.com/questions/24978654/outofmemoryerror-when-reading-more-than-one-asset
      https://stackoverflow.com/questions/8094745/android-converting-an-xml-from-the-raw-folder-to-string
      https://stackoverflow.com/questions/26646308/issus-in-memory-managemant-with-json-object

      The error lies not so much in findWithinHorizon as in:
      private void readMore() and private void expandBuffer()

      When scanning large files, the expandBuffer() method repeatedly doubles the capacity of
      private CharBuffer buffer
      Before switching to the larger buffer, it executes
      char[] newBuffer = new char[newCapacity];
      which leads to the error.
      Here is the code of the readMore() and expandBuffer() methods:

      private void readMore() {
          int oldPosition = buffer.position();
          int oldBufferLength = bufferLength;
          if (bufferLength >= buffer.capacity()) {
              expandBuffer();
          }
          int readCount = 0;
          try {
              buffer.limit(buffer.capacity());
              buffer.position(oldBufferLength);
              while ((readCount = input.read(buffer)) == 0) {
              }
          } catch (IOException e) {
              bufferLength = buffer.position();
              readCount = -1;
              lastIOException = e;
          }
          buffer.flip();
          buffer.position(oldPosition);
          if (readCount == -1) {
              inputExhausted = true;
          } else {
              bufferLength = readCount + bufferLength;
          }
      }

      private void expandBuffer() {
          int oldPosition = buffer.position();
          int oldCapacity = buffer.capacity();
          int oldLimit = buffer.limit();
          int newCapacity = oldCapacity * 2;
          char[] newBuffer = new char[newCapacity];
          System.arraycopy(buffer.array(), 0, newBuffer, 0, oldLimit);
          buffer = CharBuffer.wrap(newBuffer, 0, newCapacity);
          buffer.position(oldPosition);
          buffer.limit(oldLimit);
      }
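
      The doubling can be checked against the numbers in the crash log. The sketch below assumes the buffer starts at 1024 chars (an assumption, the report does not state the initial size) and that the 12 bytes beyond the char payload are the array object header (also an assumption). The class name ScannerBufferGrowth and its methods are hypothetical.

```java
public class ScannerBufferGrowth {

    // Assumption: initial buffer capacity in chars; not stated in the report.
    static final int INITIAL_CAPACITY = 1024;

    /** Number of doublings needed to grow from INITIAL_CAPACITY to exactly targetChars, or -1. */
    static int doublingsToReach(long targetChars) {
        long capacity = INITIAL_CAPACITY;
        int doublings = 0;
        while (capacity < targetChars) {
            capacity *= 2;
            doublings++;
        }
        return capacity == targetChars ? doublings : -1;
    }

    /** Size in bytes of the char data of a char[] of the given length (header excluded). */
    static long payloadBytes(long chars) {
        return chars * Character.BYTES;
    }

    /** Char data alive at once during expandBuffer(): the old array plus the new, doubled one. */
    static long peakPayloadBytes(long oldCapacityChars) {
        return payloadBytes(oldCapacityChars) + payloadBytes(oldCapacityChars * 2);
    }

    public static void main(String[] args) {
        long failedAllocation = 134_217_740L; // from the crash log
        long newCapacity = 1L << 26;          // 67,108,864 chars

        System.out.println(payloadBytes(newCapacity));                    // 134217728
        System.out.println(failedAllocation - payloadBytes(newCapacity)); // 12
        System.out.println(doublingsToReach(newCapacity));                // 16
        System.out.println(peakPayloadBytes(newCapacity / 2));            // 201326592
    }
}
```

      Under these assumptions the failed request is a char[] of 2^26 elements, reached after 16 doublings. Because the old array must stay alive until System.arraycopy has finished, the expansion briefly needs about 192 MiB of char data, while the log reports only 127MB available.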

      One possible way to solve this problem:

      // Fields instead of locals, so that expandBuffer() can reset them.
      private int oldPosition, oldBufferLength;

      private void readMore() {
          oldPosition = buffer.position();
          oldBufferLength = bufferLength;
          if (bufferLength >= buffer.capacity()) {
              expandBuffer();
          }
          int readCount = 0;
          try {
              buffer.limit(buffer.capacity());
              buffer.position(oldBufferLength);
              while ((readCount = input.read(buffer)) == 0) {
              }
          } catch (IOException e) {
              bufferLength = buffer.position();
              readCount = -1;
              lastIOException = e;
          }
          buffer.flip();
          buffer.position(oldPosition);
          if (readCount == -1) {
              inputExhausted = true;
          } else {
              bufferLength = readCount + bufferLength;
          }
      }

      private void expandBuffer() {
          int oldPosition = buffer.position();
          int oldCapacity = buffer.capacity();
          int oldLimit = buffer.limit();
          int newCapacity = oldCapacity * 2;
          try {
              char[] newBuffer = new char[newCapacity];
              System.arraycopy(buffer.array(), 0, newBuffer, 0, oldLimit);
              buffer = CharBuffer.wrap(newBuffer, 0, newCapacity);
              buffer.position(oldPosition);
              buffer.limit(oldLimit);
          } catch (OutOfMemoryError e) {
              // Doubling failed: fall back to a small buffer that keeps
              // only the last `offset` chars of the old one.
              int offset = 1024;
              int newSize = 2048;
              this.oldPosition = 0;
              oldBufferLength = offset;
              char[] newBuffer = new char[offset];
              System.arraycopy(buffer.array(), buffer.capacity() - offset, newBuffer, 0, offset);
              bufferLength = offset;
              buffer = CharBuffer.allocate(newSize);
              buffer.position(0);
              buffer.limit(offset);
              buffer.put(newBuffer);
              findStartIndex = findStartIndex % offset;
          }
      }

      P.S.
      I know this option is not optimal. For one thing, if a match for the pattern in findWithinHorizon(Pattern pattern, int horizon) spans more than offset = 1024 characters, the scan will fail. In general it is also wasteful to shrink the buffer this much every time. But it turned out to be enough for my tasks, so I am sharing it with you.
      It seems better to me to grow the buffer until an OutOfMemoryError occurs, and from then on to reuse it by copying the end of the text to the beginning of the buffer and shifting the limit by the same distance.
      In that case I think it is better to drop the local "char[] newBuffer = new char[offset];", since it needs memory that may no longer be available later if other parts of the program take it, and to use only buffer itself.
      Likewise, if part of the text has already matched the pattern, copy only what comes after the last match.
      But if both the text and a possible match for the pattern are very large, there may be no solution at all with the Scanner class.
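
      The idea of reusing one fixed buffer and copying the end of the text to its beginning can be sketched outside of Scanner as follows. This is not a patch for Scanner, only an illustration under stated assumptions: the class BoundedWindowSearch, the method findFirst() and the parameters windowSize and overlap are hypothetical names. Matches longer than overlap chars that straddle a window boundary can be missed, and a pattern that could match more text beyond the window end may return a truncated match.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundedWindowSearch {

    /**
     * Returns the first match found while sliding a fixed-size window over the
     * input, or null if no window contains a match. Memory use stays at
     * windowSize chars regardless of the input length.
     */
    static String findFirst(Reader input, Pattern pattern, int windowSize, int overlap)
            throws IOException {
        if (overlap < 0 || overlap >= windowSize) {
            throw new IllegalArgumentException("overlap must be in [0, windowSize)");
        }
        char[] window = new char[windowSize];
        int length = 0; // number of valid chars currently in the window
        while (true) {
            // Fill the free part of the window.
            boolean endOfInput = false;
            while (length < windowSize) {
                int read = input.read(window, length, windowSize - length);
                if (read == -1) {
                    endOfInput = true;
                    break;
                }
                length += read;
            }
            // Search the valid part of the window without copying it.
            Matcher matcher = pattern.matcher(CharBuffer.wrap(window, 0, length));
            if (matcher.find()) {
                return matcher.group();
            }
            if (endOfInput) {
                return null;
            }
            // Slide: move the last `overlap` chars to the front of the same
            // array, so no second buffer is allocated.
            System.arraycopy(window, length - overlap, window, 0, overlap);
            length = overlap;
        }
    }

    // Helper for the demo: a string of `count` copies of `c`.
    static String repeat(char c, int count) {
        char[] chars = new char[count];
        java.util.Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) throws IOException {
        String text = repeat('a', 5000) + "needle42" + repeat('b', 5000);
        Pattern pattern = Pattern.compile("needle\\d+");
        System.out.println(findFirst(new StringReader(text), pattern, 1024, 64));
    }
}
```

      The overlap plays the role of offset in the proposal above: it bounds the length of a match that can still be found across a window boundary, which is the limitation already noted for very large matches.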


            rgiulietti Raffaello Giulietti
            webbuggrp Webbug Group