Bug
Resolution: Cannot Reproduce
P3
None
6u16
x86
solaris_2.5.1
FULL PRODUCT VERSION :
1.6.0_16 (Windows)
MacOS Apple standard version
ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows [Version 6.1.7600]
MacOS X 10.6.2
A DESCRIPTION OF THE PROBLEM :
Our program uses regular expressions (Pattern and Matcher) to extract content from web pages. The work runs on several threads in a ThreadPoolExecutor, and the results are collected by an ExecutorCompletionService. On Mac OS X, the program failed with an out-of-memory error at a memory usage of about 150 MB. On Windows, memory usage grew without apparent limit (we stopped testing at 700 MB).
We are fairly sure that the leak comes from the Matcher. We used VisualVM to take a heap dump, which showed a growing number of char arrays during program execution. We checked the contents of these char arrays and found that they contained the source of the pages.
To confirm that the Matcher class is responsible, we added a keyword to the content string (see example). The heap dump showed this keyword in every related char array.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
The following code produces the memory leak. It runs inside a thread (managed by ThreadPoolExecutor). We use thousands of threads to scan known web pages for specific content, and we noticed that the Matcher instance is not reclaimed by the garbage collector after the method has returned (or even after the thread has finished).
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
We expected memory usage after the threads have finished to be almost equal to the memory usage before they started.
ACTUAL -
Memory is not released after the threads have finished.
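The expected-versus-actual comparison above can be made measurable with a small harness. The following is only a sketch and is not part of the submitted test case: the class name HeapDeltaSketch, the helper usedHeapAfterGc, and the char arrays standing in for downloaded pages are all hypothetical. System.gc() is only a request to the JVM, so the figures are approximate.

```java
import java.util.ArrayList;
import java.util.List;

public class HeapDeltaSketch {

    /** Returns the used heap in bytes after requesting a collection (System.gc() is only a hint). */
    static long usedHeapAfterGc() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapAfterGc();

        // Hypothetical stand-in for the work under test: 50 "pages" of 100,000 chars each.
        List<char[]> pages = new ArrayList<char[]>();
        for (int i = 0; i < 50; i++) {
            pages.add(new char[100000]);
        }
        long during = usedHeapAfterGc();

        // Drop the only reference so the arrays become eligible for collection.
        pages = null;
        long after = usedHeapAfterGc();

        System.out.println("used before: " + before + " bytes");
        System.out.println("used during: " + during + " bytes");
        System.out.println("used after:  " + after + " bytes");
    }
}
```

If the "after" figure stays close to the "during" figure once every reference has been dropped, something is still holding the data. A heap dump, as used in this report, can then show which object retains it.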
ERROR MESSAGES/STACK TRACES THAT OCCUR :
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
REPRODUCIBILITY :
This bug can always be reproduced.
---------- BEGIN SOURCE ----------
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.Scanner;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Task that downloads one page and returns every substring matching the regex.
public class LinkFinderThread implements Callable<ArrayList<String>> {
    private final String regex = "http://image\\.skins\\.be/[0-9]+/[0-9a-zA-Z\\-]+/";
    private String url;

    public LinkFinderThread(String url) {
        this.url = url;
    }

    public ArrayList<String> call() {
        ArrayList<String> result = new ArrayList<String>();
        String htmlPage = this.getHTMLDoc();
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(htmlPage);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    // Reads the whole page into one string; line terminators are dropped.
    private String getHTMLDoc() {
        StringBuilder htmlPage = new StringBuilder();
        try {
            InputStream is = new URL(this.url).openStream();
            Scanner scanner = new Scanner(is);
            while (scanner.hasNextLine()) {
                htmlPage.append(scanner.nextLine());
            }
            scanner.close();
            is.close();
        } catch (Exception e) {
            // Errors are ignored; whatever was read so far is returned.
        }
        return htmlPage.toString();
    }
}

// Fixed-size pool that runs LinkFinderThread tasks and hands back their results.
public class LinkFinderThreadPool {
    private ThreadPoolExecutor tpe;
    private CompletionService<ArrayList<String>> ecs;

    public LinkFinderThreadPool(int poolSize) {
        this.tpe = new ThreadPoolExecutor(poolSize, poolSize, 3, TimeUnit.SECONDS, new LinkedBlockingDeque<Runnable>());
        this.ecs = new ExecutorCompletionService<ArrayList<String>>(tpe);
    }

    public void addJob(String url) {
        ecs.submit(new LinkFinderThread(url));
    }

    // Blocks until the next task completes; returns null if retrieving the result fails.
    public ArrayList<String> getResults() {
        try {
            return ecs.take().get();
        } catch (Exception e) {
            // Errors are ignored; the caller sees null.
        }
        return null;
    }

    public int getActiveCount() {
        return this.tpe.getActiveCount();
    }

    public void shutdown() {
        tpe.shutdown();
    }

    public static void main(String[] args) {
        LinkFinderThreadPool pool = new LinkFinderThreadPool(10);
        final int count = 50;
        for (int i = 0; i < count; i++) {
            pool.addJob("http://forum.skins.be/9-hq-celebrity-pictures/56-jessica-alba-babe-tournament-2008-winner/");
        }
        ArrayList<String> results;
        int i = 0;
        // Collect results for as long as the pool reports active worker threads.
        while (pool.getActiveCount() > 0) {
            results = pool.getResults();
            if (results == null) continue;
            if (i % 10 == 0) {
                System.out.println(i);
            }
            i++;
        }
        pool.shutdown();
        System.out.println("done - infinity loop");
        // Keep the JVM alive so that memory usage can be inspected.
        while (true) { }
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
We don't have any workaround.
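One possible mitigation, offered as an assumption rather than a confirmed diagnosis of this report: on JDK 6, String.substring returned a string that shared the backing char array of its source, and Matcher.group() on a String input produces its result that way. A list of short matches could therefore keep each full page alive for as long as the list is referenced. The sketch below copies each match before storing it. The class name CopiedMatchesSketch and the method findLinks are hypothetical; the regex is the one from the test case. On JDK 7u6 and later, substring already copies, so the extra copy is unnecessary there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CopiedMatchesSketch {

    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern LINK =
            Pattern.compile("http://image\\.skins\\.be/[0-9]+/[0-9a-zA-Z\\-]+/");

    /** Returns all matches, each stored as an independent copy of the matched text. */
    static List<String> findLinks(String htmlPage) {
        List<String> result = new ArrayList<String>();
        Matcher matcher = LINK.matcher(htmlPage);
        while (matcher.find()) {
            // On JDK 6, new String(...) copies only the matched characters,
            // so the result no longer references the page's char array.
            result.add(new String(matcher.group()));
        }
        return result;
    }

    public static void main(String[] args) {
        String page = "<a href=\"http://image.skins.be/12/abc-def/\">x</a>";
        System.out.println(findLinks(page));
    }
}
```

Whether this changes the memory profile of the submitted test case has not been verified here. Comparing heap dumps taken with and without the copy would show whether the page-sized char arrays are still retained.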