A customer uses a server engine, written in Java, to manage interactive
sessions between hundreds of clients. Each client connection takes
a thread. They need to be able to support thousands of users on their
server. CPU load testing shows that this should be possible given a
large enough server, but the JVM always crashes when the user count goes
slightly over 1000. The customer has tracked the bug down to
jdk-1.1.6-src/src/solaris/java/native_threads/src/threads_md.c,
where a parameter called MAX_LWPS is defined. When they exceed MAX_LWPS
threads in their server, the next GC causes the JVM to die. Here
is the customer's description:
-------------------------------------------
in my tests something bad (e.g. segfault or hard loop) always happens as
soon as you GC with more than 1024 threads. for example, try unlimiting
file descriptors (otherwise you'll die immediately) and then running it with
jre -verbosegc -ss2k -oss2k -nojit Threads 46738 520
(those jre options aren't necessary to provoke the bug; they just minimize
the load on the system in getting there.)
what's going on? well,
jdk-1.1.6-src/src/solaris/java/native_threads/src/threads_md.c
includes the following code:
#define MAX_LWPS 1024
static prstatus_t Mystatus;
static id_t lwpid_list_buf[MAX_LWPS];
static id_t oldlwpid_list_buf[MAX_LWPS];
static sys_thread_t *onproct_list_buf[MAX_LWPS];
(i.e. it declares a constant that sounds an awful lot like "max lightweight
processes", and then sizes a bunch of tables). more discouraging, it never
seems to check against this limit before assigning into those tables, so i
suspect the answer is that the first time you GC after you get more than
1024 LWPs you'll start smashing memory (these tables appear to be used for
suspending the LWPs during garbage collection).
------------------------------------------------
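To make the suspected failure mode concrete, here is a minimal Java sketch
of a fixed-capacity table that checks its limit before every store. It is
not code from threads_md.c; the class LwpTable and its methods are
hypothetical names used only for illustration. In the C source the
equivalent unchecked store would write past the end of the array, whereas
this sketch refuses the entry once MAX_LWPS slots are used.

```java
// Hypothetical illustration only; this is not code from threads_md.c.
final class LwpTable {
    // Mirrors the MAX_LWPS constant quoted in the customer's description.
    static final int MAX_LWPS = 1024;

    private final int[] ids = new int[MAX_LWPS];
    private int count = 0;

    // Records an id only if a free slot remains; returns false when full.
    boolean add(int id) {
        if (count >= MAX_LWPS) {
            return false; // the bounds check the customer says is missing
        }
        ids[count++] = id;
        return true;
    }

    int size() {
        return count;
    }

    public static void main(String[] args) {
        LwpTable table = new LwpTable();
        int rejected = 0;
        // Try to register more entries than the table can hold.
        for (int id = 0; id < 1100; id++) {
            if (!table.add(id)) {
                rejected++;
            }
        }
        System.out.println("stored=" + table.size() + " rejected=" + rejected);
    }
}
```

Run as written, the sketch stores 1024 entries and rejects the remaining 76
instead of overwriting adjacent memory. A real fix might grow the tables
rather than reject entries; that choice is outside what this report states.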
The customer also wrote a small test program which reproduces the problem
without having to run their server.
import java.io.*;
import java.net.*;

public class Threads extends Thread {
    private static int port;
    private static InetAddress localHost;
    private Socket s;

    private Threads (Socket s) {
        this.s = s;
    }

    public void run () {
        try {
            if (s == null)
                s = new Socket(localHost, port);
            byte[] buf = new byte[4];
            s.getInputStream().read(buf, 0, buf.length);
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.err.print("!");
    }

    public static void main (String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("use: threads <port> <gcCount>");
            System.exit(1);
        }
        port = Integer.parseInt(args[0]);
        int gcCount = Integer.parseInt(args[1]);
        localHost = InetAddress.getLocalHost();
        ServerSocket ss = new ServerSocket(port);
        for (int i = 0; ; ++i) {
            new Threads(null).start();
            Socket s = ss.accept();
            new Threads(s).start();
            System.err.print("(" + i + ")");
            System.err.flush();
            Thread.sleep(5);
            if (((i + 1) % gcCount) == 0)
                System.gc();
        }
    }
}
steve.fritzinger@East 1998-08-24
============================================================================
I got a libthread panic running the test program on build H (transcript
below), but this did NOT happen on build I. (Build I still ends up with a
SocketException (Too many open files).)
Maybe this was fixed in build I?
%/usr/local/java/jdk1.3bak/solaris/bin/java -version
java version "1.3.0"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0-H)
Java HotSpot (TM) Client VM (build 1.3-H, interpreted mode)
%/usr/local/java/jdk1.3bak/solaris/bin/java -cp . Threads 9999 520
(0)(1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12)(13)(14)(15)(16)(17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29)(30)(31)(32)(33)(34)(35)(36)(37)(38)(39)(40)(41)(42)(43)(44)(45)(46)(47)(48)(49)(50)(51)(52)(53)(54)(55)(56)(57)(58)(59)(60)(61)(62)(63)(64)(65)(66)(67)(68)(69)(70)(71)(72)(73)(74)(75)(76)(77)(78)(79)(80)(81)(82)(83)(84)(85)(86)(87)(88)(89)(90)(91)(92)(93)(94)(95)(96)(97)(98)(99)(100)(101)(102)(103)(104)(105)(106)(107)(108)(109)(110)(111)(112)(113)(114)(115)(116)(117)(118)(119)(120)(121)(122)(123)(124)(125)(126)(127)(128)(129)(130)(131)(132)(133)(134)(135)(136)(137)(138)(139)(140)(141)(142)(143)(144)(145)(146)(147)(148)(149)(150)(151)(152)(153)(154)(155)(156)(157)(158)(159)(160)(161)(162)(163)(164)(165)(166)(167)(168)(169)(170)(171)(172)(173)(174)(175)(176)(177)(178)(179)(180)(181)(182)(183)(184)(185)(186)(187)(188)(189)(190)(191)(192)(193)(194)(195)(196)(197)(198)(199)(200)(201)(202)(203)(204)(205)(206)(207)(208)(209)(210)(211)(212)(213)(214)(215)(216)(217)(218)(219)(220)(221)(222)(223)(224)(225)(226)(227)(228)(229)(230)(231)(232)(233)(234)(235)(236)(237)(238)(239)(240)(241)(242)(243)(244)(245)(246)(247)(248)(249)(250)(251)(252)(253)(254)(255)(256)(257)(258)(259)(260)(261)(262)(263)(264)(265)(266)(267)(268)(269)(270)(271)(272)(273)(274)(275)(276)(277)(278)(279)(280)(281)(282)(283)(284)(285)(286)(287)(288)(289)(290)(291)(292)(293)(294)(295)(296)(297)(298)(299)(300)(301)(302)(303)(304)(305)(306)(307)(308)(309)(310)(311)(312)(313)(314)(315)(316)(317)(318)(319)(320)(321)(322)(323)(324)(325)(326)(327)(328)(329)(330)(331)(332)(333)(334)(335)(336)(337)(338)(339)(340)(341)(342)(343)(344)(345)(346)(347)(348)(349)(350)(351)(352)(353)(354)(355)(356)(357)(358)(359)(360)(361)(362)(363)(364)(365)(366)(367)(368)(369)(370)(371)(372)(373)(374)(375)(376)(377)(378)(379)(380)(381)(382)(383)(384)(385)(386)(387)(388)(389)(390)(391)(392)(393)(394)(395)(396)(397)(398)(399)(400)(401)(402)(403)(404)(405)(406)(407)(408)(409)(410)(411)(412)(413)(414)(415)(416)(417)(418)(419)(420)(421)
(422)(423)(424)(425)(426)(427)(428)(429)(430)(431)(432)(433)(434)(435)(436)(437)(438)(439)(440)(441)(442)(443)(444)(445)(446)(447)(448)(449)(450)(451)(452)(453)(454)(455)(456)(457)(458)(459)(460)(461)(462)(463)(464)(465)(466)(467)(468)(469)(470)(471)(472)(473)(474)(475)(476)(477)(478)(479)(480)(481)(482)(483)(484)(485)(486)(487)(488)(489)(490)(491)(492)(493)(494)(495)(496)(497)(498)(499)(500)(501)(502)(503)(504)(505)(506)(507)libthread panic: cannot create new lwp (PID: 21935 LWP 2)
stacktrace:
ef76ec70
0
Exception in thread "main" java.net.SocketException: Too many open files
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:417)
at java.net.ServerSocket.implAccept(ServerSocket.java:245)
at java.net.ServerSocket.accept(ServerSocket.java:226)
at Threads.main(Threads.java:38)
java.net.SocketException: Too many open files
at java.net.PlainSocketImpl.socketCreate(Native Method)
at java.net.PlainSocketImpl.create(PlainSocketImpl.java:74)
at java.net.Socket.<init>(Socket.java:270)
at java.net.Socket.<init>(Socket.java:131)
at Threads.run(Threads.java:18)
!^C%
david.bowen@Eng 1999-09-29
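For reading the transcript above: each pass through the test program's main
loop starts two threads and opens two sockets (one connecting client, one
accepted connection), and none of them ever finish because nothing is ever
written to either end. The sketch below just does that arithmetic. The class
and method names are hypothetical, and any link to a specific descriptor or
LWP limit on the test machine is an assumption, since the transcript does
not state the limits in effect.

```java
// Back-of-the-envelope resource count for the Threads test program's loop.
// Names are hypothetical; this only restates what the loop body does.
final class LoopResources {
    // Each iteration starts one client thread and one thread for the
    // accepted socket; both block in read() and never exit.
    static int threadsAfter(int iterations) {
        return 2 * iterations;
    }

    // Each iteration opens one client socket and one accepted socket,
    // and neither is ever closed.
    static int socketsAfter(int iterations) {
        return 2 * iterations;
    }

    public static void main(String[] args) {
        // The last counter printed in the build H transcript is (507),
        // which means 508 iterations had completed.
        int iterations = 508;
        System.out.println("threads=" + threadsAfter(iterations)
                + " sockets=" + socketsAfter(iterations));
    }
}
```

At 508 completed iterations that is 1016 threads and 1016 open sockets, not
counting the main thread, the server socket, or the standard streams. Both
figures sit just under 1024, so this run may have hit a descriptor or LWP
limit before it ever reached the first GC at iteration 520.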
Relates to:
JDK-4335867 jdk1.2.2 RI crashes when the number of lwps is greater than 1024 (Closed)