Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-4806753

Using alternating charsets with String(byte[]) and String.getBytes is very slow

XMLWordPrintable

    • 1.4.1
    • 01
    • x86
    • windows_xp



        Name: nt126004 Date: 01/22/2003


        FULL PRODUCT VERSION :
        New slow version:
        java version "1.4.1_01"
        Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-b01)
        Java HotSpot(TM) Client VM (build 1.4.1_01-b01, mixed mode)

          Old fast version:
        java version "1.4.0"
        Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-b92)
        Java HotSpot(TM) Client VM (build 1.4.0-b92, mixed mode)

        FULL OPERATING SYSTEM VERSION :
        Microsoft Windows XP [Version 5.1.2600]

        EXTRA RELEVANT SYSTEM CONFIGURATION :
        Japanese System

        A DESCRIPTION OF THE PROBLEM :
        We just upgraded our JVM from version 1.4.0-b92 to
        1.4.1_01-b01 andencountered a severe performance degradation of our JSP
        application.
        Database Queries that had been taking around 200ms are now
        taking inthe range of 20-40 seconds! Making this JVM unusable for us.

        After doing quite a bit of debugging and digging into the
        source code, I believe the problem is being caused by the way
        java.lang.StringCoding is caching StringDecoder instances.

        We are using the mysql-connector-j-2.0.14-bin.jar driver to
        connect to a MySQL database. The records that are suffering the severe
        performance degradation contain both boolean and float values.

        Digging into the code I would say that the problem is being
        caused by the alternating calls to the StringCoding.decode where the
        charsetName is alternating between the value (Shift_JIS) set on our DB
        connection and the system default (MS932). The way that class is written,
        it does a very expensive lookup and then caches the StringDecoder for
        each thread until the required decoder changes. When the charset
        alternates. This results in an expensive lookup for every single call.

        I have not looked into how it was being done before.
        Looking around, I think this might be related to bug #4691554

        We are able to work around this on Japanese machines by
        setting the DB connection char set to MS932 so that it matches the system
        default. But this will not work for systems where the default is different.

        REGRESSION. Last worked in version 1.4

        STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
          To run the full example using the connection to the database
        1. java -classpath
        build/classes;lib/mysql-connector-j-2.0.14-bin.jar SlowMySQLTest
        2. On 1.4.0, you will see the following:
        ---
        Creating test table
        Creating test data...
        Done
        Test encoding: Shift_JIS
        Time = 611ms
        Test encoding: MS932
        Time = 170ms
        Test encoding: Shift_JIS
        Time = 100ms
        Test encoding: MS932
        Time = 131ms
        ---
        On 1.4.1 you will see the following:
        ---
        Creating test table
        Creating test data...
        Done
        Test encoding: Shift_JIS
        Time = 29001ms
        Test encoding: MS932
        Time = 250ms
        Test encoding: Shift_JIS
        Time = 23033ms
        Test encoding: MS932
        Time = 180ms
        ---

        You may also want to skip strait to the simplified example.
         1. java -classpath build/classes SlowTest
        2. On 1.4.0, you will see the following:
        ---
        Test encoding: Shift_JIS
        Time = 550ms
        Test encoding: MS932
        Time = 81ms
        Test encoding: Shift_JIS
        Time = 120ms
        Test encoding: MS932
        Time = 90ms
        ---
        On 1.4.1 you will see the following:
        ---
        Test encoding: Shift_JIS
        Time = 28781ms
        Test encoding: MS932
        Time = 110ms
        Test encoding: Shift_JIS
        Time = 23374ms
        Test encoding: MS932
        Time = 90ms
        ---


        REPRODUCIBILITY :
        This bug can be reproduced always.

        ---------- BEGIN SOURCE ----------
        SlowMySQLTest.java
        -------------------------------------------------
        import java.sql.Connection;
        import java.sql.Driver;
        import java.sql.ResultSet;
        import java.sql.SQLException;
        import java.sql.Statement;
        import java.util.Properties;

        public class SlowMySQLTest {
        public double getDouble( String str )
        {
        byte[] buf = str.getBytes();

        return Double.parseDouble( new String( buf ) );
        }

        /**
        * Create the test table and data.
        */
        public void setup()
        {
        try
        {
        System.out.println( "Creating test table" );
        Properties props = new Properties();
        Driver d = new com.mysql.jdbc.Driver();
        Connection c = d.connect(
        "jdbc:mysql://localhost/test?useUnicode=true&characterEncoding=Shift_JIS",
        props );
        Statement stmt = c.createStatement();
        stmt.executeUpdate( "DROP TABLE slow_test" );

        stmt.executeUpdate(
        "CREATE TABLE slow_test ( "
        + "id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, "
        + "flt FLOAT NOT NULL, flag CHAR(1) NOT NULL )" );

        System.out.println( "Creating test data..." );
        for ( int i = 0; i < 2000; i++ )
        {
        stmt.executeUpdate( "INSERT INTO slow_test VALUES ( NULL, " + i + ", 'y')" );
        }
        System.out.println( "Done" );

        stmt.close();
        c.close();
        }
        catch ( SQLException e )
        {
        e.printStackTrace();
        }
        }

        public void test( String encoding )
        {
        System.out.println( "Test encoding: " + encoding );
        long start = System.currentTimeMillis();

        try
        {
        Properties props = new Properties();
        Driver d = new com.mysql.jdbc.Driver();
        Connection c = d.connect(
        "jdbc:mysql://localhost/test?useUnicode=true&characterEncoding="
        + encoding, props );
        Statement stmt = c.createStatement();
        ResultSet rs = stmt.executeQuery( "SELECT id, flt, flag FROM slow_test" );
        while ( rs.next() )
        {
        int id = rs.getInt( 1 );
        float flt = rs.getFloat( 2 );
        boolean flag = rs.getBoolean( 3 );
        }

        stmt.close();
        c.close();
        }
        catch ( SQLException e )
        {
        e.printStackTrace();
        }

        System.out.println( "Time = " + ( System.currentTimeMillis() - start ) + "ms" );
        }

        public static void main( String[] args ) {
        SlowMySQLTest slowTest = new SlowMySQLTest();
        slowTest.setup();
        slowTest.test( "Shift_JIS" );
        slowTest.test( "MS932" );
        slowTest.test( "Shift_JIS" );
        slowTest.test( "MS932" );
        }
        }
        -------------------------------------------------

        SlowTest.java
        -------------------------------------------------
        public class SlowTest {
        public double getDouble( String str )
        {
        byte[] buf = str.getBytes();

        return Double.parseDouble( new String( buf ) );
        }


        /*
        * The contents of this method are result of narrowing down the
        * the base cause of the code in SlowMySQLTest.test().
        */
        public void test( String encoding )
        {
        System.out.println( "Test encoding: " + encoding );
        long start = System.currentTimeMillis();

        for ( int i = 0; i < 2000; i++ ) {
        // rs.getFloat( 2 );
        double d = Double.parseDouble( new String( "1".getBytes() ) );

        // rs.getBoolean( 3 );
        try
        {
        new String( "true".getBytes(), encoding );
        }
        catch (java.io.UnsupportedEncodingException e)
        {
        e.printStackTrace();
        }
        }

        System.out.println( "Time = " + ( System.currentTimeMillis() - start ) + "ms" );
        }

        public static void main( String[] args ) {
        SlowTest slowTest = new SlowTest();
        slowTest.test( "Shift_JIS" );
        slowTest.test( "MS932" );
        slowTest.test( "Shift_JIS" );
        slowTest.test( "MS932" );
        }
        }
        -------------------------------------------------

        ---------- END SOURCE ----------

        CUSTOMER WORKAROUND :
        If we make a call to
        sun.io.Converters.getDefaultEncodingName() on our machine
        then the current encoding returns as "MS932". The
        configuration for the database connection normally uses
        "Shift_JIS". By replacing this value so it matches the
        default value, we are able to work around the problem.

        The problem is that this ONLY works on Japanese machines.
        If the defaule encoding returned is not MS932 then this
        workaround will no longer work.

        Unfortunately, we will have to stay with the older version
        of the JVM for our customers until this has been resolved.

        Release Regression From : 1.4
        The above release value was the last known release where this
        bug was known to work. Since then there has been a regression.

        (Review ID: 180029)
        ======================================================================

              martin Martin Buchholz
              nthompsosunw Nathanael Thompson (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

                Created:
                Updated:
                Resolved:
                Imported:
                Indexed: