Loading...

XML

Word

Printable

Type: Enhancement
Resolution: Unresolved
Priority: P4
Fix Version/s: tbd
Affects Version/s: 15
Component/s: core-libs
Labels:

Subcomponent:
java.sql

A DESCRIPTION OF THE PROBLEM :
The getExtraNameCharacters() method of DatabaseMetaData is currently defined as:

> Retrieves all the "extra" characters that can be used in unquoted identifier names (those beyond a-z, A-Z, 0-9 and _).

That is not exactly how identifiers work, at least not for any system that follows the Unicode specification for identifiers. According to "UnicodeÃ‚Â® Standard Annex #31: UNICODE IDENTIFIER AND PATTERN SYNTAX" ( https://www.unicode.org/reports/tr31/#Default_Identifier_Syntax ) there are two categories of characters: Start and Continue.

* "Start" characters (having the Unicode properties of id_start / xid_start) are valid in any position.
* "Continue" characters (having the Unicode properties of id_continue / xid_continue) are valid in any position _except_ the first character.

So, all "start" characters are also "continue" characters, but _not_ all "continue" characters are "start" characters. For example, in Microsoft SQL Server (and likely some others), it is valid to have decimal digits 0-9 in the name, _but_ the name cannot _start_ with a digit. In the following example, the first statement succeeds but the second statement fails:

    USE [tempdb];
    CREATE TABLE dbo.A1 (col INT);
    GO
    CREATE TABLE dbo.1A (col INT);
    GO
    /*
    Msg 102, Level 15, State 1, Line XXXXX
    Incorrect syntax near '.1'.
    */

This is an important distinction because in its current form, getExtraNameCharacters() returns potentially incorrect information. For example, in SQL Server this method ( https://github.com/microsoft/mssql-jdbc/blob/dev/src/main/java/com/microsoft/sqlserver/jdbc/SQLServerDatabaseMetaData.java ) returns:

$#@

If someone is relying on this method to get those "extra" characters, it certainly appears to be saying that "$" is valid. Well, the following example works like the first example in that the first statement succeeds while the second statement fails:

    USE [tempdb];
    CREATE TABLE dbo.Bob$ (col INT);
    GO
    CREATE TABLE dbo.$Bob (col INT);
    GO
    /*
    Msg 102, Level 15, State 1, Line XXXXX
    Incorrect syntax near '$Bob'.
    */

Hence, the getExtraNameCharacters method needs to be replaced with two methods:

* getExtraNameStartCharacters
* getExtraNameContinueCharacters

To see the full list of characters that SQL Server should be returning for each of those two methods, please see:

https://sqlquantumleap.com/reference/completely-complete-list-of-valid-t-sql-identifier-characters/

For systems that do not follow the Unicode specification and only have a single set of characters, those two functions could simply return the same value.

Take care,
Solomon...
https://SqlQuantumLift.com/
https://SqlQuantumLeap.com/
https://SQLsharp.com/

FREQUENCY : always

Assignee:: Lance Andersen

Reporter:: Webbug Group

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Created:: 2020-03-30 22:55

Updated:: 2023-03-06 11:54

Details

Description

Attachments

Activity

People

Dates