Bug
Resolution: Not an Issue
P3
7u21
javac 1.7.0_21 Ubuntu 12.04.2
You can reproduce this use case with:
git clone https://github.com/mperdikeas/jaxp-validation-space-around-token.git && cd jaxp-validation-space-around-token && ant
The code executes and validates the a.xml instance document although, given the A.xsd grammar, it should have complained about the extra whitespace.
This was originally reported to JAXB's bugtracker:
https://java.net/jira/browse/JAXB-964
But the engineer responsible suggested that I report it to JAXP instead.
The crux of the matter: I have a type defined as a token in XSD, and in an instance document an enumerated value appears with whitespace around it. The validation doesn't complain, although it should:
From XML Schema Part 2: Datatypes Second Edition, section 3.3.2, token:
[Definition:] token represents tokenized strings. The ·value space· of token is the set of strings that do not contain the carriage return (#xD), line feed (#xA) nor tab (#x9) characters, that have no leading or trailing spaces (#x20) and that have no internal sequences of two or more spaces.
Files follow:
A.xsd
<xs:schema targetNamespace="foo://a"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns="foo://a">
<xs:element name="type" type="Type"/>
<xs:simpleType name="Type">
<xs:restriction base="xs:token">
<xs:enumeration value="Archive"/>
<xs:enumeration value="Organisation"/>
</xs:restriction>
</xs:simpleType>
</xs:schema>
a.xml
<a:type xmlns:a="foo://a" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="foo://a A.xsd"
>Organisation </a:type>
(notice the space after 'Organisation')
Java validating code
import java.io.ByteArrayInputStream;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.XMLConstants;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class FooMain {
    public static void main(String[] args) throws Exception {
        // Compile A.xsd into a Schema using the W3C XML Schema factory.
        SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(new StreamSource(new File("A.xsd")));
        Validator validator = schema.newValidator();

        // Hand the raw bytes of a.xml to the parser, which detects the encoding itself.
        SAXSource source = new SAXSource(new InputSource(
                new ByteArrayInputStream(Files.readAllBytes(Paths.get("a.xml")))));
        try {
            validator.validate(source);
            System.out.println("validates");
        } catch (SAXException e) {
            System.out.println("doesn't validate");
        }
    }
}
Comments:
Franta Mejta added a comment - 04/Oct/13 5:51 AM
The value of "Organisation " is completely ok for xsd:token. See http://www.xmlplease.com/normalized.
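One way to see why the trailing space is accepted: xs:token has its whiteSpace facet fixed to collapse, so the literal in the document is normalized before it is compared with the enumeration values. Below is a minimal Java sketch of that normalization. The class name WhitespaceCollapse and the method collapse are hypothetical names chosen for illustration, and the code only approximates what a schema processor does; it is not the JAXP implementation.

```java
public class WhitespaceCollapse {
    // Approximates the XSD whiteSpace="collapse" rule: tab, LF and CR become
    // spaces, runs of spaces shrink to one, and a leading or trailing space
    // is dropped. (Hypothetical helper, not part of any JAXP API.)
    static String collapse(String lexical) {
        String replaced = lexical.replaceAll("[\\t\\n\\r]", " ");
        String squeezed = replaced.replaceAll(" {2,}", " ");
        // After squeezing, at most one space remains at either end.
        return squeezed.replaceAll("\\A | \\z", "");
    }

    public static void main(String[] args) {
        // Brackets make any remaining whitespace visible.
        System.out.println("[" + collapse("Organisation ") + "]"); // [Organisation]
        System.out.println("[" + collapse(" + ") + "]");           // [+]
        System.out.println("[" + collapse(" ") + "]");             // []
    }
}
```

Under this rule, "Organisation " and "Organisation" normalize to the same value, which is consistent with the validator accepting a.xml.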
ankur1986 added a comment - 15/Jul/15 6:04 AM
Hi, just came across this while encountering a similar issue.
javax.xml.validation.Validator does not complain when an enumeration value of a single space in the schema is matched against an empty ('blank') value in the response.
<xs:simpleType name="sign">
<xs:restriction base="xs:token">
<xs:enumeration value=" " />
<xs:enumeration value="+ " />
<xs:enumeration value="-" />
</xs:restriction>
</xs:simpleType>
If the XML response contains an attribute modeled as sign, as below:
<tag attribute_sign="">
we would expect the validation to fail, as 'blank' is not in the list of valid values.
Further, if the schema is changed to have " + " (with added spaces) and the XML response contains
<tag attribute_sign="+">
this still validates.
Can we switch to a stricter validator implementation, or is there an option to make this fail?
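One possible way to get the stricter behaviour without changing the validator is to base the enumeration on xs:string, whose whiteSpace facet is preserve, so the literal is compared as written. The sketch below assumes that changing the schema is acceptable; the class StrictEnumCheck and its helpers schemaFor and isValid are hypothetical names, and the inline schema is a copy of A.xsd with only the base type made variable.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class StrictEnumCheck {
    // Builds a schema equivalent to A.xsd, with the restriction base as a parameter.
    static String schemaFor(String base) {
        return "<xs:schema targetNamespace='foo://a' xmlns='foo://a'"
             + " xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
             + "<xs:element name='type' type='Type'/>"
             + "<xs:simpleType name='Type'><xs:restriction base='" + base + "'>"
             + "<xs:enumeration value='Archive'/>"
             + "<xs:enumeration value='Organisation'/>"
             + "</xs:restriction></xs:simpleType></xs:schema>";
    }

    // Returns true if the document validates against the schema built for 'base'.
    static boolean isValid(String base, String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Validator validator = factory
                .newSchema(new StreamSource(new StringReader(schemaFor(base))))
                .newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, validation errors surface as exceptions.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Same content as a.xml: note the trailing space.
        String xml = "<a:type xmlns:a='foo://a'>Organisation </a:type>";
        System.out.println("xs:token:  " + isValid("xs:token", xml));
        System.out.println("xs:string: " + isValid("xs:string", xml));
    }
}
```

With the xs:token base the document should be accepted, as in the original report, while the xs:string base should reject it because "Organisation " is then compared literally. The trade-off is that xs:string also rejects harmless whitespace, such as a line break inside the element.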