
How can I convert an xsd:pattern to a Java regex?

Tags:

java

regex

xsd

I have used Java regex very little. Is there a method (or tool) to convert an xsd:pattern restriction into a Java regex?

My xsd:pattern is as follows:

<xsd:simpleType name="myCodex">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="[A-Za-z]{6}[0-9]{2}[A-Za-z]{1}[0-9]{2}[A-Za-z]{1}[0-9A-Za-z]{3}[A-Za-z]{1}" />
    <xsd:pattern value="[A-Za-z]{6}[0-9LMNPQRSTUV]{2}[A-Za-z]{1}[0-9LMNPQRSTUV]{2}[A-Za-z]{1}[0-9LMNPQRSTUV]{3}[A-Za-z]{1}" />
    <xsd:pattern value="[0-9]{11,11}" />
  </xsd:restriction>
</xsd:simpleType>
user3065205 asked Nov 02 '22 01:11



1 Answer

You can load the XSD into Java and extract the expressions. Then you can use them in .matches() methods or create Pattern objects if you are going to reuse them a lot.

First you need to load the XML into a Java program (I called it CodexSchema.xsd):

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document source = builder.parse(new File("CodexSchema.xsd"));

Then you can use XPath to find the patterns you want to extract (you might want to create a method that takes the name of the simple type, if you have many to process). I used a more complicated XPath expression to avoid registering the namespaces:

XPathFactory xPathfactory = XPathFactory.newInstance();
String typeName = "myCodex";
String xPathRoot = "//*[local-name()='simpleType'][@name='"+typeName+"']/*[local-name()='restriction']/*[local-name()='pattern']";
XPath patternsXPath = xPathfactory.newXPath(); // the XPath evaluator that will run the expression above

Running that expression gives you an org.w3c.dom.NodeList containing the <xsd:pattern> elements:

NodeList patternNodes = (NodeList)patternsXPath.evaluate(xPathRoot, source, XPathConstants.NODESET);

Now you can loop through them and extract the contents of their value attribute. You might want to write a method for that:

public List<Pattern> getPatterns(NodeList patternNodes) {
    List<Pattern> expressions = new ArrayList<>();
    for(int i = 0; i < patternNodes.getLength(); i++) {
        Element patternNode = (Element)patternNodes.item(i);
        String regex = patternNode.getAttribute("value");
        expressions.add(Pattern.compile(regex));
    }
    return expressions;
}

You don't strictly need to compile them into Pattern objects. You could keep them as plain Strings and pass them to String.matches().
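To illustrate the difference, here is a small sketch using the third pattern from the question. The class and method names are made up for this example. String.matches() compiles the regex on every call, while a Pattern is compiled once and reused, so the two give the same answer and differ only in cost.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, only for comparing the two approaches.
class StringVsPattern {
    // Third xsd:pattern value from the question, copied verbatim.
    static final String REGEX = "[0-9]{11,11}";

    // Compiled once and reused for every check.
    static final Pattern COMPILED = Pattern.compile(REGEX);

    // String.matches() recompiles REGEX on each call.
    static boolean viaString(String value) {
        return value.matches(REGEX);
    }

    // Reuses the precompiled Pattern.
    static boolean viaPattern(String value) {
        return COMPILED.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(viaString("47385628403"));  // prints true
        System.out.println(viaPattern("47385628403")); // prints true
        System.out.println(viaPattern("4738562840"));  // prints false (only 10 digits)
    }
}
```

For a handful of validations the String form is fine. If you validate many values against the same expression, the precompiled Pattern avoids repeated compilation.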

You can now read all your patterns in Java using:

for(Pattern p : getPatterns(patternNodes)) {
    System.out.println(p);
}
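Putting the steps together, a self-contained sketch could look like the following. It is only an illustration under a few assumptions: the schema is read from an in-memory string instead of CodexSchema.xsd, the SCHEMA constant wraps the type from the question in a minimal <xsd:schema> root that I added, and the class and method names are invented. It returns the raw value attributes as Strings.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name; not part of any library.
class XsdPatternExtractor {

    // Assumption: the simple type from the question inside a minimal schema root.
    // Single quotes are used for the XML attributes to avoid escaping in Java.
    static final String SCHEMA =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + "<xsd:simpleType name='myCodex'>"
        + "<xsd:restriction base='xsd:string'>"
        + "<xsd:pattern value='[A-Za-z]{6}[0-9]{2}[A-Za-z]{1}[0-9]{2}[A-Za-z]{1}[0-9A-Za-z]{3}[A-Za-z]{1}'/>"
        + "<xsd:pattern value='[A-Za-z]{6}[0-9LMNPQRSTUV]{2}[A-Za-z]{1}[0-9LMNPQRSTUV]{2}[A-Za-z]{1}[0-9LMNPQRSTUV]{3}[A-Za-z]{1}'/>"
        + "<xsd:pattern value='[0-9]{11,11}'/>"
        + "</xsd:restriction>"
        + "</xsd:simpleType>"
        + "</xsd:schema>";

    // Returns the value attribute of every pattern facet of the named simple type.
    static List<String> extractPatternValues(String xsd, String typeName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document source = builder.parse(new InputSource(new StringReader(xsd)));

            // local-name() avoids registering a namespace context.
            // typeName is concatenated as-is, so it should come from trusted input.
            String expression = "//*[local-name()='simpleType'][@name='" + typeName + "']"
                    + "/*[local-name()='restriction']/*[local-name()='pattern']";
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expression, source, XPathConstants.NODESET);

            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(((Element) nodes.item(i)).getAttribute("value"));
            }
            return values;
        } catch (ParserConfigurationException | SAXException
                | IOException | XPathExpressionException e) {
            // Wrap the checked exceptions so callers are not forced to handle them.
            throw new IllegalStateException("Could not read patterns from schema", e);
        }
    }

    public static void main(String[] args) {
        for (String value : extractPatternValues(SCHEMA, "myCodex")) {
            System.out.println(value);
        }
    }
}
```

To read from a file as in the snippets above, you would pass a File to builder.parse() instead of the InputSource.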

Here are some tests with the third pattern:

Pattern pattern3 = getPatterns(patternNodes).get(2);

Matcher matcher = pattern3.matcher("47385628403");
System.out.println("test1: " + matcher.matches());  // prints `test1: true`; matches() tests the whole string, as XSD does

System.out.println("test2: " + "47385628403".matches(pattern3.toString()));  // prints `test2: true`
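
One caveat when testing: an xsd:pattern has to match the entire value, and several xsd:pattern facets in the same restriction are alternatives, so a value is valid if any one of them matches. In Java that corresponds to Matcher.matches() rather than find(), which also succeeds when the match sits inside a longer string. The sketch below shows the difference. The class name, method name and sample values are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical validator for the myCodex type from the question.
class CodexValidator {

    // The three xsd:pattern values from the question, copied verbatim.
    static final List<Pattern> PATTERNS = Arrays.asList(
        Pattern.compile("[A-Za-z]{6}[0-9]{2}[A-Za-z]{1}[0-9]{2}[A-Za-z]{1}[0-9A-Za-z]{3}[A-Za-z]{1}"),
        Pattern.compile("[A-Za-z]{6}[0-9LMNPQRSTUV]{2}[A-Za-z]{1}[0-9LMNPQRSTUV]{2}[A-Za-z]{1}[0-9LMNPQRSTUV]{3}[A-Za-z]{1}"),
        Pattern.compile("[0-9]{11,11}"));

    // Valid if the whole value matches at least one pattern (XSD ORs the facets).
    static boolean isValidCodex(String value) {
        for (Pattern p : PATTERNS) {
            if (p.matcher(value).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isValidCodex("47385628403"));      // prints true (11 digits)
        System.out.println(isValidCodex("RSSMRA85T10A562S")); // prints true (made-up 16-character sample)
        System.out.println(isValidCodex("x47385628403x"));    // prints false (extra characters)

        // find() only looks for a match somewhere inside the input.
        System.out.println(PATTERNS.get(2).matcher("x47385628403x").find()); // prints true
    }
}
```

The patterns in this question only use constructs that mean the same thing in both dialects, so they can be passed to Pattern.compile() unchanged. XSD-specific constructs such as \i, \c or character class subtraction would need to be translated by hand first.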
helderdarocha answered Nov 09 '22 09:11
