The input string is a mix of some text and valid JSON:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<TITLE>Title</TITLE>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<META HTTP-EQUIV="Content-language" CONTENT="en">
<META HTTP-EQUIV="keywords" CONTENT="search words">
<META HTTP-EQUIV="Expires" CONTENT="0">
<script SRC="include/datepicker.js" LANGUAGE="JavaScript" TYPE="text/javascript"></script>
<script SRC="include/jsfunctions.js" LANGUAGE="JavaScript" TYPE="text/javascript"></script>
<link REL="stylesheet" TYPE="text/css" HREF="css/datepicker.css">
<script language="javascript" type="text/javascript">
function limitText(limitField, limitCount, limitNum) {
if (limitField.value.length > limitNum) {
limitField.value = limitField.value.substring(0, limitNum);
} else {
limitCount.value = limitNum - limitField.value.length;
}
}
</script>
{"List":[{"ID":"175114","Number":"28992"}]}
The task is to deserialize the JSON part of it into an object. The string can begin with arbitrary text, but it is guaranteed to contain valid JSON. I tried using a JSON validation regex, but .NET had trouble parsing that pattern.
In the end, I want to get only:
{
"List": [{
"ID": "175114",
"Number": "28992"
}]
}
Clarification 1:
There is only a single JSON object in the whole messy string, but the surrounding text can also contain {} (it's actually HTML and can contain JavaScript, e.g. <script> function(){.....)
You can use this method:
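using Newtonsoft.Json;

public object ExtractJsonObject(string mixedString)
{
    // Try every '{' as a possible start of the JSON object.
    for (var i = mixedString.IndexOf('{'); i > -1; i = mixedString.IndexOf('{', i + 1))
    {
        // Try every '}' after that start as a possible end, longest candidate first.
        // The j > i condition keeps the Substring length positive.
        for (var j = mixedString.LastIndexOf('}'); j > i; j = mixedString.LastIndexOf('}', j - 1))
        {
            var jsonProbe = mixedString.Substring(i, j - i + 1);
            try
            {
                return JsonConvert.DeserializeObject(jsonProbe);
            }
            catch (JsonException)
            {
                // Not valid JSON; try the next candidate.
            }
        }
    }
    return null;
}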
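The key idea is to treat every pair of { and } as the bounds of a candidate substring and check whether that substring parses as valid JSON. For each opening brace, the longest candidate is tried first. The first candidate that parses is deserialized into an object and returned; if none parses, the method returns null.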
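If the nested loop turns out to be too slow on large pages, one possible variation is to let Json.NET find the end of the object itself. The sketch below is not the method from the answer above: `JsonExtractor` and `ExtractFirstJsonObject` are hypothetical names, and it assumes the JSON you want is an object (it starts with `{`). It relies on `JObject.Load`, which reads a single object from a reader and does not look at the text that follows it.

```csharp
using System;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class JsonExtractor
{
    // Hypothetical helper: tries each '{' as a possible start and lets
    // Json.NET decide where the object ends.
    public static JObject ExtractFirstJsonObject(string mixedString)
    {
        for (var i = mixedString.IndexOf('{'); i > -1; i = mixedString.IndexOf('{', i + 1))
        {
            try
            {
                using (var reader = new JsonTextReader(new StringReader(mixedString.Substring(i))))
                {
                    // Reads exactly one object; trailing text is not inspected.
                    return JObject.Load(reader);
                }
            }
            catch (JsonException)
            {
                // No JSON object starts at this position; try the next '{'.
            }
        }
        return null;
    }

    public static void Main()
    {
        // Sample input made up for illustration: a script block followed by JSON.
        var mixed = "<script>function f(x) { return x; }</script>\n" +
                    "{\"List\":[{\"ID\":\"175114\",\"Number\":\"28992\"}]}";
        var json = ExtractFirstJsonObject(mixed);
        Console.WriteLine((string)json?["List"]?[0]?["ID"]);
    }
}
```

One caveat that applies to both versions: Json.NET is fairly lenient, and an empty `{}` from a script (for example `function(){}`) parses as an empty object. If that can occur in your input, it is probably worth checking that the result contains the property you expect (here `List`) before accepting it.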
Use a regex to find all candidate JSON structures:
\{(.|\s)*\}
Then iterate over the matches until you find one that deserializes without throwing an exception:
JsonConvert.DeserializeObject(match.Value);
If you know the expected format of the JSON structure, you can also validate each candidate against it using JsonSchema.
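The regex approach can be sketched roughly as follows. Note that the pattern quoted above is greedy, so on a page with scripts it tends to produce one match running from the first `{` to the last `}`. The sketch therefore swaps in a different pattern, which is my assumption and not part of the answer: it uses .NET balancing groups to match brace-balanced blocks. `RegexJsonExtractor` and `ExtractWithRegex` are hypothetical names, and braces inside string values can still confuse the pattern.

```csharp
using System;
using System.Text.RegularExpressions;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class RegexJsonExtractor
{
    // Assumed pattern (not the one quoted in the answer): matches a block whose
    // braces are balanced, tracking nesting depth with the balancing group "d".
    private static readonly Regex BalancedBraces = new Regex(
        @"\{(?>[^{}]+|\{(?<d>)|\}(?<-d>))*(?(d)(?!))\}");

    public static object ExtractWithRegex(string mixedString)
    {
        foreach (Match match in BalancedBraces.Matches(mixedString))
        {
            try
            {
                // The first match that deserializes cleanly wins.
                return JsonConvert.DeserializeObject(match.Value);
            }
            catch (JsonException)
            {
                // This block is not JSON (e.g. a JavaScript function body).
            }
        }
        return null;
    }

    public static void Main()
    {
        // Sample input made up for illustration.
        var mixed = "<script>function f(x) { return x; }</script>\n" +
                    "{\"List\":[{\"ID\":\"175114\",\"Number\":\"28992\"}]}";
        var json = ExtractWithRegex(mixed) as JObject;
        Console.WriteLine((string)json?["List"]?[0]?["Number"]);
    }
}
```

Matching only balanced blocks keeps the number of parse attempts small, at the cost of a pattern that is harder to read than the original one.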