Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Use SQL to return a JSON string

This is a "best practice" question. We are having internal discussions on this topic and want to get input from a wider audience.

I need to store my data in a traditional MS SQL Server table with normal columns and rows. I sometimes need to return a DataTable to my web application, and other times I need to return a JSON string.

Currently, I return the table to the middle layer and parse it into a JSON string. This seems to work well for the most part, but does occasionally take a while on large datasets (parsing the data, not returning the table).

I am considering revising the stored procedures to selectively return a DataTable or a JSON string. I would simply add a @isJson bit parameter to the SP.

If the user wanted the string instead of the table the SP would execute a query like this:

DECLARE @result varchar(MAX)
SELECT @result = COALESCE(@results ',', '') + '{id:"' + colId + '",name:"' + colName + '"}'
    FROM MyTable
SELECT @result

This produces something like the following:

{id:"1342",name:"row1"},{id:"3424",name:"row2"}

Of course, the user can also get the table by passing false to the @isJson parameter.

I want to be clear that the data storage isn't affected, nor are any of the existing views and other processes. This is a change to ONLY the results of some stored procedures.

My questions are:

  1. Has anyone tried this in a large application? If so, what was the result?
  2. What issues have you seen/would you expect with this approach?
  3. Is there a better faster way to go from table to JSON in SQL Server other than modifying the stored procedure in this way or parsing the string in the middle tier?
like image 413
davids Avatar asked Jul 16 '12 20:07

davids


2 Answers

I personally think the best place for this kind of string manipulation is in program code in a fully expressive language that has functions and can be compiled. Doing this in T-SQL is not good. Program code can have fast functions that do proper escaping.

Let's think about things a bit:

  • When you deploy new versions of the parts and pieces of your application, where is the best place for this functionality to be?

  • If you have to restore your database (and all its stored procedures) will that negatively affect anything? If you are deploying a new version of your web front end, will the JSON conversion being tied into the database cause problems?

  • How will you escape characters properly? Are you sending any dates through? What format will date strings be in and how will they get converted to actual Date objects on the other end (if that is needed)?

  • How will you unit test it (and with automated tests!) to prove it is working correctly? How will you regression test it?

  • SQL Server UDFs can be very slow. Are you content to use a slow function, or for speed hack into your SQL code things like Replace(Replace(Replace(Replace(Value, '\', '\\'), '"', '\"'), '''', '\'''), Char(13), '\n')? What about Unicode, \u and \x escaping? How about splitting '</script>' into '<' + '/script>'? (Maybe that doesn't apply, but maybe it does, depending on how you use your JSON.) Is your T-SQL procedure going to do all this, and be reusable for different recordsets, or will you rewrite it each time into each SP that you need to return JSON?

  • You may only have one SP that needs to return JSON. For now. Some day, you might have more. Then if you find a bug, you have to fix it in two places. Or five. Or more.

It may seem like you are making things more complicated by having the middle layer do the translation, but I promise you it is going to be better in the long run. What if your product scales out and starts going massively parallel—you can always throw more web servers at it cheaply, but you can't so easily fix database server resource saturation! So don't make the DB do more work than it should. It is a data access layer, not a presentation layer. Make it do the minimum amount of work possible. Write code for everything else. You will be glad you did.

Speed Tips for String Handling in a Web Application

  1. Make sure your web string concatenation code doesn't suffer from Schlemiel the Painter's Algorithm. Either directly write to the output buffer as JSON is generated (Response.Write), or use a proper StringBuilder object, or write the parts of the JSON to an array and Join() it later. Don't do plain vanilla concatenation to a longer and longer string over and over.
  2. Dereference objects as little as possible. I don't know your server-side language, but if it happens to be ASP Classic, don't use field names--either get a reference to each field in a variable or at the very least use integer field indexes. Dereferencing a field based on its name inside a loop is (much) worse performance.
  3. Use pre-built libraries. Don't roll your own when you can use a tried and true library. Performance should be equal or better to your own and (most importantly) it will be tested and correct.
  4. If you're going to spend the time doing this, make it abstract enough to handle converting any recordset, not just the one you have now.
  5. Use compiled code. You can always get the fastest code when it is compiled, not interpreted. If you identify that the JSON-conversion routines are truly the bottleneck (and you MUST prove this for real, do not guess) then get the code into something that is compiled.
  6. Reduce string lengths. This is not a big one, but if at all possible use one-letter json names instead of many-letter. For a giant recordset this will add up to savings on both ends.
  7. Ensure it is GZipped. This is not so much a server-side improvement, but I couldn't mention JSON performance without being complete.

Passing Dates in JSON

What I recommend is to use a separate JSON schema (itself in JSON, defining the structure of the virtual recordset to follow). This schema can be sent as a header to the "recordset" to follow, or it can be already loaded in the page (included in the base javascript files) so it doesn't have to be sent each time. Then, in your JSON parse callback (or post-callback on the final resultant object) look in the schema for the current column and do conversions as necessary. You might consider using ISO format since in ECMAScript 5 strict mode there is supposed to be better date support and your code can be simplified without having to change the data format (and a simple object detect can let you use this code for any browser that supports it):

Date

Dates are now capable of both parsing and outputting ISO-formatted dates.

The Date constructor now attempts to parse the date as if it was ISO-formatted, first, then moves on to the other inputs that it accepts.

Additionally, date objects now have a new .toISOString() method that outputs the date in an ISO format. var date = new Date("2009-05-21T16:06:05.000Z");

print( date.toISOString() ); // 2009-05-21T16:06:05.000Z

like image 79
ErikE Avatar answered Nov 06 '22 17:11

ErikE


I wouldn't do that way you are doing (contatenating)

You can try creating a CLR SQL function that uses JSON.net and returns a varchar.

See here how to create SQL CLR Functions: http://msdn.microsoft.com/en-us/library/w2kae45k(v=vs.80).aspx

Something like this (untested code)

[Microsoft.SqlServer.Server.SqlFunction]
public static SqlString MyFunctionName(int id) {
    // Put your code here (maybe find the object you want to serialize using the id passed?)
    using (var cn = new SqlConnection("context connection=true") ) {
        //get your data into an object
        var myObject = new {Name = "My Name"};
        return new SqlString(Newtonsoft.Json.JsonConvert.SerializeObject(myObject));
    }
}
like image 31
turtlepick Avatar answered Nov 06 '22 17:11

turtlepick