Is SQL code faster than C# code? [closed]

Tags:

c#

sql

sql-server

A few months ago I started working at a programming company. One of the practices they use is to do as much work as possible in SQL rather than C#.

So, let's say I have this simple example of writing out a list of files.

Is something like this:

string SQL = @"
    SELECT f.FileID,
           f.FileName,
           f.FileExtension,
           '/files/' + CAST(u.UserGuid AS VARCHAR(MAX)) + '/' + (f.FileName + f.FileExtension) AS FileSrc,
           FileSize=
           CASE
               WHEN f.FileSizeB < 1048576 THEN CAST(CAST((f.FileSizeB / 1024) AS DECIMAL(6, 2)) AS VARCHAR(8)) + ' KB'
               ELSE CAST(CAST((f.FileSizeB / 1048576) AS DECIMAL(6, 2)) AS VARCHAR(8)) + ' MB'
           END
      FROM Files f
INNER JOIN Users u
        ON f.UserID = u.UserID
";

// some loop for writing results {
//     write...
// }

Faster or better than something like this:

string SQL = @"
    SELECT u.UserGuid,
           f.FileID,
           f.FileName,
           f.FileExtension,
           f.FileSizeB
      FROM Files f
INNER JOIN Users u
        ON f.UserID = u.UserID";

// some loop for writing results {
       string FileSrc = "/Files/" + result["UserGuid"] + "/" + result["FileName"] + result["FileExtension"];
       string FileSize = ConvertToKbOrMb(result["FileSizeB"]);  
//     write...
// }

This particular code doesn't matter (it's just some basic example) ... the question is about this kind of thing in general ... is it better to put more load on SQL or 'normal' code?

Draško asked Feb 23 '14 21:02


5 Answers

It's just bad programming practice. You should separate and isolate the different parts of your program for ease of future maintenance (think of the next programmer!).

Performance

Many solutions suffer from poor DB performance, so most developers limit SQL database access to the smallest transaction possible. Ideally, the transformation of raw data into human-readable form should happen at the very last point possible. The memory usage of unformatted data is also much smaller, and while memory is cheap, you shouldn't waste it. Every extra byte that has to be buffered, cached, and transmitted takes up time and reduces available server resources.

For example, in a web application the formatting should be done by local JavaScript templates from a JSON data packet. This reduces the workload of the backend SQL database and application servers, and reduces the data that needs to be transmitted over the network, all of which speeds up server performance.

Formatting and Localisation

Many solutions have different output needs for the same transaction, e.g. different views, different localisations, etc. If you embed formatting in the SQL transaction, you will have to write a new transaction for each localisation, and this will become a maintenance nightmare.

Formatted transactions also cannot be used for an API interface; you would need yet another set of transactions for the API, with no formatting.

With C# you should be using a well-tested template or string-handling library, or at least string.Format(). Avoid building long strings with repeated '+' concatenation in a loop, which is slow.
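As a minimal sketch of the client-side formatting described above, the helper below shows one way the question's ConvertToKbOrMb could look when written with string.Format(). The class name FileSizeFormatter, the BuildFileSrc helper, and the optional culture parameter are assumptions made for illustration; they do not appear in the question.

```csharp
using System;
using System.Globalization;

public static class FileSizeFormatter
{
    private const long BytesPerKb = 1024;
    private const long BytesPerMb = 1024 * 1024; // 1048576, the threshold used in the question's CASE

    // Hypothetical helper: formats a raw byte count as "x.xx KB" or "x.xx MB".
    // The culture parameter is an assumption added to illustrate localisation;
    // it falls back to the invariant culture when omitted.
    public static string ConvertToKbOrMb(long sizeInBytes, CultureInfo culture = null)
    {
        culture = culture ?? CultureInfo.InvariantCulture;

        if (sizeInBytes < BytesPerMb)
        {
            // Divide as double so the fractional part survives.
            return string.Format(culture, "{0:F2} KB", sizeInBytes / (double)BytesPerKb);
        }

        return string.Format(culture, "{0:F2} MB", sizeInBytes / (double)BytesPerMb);
    }

    // Hypothetical helper: builds the path that the SQL version returned as FileSrc.
    public static string BuildFileSrc(Guid userGuid, string fileName, string fileExtension)
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "/files/{0}/{1}{2}",
            userGuid, fileName, fileExtension);
    }
}
```

Inside the result loop this would be called as FileSizeFormatter.ConvertToKbOrMb(sizeInBytes), optionally passing a CultureInfo for the current user, so the same unformatted query can serve several localisations and an API.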

Share the load

Most solutions have multiple clients for one DB, so the client-side formatting load is shared across the clients' CPUs instead of falling on the single SQL database CPU.

I seriously doubt SQL is faster than c#, you should perform a simple benchmark and post the results here :-)

TFD answered Oct 24 '22 09:10


The reason the second version may be a little slower is that you need to pull the data out of SQL Server and hand it to the C# part of the code, and this takes more time.

The more reads you make, like ConvertToKbOrMb(result["FileSizeB"]), the more time it can take, and it also depends on your DAL. I have seen some DALs that are really slow.

If you leave that work on the SQL Server you save the extra cost of getting the data out, that's all.

From experience, one of my optimizations is always to pull out only the data that is needed. The more data you read from SQL Server and move to wherever it is going (ASP.NET, a console app, any C# program), the more time you spend moving it around, especially if it contains big strings or needs a lot of conversions from strings to numbers.

To answer the direct question of which is faster: I would say you cannot compare them. Both are as fast as they can be if you write good code and good queries. SQL Server also keeps a lot of statistics and uses them to improve query execution; C# has nothing comparable, so what is there to compare?

A test of my own

OK, I have a lot of data here from a project, so I ran a quick test, which in fact does not prove that either one is faster than the other.

I ran two cases.

SELECT TOP 100 PERCENT cI1,cI2,cI3 
  FROM [dbo].[ARL_Mesur] WITH (NOLOCK)  WHERE [dbo].[ARL_Mesur].[cWhen] > @cWhen0;

        foreach (var Ena in cAllOfThem)
        {
            // this is the line that I move inside SQL server to see what change on speed
            var results = Ena.CI1 + Ena.CI2 + Ena.CI3;

            sbRender.Append(results);
            sbRender.Append(Ena.CI2);
            sbRender.Append(Ena.CI3);
        }

vs

SELECT TOP 100 PERCENT (cI1+cI2+cI3) as cI1,cI2,cI3 
   FROM [dbo].[ARL_Mesur] WITH (NOLOCK)  WHERE [dbo].[ARL_Mesur].[cWhen] > @cWhen0;


        foreach (var Ena in cAllOfThem)
        {
            sbRender.Append(Ena.CI1);
            sbRender.Append(Ena.CI2);
            sbRender.Append(Ena.CI3);
        }

The results show that the speed is nearly the same. All the parameters are doubles, and the reads are optimized: I make no extra reads at all, I just move the processing from one side to the other.

On 165,766 rows, here are some results (cumulative elapsed time, then the time taken by that step):

Start              0ms     +0ms
c# processing   2005ms  +2005ms
sql processing  4011ms  +2006ms

Start              0ms     +0ms
c# processing   2247ms  +2247ms
sql processing  4514ms  +2267ms

Start              0ms     +0ms
c# processing   2018ms  +2018ms
sql processing  3946ms  +1928ms

Start              0ms     +0ms
c# processing   2043ms  +2043ms
sql processing  4133ms  +2090ms

So the speed can be affected by many factors, and we do not know what it is in your company's setup that makes the C# processing slower than the SQL processing.
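Timings like those above can be collected with System.Diagnostics.Stopwatch. The following is only a sketch of a possible harness, not the code used for the numbers in this answer; the class name TimingHarness and the method Measure are hypothetical.

```csharp
using System;
using System.Diagnostics;

public static class TimingHarness
{
    // Runs the action once as an untimed warm-up, then returns the elapsed
    // milliseconds of each timed run.
    public static long[] Measure(Action action, int runs)
    {
        if (action == null) throw new ArgumentNullException("action");
        if (runs < 1) throw new ArgumentOutOfRangeException("runs");

        // Warm-up so one-time costs (JIT compilation, opening a pooled
        // connection, a cold query plan) do not distort the first timed run.
        action();

        var timings = new long[runs];
        for (int i = 0; i < runs; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            timings[i] = stopwatch.ElapsedMilliseconds;
        }

        return timings;
    }
}
```

Each variant would be wrapped in its own delegate, for example TimingHarness.Measure(() => RunCSharpVariant(), 4), where RunCSharpVariant is a placeholder for a method that executes the query and the loop. Running both variants several times, in alternating order, makes it easier to see whether a difference is larger than the run-to-run noise.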

Aristos answered Oct 24 '22 10:10


As a general rule of thumb: SQL is for manipulating data, not formatting how it is displayed.

Do as much as you can in SQL, yes, but only as long as it serves that goal. I'd take a long hard look at your "SQL example", solely on that ground. Your "C# example" looks like a cleaner separation of responsibilities to me.

That being said, please don't take it too far and stop doing things in SQL that should be done in SQL, such as filtering and joining. For example reimplementing INNER JOIN Users u ON f.UserID = u.UserID in C# would be a catastrophe, performance-wise.


As for performance in this particular case:

I'd expect "C# example" (not all C#, just this example) to be slightly faster, simply because...

    f.FileSizeB

...looks narrower than...

   '/files/' + CAST(u.UserGuid AS VARCHAR(MAX)) + '/' + (f.FileName + f.FileExtension) AS FileSrc,
   FileSize=
   CASE
       WHEN f.FileSizeB < 1048576 THEN CAST(CAST((f.FileSizeB / 1024) AS DECIMAL(6, 2)) AS VARCHAR(8)) + ' KB'
       ELSE CAST(CAST((f.FileSizeB / 1048576) AS DECIMAL(6, 2)) AS VARCHAR(8)) + ' MB'
   END

...which should conserve some network bandwidth. And network bandwidth tends to be a scarcer resource than CPU (especially client-side CPU).

Of course, your mileage may vary, but either way the performance difference is likely to be small enough, so other concerns, such as the overall maintainability of code, become relatively more important. Frankly, your "C# example" looks better to me here, in that regard.

Branko Dimitrijevic answered Oct 24 '22 10:10


There are good reasons to do as much as you can on the database server. Minimizing the amount of data that has to be passed back and forth, and giving the server as much leeway as possible in optimizing the process, are both good things.

However, that is not really illustrated in your example. Both versions pass about the same amount of data back and forth (perhaps the first passes more), so the only difference is who does the calculation, and it may be that the client does this better.

albe answered Oct 24 '22 10:10


Your question is about whether the string manipulation operations should be done in C# or SQL. I would argue that this example is so small that any performance gain, one way or the other, is irrelevant. The question is "where should this be done?"

If the code is "one-off" code for part of an application, then doing it at the application level makes a lot of sense. If this code is repeated throughout the application, then you want to encapsulate it. I would argue that the best way to encapsulate it is with a SQL Server computed column, view, table-valued function, or scalar function (with the computed column being preferable in this case). This ensures that the same processing happens no matter where it is called from.

There is a key difference between database code and C# code in terms of performance. The database code automatically runs in parallel. So, if your database server is multi-threaded, then separate threads might be doing those string manipulations at the same time (no promises, the key word here is "might").

In general when thinking about the split, you want to minimize the amount of data being passed back and forth. The difference in this case seems to be minimal.

So, if this is one place in an application that has this logic, then do it in the application. If the application is filled with references to this table that want this logic, then think about a computed column. If the application has lots of similar requests on different tables, then think about a scalar valued function, although this might affect the ability of queries to take advantage of parallelism.

Gordon Linoff answered Oct 24 '22 10:10