
Why is my ColdFusion code 100x faster now on string processing?

So, as my title says, I have a ColdFusion program that used to take 10 minutes to run on our server but now runs in <15 seconds. Confused about why such a simple program was taking 10 minutes, my boss and I examined it to find the section of code causing the slowdown. We ended up taking it from 10 minutes down to 5-10 seconds.

Now we're not sure why the fix is a fix, so we'd like to know if anyone can explain why/how this works, so that we can understand it and apply the same speed-up in other programs.

The program begins with a query that grabs ~4800 records (nothing outrageous). We then loop through these records, and that loop turned out to be the slow section. Here's a rough example of what we had, and what we did to fix it. TextString is set near the top, next to the query, to the headers of the fields we're returning.

Old Code:

<cfloop>
       <CFSET TextString = TextString & DriverID & TabChar & LocalSSN & TabChar & FirstName & CarriageReturn & LineFeed>        
</cfloop>

Fixed Code:

<cfloop>
    <CFSET LocalTextString = "">
    <CFSET LocalTextString = LocalTextString & DriverID & TabChar & LocalSSN & TabChar & FirstName & CarriageReturn & LineFeed>
    <CFSET TextString = TextString & LocalTextString>
</cfloop>
Ryan Kelso asked Dec 20 '13 16:12


1 Answer

Your code is almost certainly faster because of the way Strings are concatenated in CF. Although I don't know the exact internals of CF, I suspect it treats Strings as immutable. That means each time you concatenate an extra variable onto the String using &, it'll create a new String containing the old String with the new one on the end. To do that, it has to allocate memory and copy the old contents, and as the string grows, that's more and more memory.

There are 8 variables being added to the string each go round the loop including the previous version of the String, so you're allocating ~4800*8 Strings during the loop.

Assuming each row is 35 characters long, the final size of the string will be 168k (4800*35). That means its average size during the run is half that: 84k. Now, bearing in mind that you're allocating the string 4800*8 times, you're using as much as 3 GB (4800*8*84000) of memory to create 168k of output. That'll mean Java has to do a whole heap of garbage collection and extra work to service your code.
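The estimate can be reproduced as a short CFML sketch. The variable names are only illustrative, and the 35-character row length is the same assumption made above, not a measured value.

```cfml
<!--- Back-of-envelope estimate; every input is an assumption from the answer --->
<cfset rowCount      = 4800>  <!--- records returned by the query --->
<cfset rowLength     = 35>    <!--- assumed characters per row --->
<cfset concatsPerRow = 8>     <!--- strings joined on each pass of the loop --->

<cfset finalSize   = rowCount * rowLength>      <!--- 168,000 characters --->
<cfset averageSize = finalSize / 2>             <!--- 84,000 characters --->
<cfset copies      = rowCount * concatsPerRow>  <!--- 38,400 new Strings --->
<cfset totalChars  = copies * averageSize>      <!--- 3,225,600,000 characters --->

<cfoutput>#NumberFormat(totalChars, ",")# characters copied in the worst case</cfoutput>
```

This is a worst-case figure: it assumes every single & produces a full copy of the large string.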

Your updated code is working on LocalTextString 7 times out of 8, which will be tiny in comparison, so you'll get a considerable speed improvement.

Try this version:

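<!--- Collect the pieces in an array instead of growing one big string --->
<cfset buffer = ArrayNew(1)>
<cfset crlf = CarriageReturn & LineFeed>
<!--- Loop attributes left out, as in the question --->
<cfloop>
  <cfset ArrayAppend(buffer, DriverID)>
  <cfset ArrayAppend(buffer, TabChar)>
  <cfset ArrayAppend(buffer, LocalSSN)>
  <cfset ArrayAppend(buffer, TabChar)>
  <cfset ArrayAppend(buffer, FirstName)>
  <cfset ArrayAppend(buffer, crlf)>
</cfloop>
<!--- Join everything once at the end, using an empty delimiter --->
<cfoutput>#ArrayToList(buffer, "")#</cfoutput>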

It builds up an array of Strings, then turns them into one String at the end. You can also look at StringBuffer from Java, which does the same thing. When I last looked at it, the Array/List method above was fastest, but that was a few versions of CF back.
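For reference, a minimal sketch of the StringBuffer route, assuming the usual CreateObject("java", ...) bridge. The qDrivers query and its single sample row are hypothetical stand-ins for the real query, and TabChar and crlf are defined here only to make the snippet self-contained.

```cfml
<!--- Hypothetical stand-in data: one synthetic row instead of the real query --->
<cfset TabChar = Chr(9)>
<cfset crlf = Chr(13) & Chr(10)>
<cfset qDrivers = QueryNew("DriverID,LocalSSN,FirstName", "integer,varchar,varchar")>
<cfset QueryAddRow(qDrivers)>
<cfset QuerySetCell(qDrivers, "DriverID", 1)>
<cfset QuerySetCell(qDrivers, "LocalSSN", "000-00-0000")>
<cfset QuerySetCell(qDrivers, "FirstName", "Example")>

<!--- StringBuffer appends into one resizable buffer rather than building a new String each time --->
<cfset sb = CreateObject("java", "java.lang.StringBuffer").init()>
<cfloop query="qDrivers">
  <!--- JavaCast selects the append(String) overload explicitly --->
  <cfset sb.append(JavaCast("string", qDrivers.DriverID))>
  <cfset sb.append(JavaCast("string", TabChar))>
  <cfset sb.append(JavaCast("string", qDrivers.LocalSSN))>
  <cfset sb.append(JavaCast("string", TabChar))>
  <cfset sb.append(JavaCast("string", qDrivers.FirstName))>
  <cfset sb.append(JavaCast("string", crlf))>
</cfloop>

<!--- Convert to a normal CF string once, after the loop --->
<cfset TextString = sb.toString()>
<cfoutput>#HTMLEditFormat(TextString)#</cfoutput>
```

The JavaCast calls are a precaution: append is overloaded on the Java side, and casting avoids leaving the choice of overload to CF.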

Update

I wasn't happy with the number of guesses in my answer, so I had a pop at reproducing the problem. I used a 5-year-old Mac with CF10 for testing.

  • Original Code: 8100ms
  • Improved Original Code: 750ms
  • ArrayAppend method: ~15ms

If I hook VisualVM onto the ColdFusion process whilst it's running the tests, I can see that the original code is chewing through a few hundred MB of memory during the run. That's not as bad as my initial maths suggested; I suspect it's only creating the large string once per pass over the loop, not once per individual concatenation. It triggers 2-3 minor garbage collections during each run though, which is denting performance. It also uses 100% CPU on the core it's running on.

The ArrayAppend code triggers no GCs on average and you can hardly see the memory being used.

I can't see why the original code took quite so long, but other factors would be CPU speed, free memory, database being accessed (I used MySQL5), CF version etc.
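A comparison along these lines can be sketched with GetTickCount(). This is not the exact test I ran: the rows are synthetic, the names are hypothetical, and the array version appends one pre-built row per pass rather than one field at a time.

```cfml
<!--- Hypothetical harness: synthetic rows stand in for the real query --->
<cfset rowCount = 4800>
<cfset TabChar = Chr(9)>
<cfset crlf = Chr(13) & Chr(10)>

<!--- 1. Grow a single string, as in the original code --->
<cfset startTick = GetTickCount()>
<cfset TextString = "">
<cfloop from="1" to="#rowCount#" index="i">
  <cfset TextString = TextString & i & TabChar & "000-00-0000" & TabChar & "Name" & i & crlf>
</cfloop>
<cfset concatMs = GetTickCount() - startTick>

<!--- 2. Append each row to an array and join once at the end --->
<cfset startTick = GetTickCount()>
<cfset buffer = ArrayNew(1)>
<cfloop from="1" to="#rowCount#" index="i">
  <cfset ArrayAppend(buffer, i & TabChar & "000-00-0000" & TabChar & "Name" & i & crlf)>
</cfloop>
<cfset joined = ArrayToList(buffer, "")>
<cfset arrayMs = GetTickCount() - startTick>

<!--- Check that both approaches built the same text --->
<cfset sameOutput = (Compare(TextString, joined) EQ 0)>

<cfoutput>
  Concatenation: #concatMs# ms<br>
  ArrayAppend: #arrayMs# ms<br>
  Same output: #sameOutput#
</cfoutput>
```

Timings from a harness like this will vary with the factors listed above, so treat the numbers as relative rather than absolute.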

barnyr answered Sep 25 '22 02:09