Helo Coding

Mostly about ruby, rails and enterprise integration

About Me

My photo
Aarhus, Denmark
Been working full-time as ruby developer since january 2007.

Tuesday, June 10, 2008

Memory Profiling Rails with ruby-prof

Background: We have an app which has a service for importing large amounts of data into our central dataproviding service. The import is pretty basic; accepts a CSV which is then parsed and AR model objects are created.

The service wasn't that quick and it seemed as though it had bad performance when importing a five digit number of objects. So we decided to profile it.

From the blogosphere I'd learned that quite a few people were using ruby-prof so I decided to check that one out. Ruby-prof is open source and can be found here: http://ruby-prof.rubyforge.org/

I decided to focus on memory since our problems seemed to increase with the number of objects being handled leading us to suspect GC to take more and more time. Why explains GC and mentions typical mistakes here: http://whytheluckystiff.net/articles/theFullyUpturnedBin.html

Nimble's post http://blog.pluron.com/2008/01/ruby-on-rails-i.html is nice and covers some easy gains in rails regarding ruby GC and delivers specific patches.

From his section

'Will it help me?

In general your code will benefit most if you:

* Create or update a lot of rows using ActiveRecord
* Run large number of SQL statements of any kind using ActiveRecord, e.g. load many AR object through separate #find calls'

I found that this applied pretty well to our situation since we both create a lot of objects and use finders to connect our relational models.

So after installing ruby-prof and patching my 1.8.6 mri, I started profiling our app.

My testcase was simple. I'd use find to get a number of carrots and then I'd start washing them.

After playing around I found that memory consumption increased more than linear with the number of carrots and found 1000 carrots for my benchmark to be appropriate. Enough to give me some data to work with and not too many for the profiling to take forever.

My first profile showed that washing 1000 carrots resulted in 506706 kb allocated memory. The allocation was divided between 12 methods and seems like pretty much.

Looking through my HTMLGraph I found that 261903 kb(over half!) of the allocation was by a find_by_carrot_id method using :include. So I added a param that defaulted 'use_includes' to 'true' and added a check in that finder method allowing me to pass 'false' and not get the information that I don't need =) this hack lets other invokers use the method as they always have since the default is not necessary to specify and defaults to using the includes.

A new profiling report to see how that turned out. The total allocation was now 338359 kb so that simple change cut off ~33%.

For my next change I went to check a known rails bad guy: Benchmark#realtime. The rails version we're using, surrounds every sql call with Benchmark#realtime adding unnecessary allocation. In my report realtime allocated 231399 kb of which 65992 kb was allocated by itself.

After adding Nimble's patch I made a new report. The improvement was worth the wait =) the new numbers were 159747 total and 1 kb by realtime itself! Now the app total was down to 258417 kb with a reduction of ~23% comparing to the last report.

Now I started looking closer on our own app again. What I noticed was self made rule engine substituting carrot attributes based on rules stored in the database. This is a light weight and dynamic approach that lets users define substitutions in an easy way.
Even though there were no rules in the test database and still the find_all_by_applicable_to method allocated 64528.00 kb though. So I introduced a cache that quickly expires, yet reduces the method invocations dramatically during these batch jobs.

The simple caching method reduced that methods allocation to 218 kb! Taking the app total to 191302 another nice reduction of ~22%

All in all the app memory allocation was reduced from 506706 kb to 191302 kb - a reduction of ~62% and more important: running the test with 5000 carrots showed an allocation of 892489 kb which is 4,67 times higher showing a slight decrease in memory allocation with increasing number of carrots.