
Memory Leak in Ruby net/ldap Module

As part of my Rails application, I've written a little importer that sucks in data from our LDAP system and crams it into a User table. Unfortunately, the LDAP-related code leaks huge amounts of memory while iterating over our 32K users, and I haven't been able to figure out how to fix the issue.

The problem seems to be related to the LDAP library in some way, as when I remove the calls to the LDAP stuff, memory usage stabilizes nicely. Further, the objects that are proliferating are Net::BER::BerIdentifiedString and Net::BER::BerIdentifiedArray, both part of the LDAP library.

When I run the import, memory usage eventually peaks at over 1GB. I need to find some way to correct my code if the problem is there, or to work around the LDAP memory issues if that's where the problem lies. (Or if there's a better LDAP library for large imports for Ruby, I'm open to that as well.)

Here's the pertinent bit of my code:

require 'net/ldap'
require 'pp'

class User < ActiveRecord::Base
  validates_presence_of :name, :login, :email

  # This method is responsible for populating the User table with the
  # login, name, and email of anybody who might be using the system.
  def self.import_all
    # initialization stuff. set bind_dn, bind_pass, ldap_host, base_dn and filter

    ldap = Net::LDAP.new
    ldap.host = ldap_host
    ldap.auth bind_dn, bind_pass
    ldap.bind

    begin
      # Build the list
      records = records_updated = new_records = 0
      ldap.search(:base => base_dn, :filter => filter ) do |entry|
        name = entry.givenName.to_s.strip + " " + entry.sn.to_s.strip
        login = entry.name.to_s.strip
        email = login + "@txstate.edu"
        user = User.find_or_initialize_by_login :name => name, :login => login, :email => email
        if user.name != name
          user.name = name
          user.save
          logger.info( "Updated: " + email )
          records_updated = records_updated + 1
        elsif user.new_record?
          user.save
          new_records = new_records + 1
        else
          # update timestamp so that we can delete old records later
          user.touch
        end
        records = records + 1
      end

      # delete records that haven't been updated for 7 days
      records_deleted = User.destroy_all( ["updated_at < ?", Date.today - 7 ] ).size

      logger.info( "LDAP Import Complete: " + Time.now.to_s )
      logger.info( "Total Records Processed: " + records.to_s )
      logger.info( "New Records: " + new_records.to_s )
      logger.info( "Updated Records: " + records_updated.to_s ) 
      logger.info( "Deleted Records: " + records_deleted.to_s )

    end

  end
end

Thanks in advance for any help/pointers!

By the way, I did ask about this in the net/ldap support forum as well, but didn't get any useful pointers there.

Sean McMains asked Jul 23 '10 16:07



1 Answer

The key thing to notice is that you never use the return value of ldap.search. That means you should pass :return_result => false to it:

ldap.search(:base => base_dn, :filter => filter, :return_result => false ) do |entry|

From the docs: "When :return_result => false, #search will return only a Boolean, to indicate whether the operation succeeded. This can improve performance with very large result sets, because the library can discard each entry from memory after your block processes it."

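In other words, without this flag every entry is kept in memory until the search finishes, even if you never need the entries outside the block. With 32K users, that retained result set is a likely cause of the growth you are seeing, so use this option.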
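
To make the difference concrete, here is a minimal sketch of the accumulate-versus-discard behaviour the docs describe. FakeDirectory and its entry hashes are hypothetical stand-ins made up for illustration; this is not Net::LDAP's actual implementation, and only the :return_result option name is taken from the library.

```ruby
# Hypothetical stand-in for a directory client. It is NOT Net::LDAP's
# implementation; it only mimics the documented :return_result behaviour.
class FakeDirectory
  def initialize(entry_count)
    @entry_count = entry_count
  end

  # Yields each freshly built entry to the block. Unless :return_result is
  # false, every entry is also appended to an array that stays referenced
  # (and so cannot be garbage collected) until the search returns.
  def search(args = {})
    collected = args.fetch(:return_result, true) ? [] : nil
    @entry_count.times do |i|
      entry = { :name => "user#{i}" }
      collected << entry if collected
      yield entry if block_given?
    end
    collected || true
  end
end

directory = FakeDirectory.new(5)

# Default: the block runs, but all entries are retained and returned as well.
kept = directory.search { |entry| entry[:name] }

# With the flag: the block runs and only a Boolean comes back.
streamed = directory.search(:return_result => false) { |entry| entry[:name] }

puts kept.size        # 5
puts streamed.inspect # true
```

In the importer itself, the only change should be the extra option on the ldap.search call; the block body can stay as it is.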

Daniel Abrahamsson answered Sep 28 '22 02:09