
Ruby: Should strings in a frozen array also be individually frozen?

Ruby 2.2.3, Rails 4.2.1

I have a large number of arrays of strings that I define as constants for use throughout an application. They are various sets of ISO country codes, language codes, and that sort of thing, so each string is two to four characters long and each set numbers in the hundreds of unique values.

The different arrays are collections of these, so NORTH_AMERICA_COUNTRY_CODES might be an array of a dozen or so country codes, and AFRICA_COUNTRY_CODES might be an array of around 60. Many of them overlap (various versions of the Commonwealth countries, for example).

The arrays are used in comparison with other arbitrary arrays of country codes for logic such as "Subtract this list of countries from Africa".
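As a small illustration of the kind of logic meant here, the sketch below uses Ruby's built-in array difference and intersection operators. The constant AFRICA_SAMPLE and both helper methods are hypothetical and shortened for the example; they are not the application's real constants.

```ruby
# Hypothetical, shortened constant; the real lists hold many more codes.
AFRICA_SAMPLE = %w(DZ EG KE NG ZA).freeze

# "Subtract this list of countries from Africa"
def subtract_from_africa(codes)
  AFRICA_SAMPLE - codes
end

# Codes that appear in both lists, in AFRICA_SAMPLE's order
def common_with_africa(codes)
  AFRICA_SAMPLE & codes
end

other = %w(EG ZA US)
p subtract_from_africa(other)  # => ["DZ", "KE", "NG"]
p common_with_africa(other)    # => ["EG", "ZA"]
```

Both operators return a new, unfrozen array, so the frozen constant itself is never modified.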

So I'm wondering whether, when I generate these constants, I ought to freeze the strings within the arrays, so instead of:

WORLD_COUNTRIES = Countries.pluck(:country_code).freeze

... maybe ...

WORLD_COUNTRIES = Countries.pluck(:country_code).map{|c| c.freeze}.freeze

Is there a way of quantifying the potential benefits?
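One possible way to put a number on it, beyond timing, is to count object allocations around the operation. The sketch below reads the cumulative counter from GC.stat before and after a block. The helper name allocated_objects and the SAMPLE_ constants are made up for this example, and whether the two counts actually differ is something to measure on your own Ruby version rather than assume.

```ruby
# Returns how many objects were allocated while the block ran.
# GC.stat(:total_allocated_objects) is a cumulative counter.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Hypothetical shortened lists, built separately so no strings are shared.
SAMPLE_UNFROZEN = %w(AD AT BE CY).freeze
SAMPLE_FROZEN   = %w(AD AT BE CY).map(&:freeze).freeze
other           = %w(AT CY FR)

unfrozen_count = allocated_objects { 1_000.times { SAMPLE_UNFROZEN & other } }
frozen_count   = allocated_objects { 1_000.times { SAMPLE_FROZEN & other } }

puts "array frozen, strings unfrozen: #{unfrozen_count} objects"
puts "array frozen, strings frozen:   #{frozen_count} objects"
```

If the two counts match, freezing the individual strings makes no allocation difference for this particular operation, and the remaining benefit is protection against accidental mutation.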

I considered using arrays of symbols instead of arrays of strings, but the arbitrary arrays that these are used with are stored in PostgreSQL text arrays, and it seems like I'd need to serialise those columns instead, or maybe override the getter and setter methods to change the values between arrays of strings and arrays of symbols. Ugh.
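For a sense of what the getter and setter override would involve, here is a minimal plain-Ruby sketch. The Region class, its country_codes attribute, and the stored_codes reader are all hypothetical stand-ins; in a real ActiveRecord model the values would be read and written through the model's attribute API rather than an instance variable, which is not shown here.

```ruby
# Hypothetical stand-in for a model backed by a PostgreSQL text[] column.
class Region
  # What would be persisted: always an array of strings.
  attr_reader :stored_codes

  def initialize(country_codes = [])
    self.country_codes = country_codes
  end

  # Callers always see symbols.
  def country_codes
    @stored_codes.map(&:to_sym)
  end

  # Accepts strings or symbols and stores strings.
  def country_codes=(codes)
    @stored_codes = codes.map(&:to_s)
  end
end

region = Region.new([:AD, "AT"])
p region.country_codes  # => [:AD, :AT]
p region.stored_codes   # => ["AD", "AT"]
```

The conversion happens on every read in this sketch, so any speed gained from comparing symbols would have to be weighed against that cost.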

Edit

Testing results, in which I've attempted to benchmark three situations:

  1. Comparing a frozen array of unfrozen strings with an unfrozen array of unfrozen strings
  2. Comparing a frozen array of frozen strings with an unfrozen array of unfrozen strings
  3. Comparing a frozen array of symbols with an unfrozen array of symbols (in case I bit the bullet and went all symbolic on this).

Any thoughts on methodology or interpretation gratefully received. I'm not sure whether the similarity in results between the first two indicates that they are in all respects the same, but I'd be keen on anything that can directly point to differences in memory allocation.

Script:

require 'benchmark'

country_list             = ["AD", "AE", "AF", "AG", "AI", "AL", "AM", "AN", "AO", "AQ", "AR", "AS", "AT", "AU", "AW", "AX", "AZ", "BA", "BB", "BD", "BE", "BF", "BG", "BH", "BI", "BJ", "BL", "BM", "BN", "BO", "BQ", "BR", "BS", "BT", "BV", "BW", "BY", "BZ", "CA", "CC", "CD", "CF", "CG", "CH", "CI", "CK", "CL", "CM", "CN", "CO", "CR", "CS", "CU", "CV", "CW", "CX", "CY", "CZ", "DE", "DJ", "DK", "DM", "DO", "DZ", "EC", "EE", "EG", "EH", "ER", "ES", "ET", "FI", "FJ", "FK", "FM", "FO", "FR", "GA", "GB", "GD", "GE", "GF", "GG", "GH", "GI", "GL", "GM", "GN", "GP", "GQ", "GR", "GS", "GT", "GU", "GW", "GY", "HK", "HM", "HN", "HR", "HT", "HU", "ID", "IE", "IL", "IM", "IN", "IO", "IQ", "IR", "IS", "IT", "JE", "JM", "JO", "JP", "KE", "KG", "KH", "KI", "KM", "KN", "KP", "KR", "KW", "KY", "KZ", "LA", "LB", "LC", "LI", "LK", "LR", "LS", "LT", "LU", "LV", "LY", "MA", "MC", "MD", "ME", "MF", "MG", "MH", "MK", "ML", "MM", "MN", "MO", "MP", "MQ", "MR", "MS", "MT", "MU", "MV", "MW", "MX", "MY", "MZ", "NA", "NC", "NE", "NF", "NG", "NI", "NL", "NO", "NP", "NR", "NU", "NZ", "OM", "PA", "PE", "PF", "PG", "PH", "PK", "PL", "PM", "PN", "PR", "PS", "PT", "PW", "PY", "QA", "RE", "RO", "RS", "RU", "RW", "SA", "SB", "SC", "SD", "SE", "SG", "SH", "SI", "SJ", "SK", "SL", "SM", "SN", "SO", "SR", "SS", "ST", "SV", "SX", "SY", "SZ", "TC", "TD", "TF", "TG", "TH", "TJ", "TK", "TL", "TM", "TN", "TO", "TR", "TT", "TV", "TW", "TZ", "UA", "UG", "UM", "US", "UY", "UZ", "VA", "VC", "VE", "VG", "VI", "VN", "VU", "WF", "WS", "YE", "YT", "YU", "ZA", "ZM", "ZW"] 
FROZEN_ARRAY             = country_list.dup.freeze
puts FROZEN_ARRAY.size
FROZEN_ARRAY_AND_STRINGS = country_list.dup.map{|x| x.freeze}.freeze
FROZEN_ARRAY_AND_SYMBOLS = country_list.dup.map{|x| x.to_sym}.freeze
comp_s                   = %w(AD AT BE CY EE FI FR DE ES GR IE IT LU LV MC ME MT NL PT SI SK SM VA)
comp_sym                 = %w(AD AT BE CY EE FI FR DE ES GR IE IT LU LV MC ME MT NL PT SI SK SM VA).map{|x| x.to_sym}


Benchmark.bm do |x|
  x.report("frozen string"  )  { 10000.times {|i| c = (FROZEN_ARRAY_AND_STRINGS & comp_s.dup) }}
  x.report("unfrozen string")  { 10000.times {|i| c = (FROZEN_ARRAY             & comp_s.dup) }}
  x.report("symbols"        )  { 10000.times {|i| c = (FROZEN_ARRAY_AND_SYMBOLS & comp_sym.dup) }}
end


Benchmark.bmbm do |x|
  x.report("frozen string"  )  { 10000.times {|i| c = (FROZEN_ARRAY_AND_STRINGS & comp_s.dup) }}
  x.report("unfrozen string")  { 10000.times {|i| c = (FROZEN_ARRAY             & comp_s.dup) }}
  x.report("symbols"        )  { 10000.times {|i| c = (FROZEN_ARRAY_AND_SYMBOLS & comp_sym.dup) }}
end
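One methodology point that may matter when reading the first two cases: Array#dup is a shallow copy, so FROZEN_ARRAY and FROZEN_ARRAY_AND_STRINGS appear to hold the very same String objects as country_list. Calling freeze on those strings while building the second constant would then also freeze the strings inside the first one, leaving no genuinely unfrozen baseline. The sketch below reproduces the effect with a shortened list; all constant names in it are made up for the illustration.

```ruby
# Shortened list for illustration; map(&:dup) guarantees unfrozen strings
# even where string literals are frozen by default.
COUNTRY_SAMPLE = ["AD", "AE", "AF"].map(&:dup)

ARRAY_ONLY     = COUNTRY_SAMPLE.dup.freeze
# dup is shallow: both arrays reference the same String objects.
SHARED_OBJECTS = ARRAY_ONLY.first.equal?(COUNTRY_SAMPLE.first)
FROZEN_BEFORE  = ARRAY_ONLY.any?(&:frozen?)

# Freezing each string here freezes the shared objects everywhere.
ARRAY_AND_STRINGS = COUNTRY_SAMPLE.dup.map { |x| x.freeze }.freeze
FROZEN_AFTER      = ARRAY_ONLY.all?(&:frozen?)

# An independent baseline: copy each string, freeze only the array.
INDEPENDENT_BASELINE = COUNTRY_SAMPLE.map(&:dup).freeze

puts "same objects:              #{SHARED_OBJECTS}"
puts "strings frozen before map: #{FROZEN_BEFORE}"
puts "strings frozen after map:  #{FROZEN_AFTER}"
puts "baseline strings frozen:   #{INDEPENDENT_BASELINE.any?(&:frozen?)}"
```

Building the unfrozen case from its own copies of the strings, as in INDEPENDENT_BASELINE, would keep the two string cases from measuring the same objects.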

Result:

2.2.3 :001 > require 'benchmark'
 => false 
2.2.3 :002 > 
2.2.3 :003 >   country_list             = ["AD", "AE", "AF", "AG", "AI", "AL", "AM", "AN", "AO", "AQ", "AR", "AS", "AT", "AU", "AW", "AX", "AZ", "BA", "BB", "BD", "BE", "BF", "BG", "BH", "BI", "BJ", "BL", "BM", "BN", "BO", "BQ", "BR", "BS", "BT", "BV", "BW", "BY", "BZ", "CA", "CC", "CD", "CF", "CG", "CH", "CI", "CK", "CL", "CM", "CN", "CO", "CR", "CS", "CU", "CV", "CW", "CX", "CY", "CZ", "DE", "DJ", "DK", "DM", "DO", "DZ", "EC", "EE", "EG", "EH", "ER", "ES", "ET", "FI", "FJ", "FK", "FM", "FO", "FR", "GA", "GB", "GD", "GE", "GF", "GG", "GH", "GI", "GL", "GM", "GN", "GP", "GQ", "GR", "GS", "GT", "GU", "GW", "GY", "HK", "HM", "HN", "HR", "HT", "HU", "ID", "IE", "IL", "IM", "IN", "IO", "IQ", "IR", "IS", "IT", "JE", "JM", "JO", "JP", "KE", "KG", "KH", "KI", "KM", "KN", "KP", "KR", "KW", "KY", "KZ", "LA", "LB", "LC", "LI", "LK", "LR", "LS", "LT", "LU", "LV", "LY", "MA", "MC", "MD", "ME", "MF", "MG", "MH", "MK", "ML", "MM", "MN", "MO", "MP", "MQ", "MR", "MS", "MT", "MU", "MV", "MW", "MX", "MY", "MZ", "NA", "NC", "NE", "NF", "NG", "NI", "NL", "NO", "NP", "NR", "NU", "NZ", "OM", "PA", "PE", "PF", "PG", "PH", "PK", "PL", "PM", "PN", "PR", "PS", "PT", "PW", "PY", "QA", "RE", "RO", "RS", "RU", "RW", "SA", "SB", "SC", "SD", "SE", "SG", "SH", "SI", "SJ", "SK", "SL", "SM", "SN", "SO", "SR", "SS", "ST", "SV", "SX", "SY", "SZ", "TC", "TD", "TF", "TG", "TH", "TJ", "TK", "TL", "TM", "TN", "TO", "TR", "TT", "TV", "TW", "TZ", "UA", "UG", "UM", "US", "UY", "UZ", "VA", "VC", "VE", "VG", "VI", "VN", "VU", "WF", "WS", "YE", "YT", "YU", "ZA", "ZM", "ZW"] 
 => ["AD", "AE", "AF", "AG", "AI", "AL", "AM", "AN", "AO", "AQ", "AR", "AS", "AT", "AU", "AW", "AX", "AZ", "BA", "BB", "BD", "BE", "BF", "BG", "BH", "BI", "BJ", "BL", "BM", "BN", "BO", "BQ", "BR", "BS", "BT", "BV", "BW", "BY", "BZ", "CA", "CC", "CD", "CF", "CG", "CH", "CI", "CK", "CL", "CM", "CN", "CO", "CR", "CS", "CU", "CV", "CW", "CX", "CY", "CZ", "DE", "DJ", "DK", "DM", "DO", "DZ", "EC", "EE", "EG", "EH", "ER", "ES", "ET", "FI", "FJ", "FK", "FM", "FO", "FR", "GA", "GB", "GD", "GE", "GF", "GG", "GH", "GI", "GL", "GM", "GN", "GP", "GQ", "GR", "GS", "GT", "GU", "GW", "GY", "HK", "HM", "HN", "HR", "HT", "HU", "ID", "IE", "IL", "IM", "IN", "IO", "IQ", "IR", "IS", "IT", "JE", "JM", "JO", "JP", "KE", "KG", "KH", "KI", "KM", "KN", "KP", "KR", "KW", "KY", "KZ", "LA", "LB", "LC", "LI", "LK", "LR", "LS", "LT", "LU", "LV", "LY", "MA", "MC", "MD", "ME", "MF", "MG", "MH", "MK", "ML", "MM", "MN", "MO", "MP", "MQ", "MR", "MS", "MT", "MU", "MV", "MW", "MX", "MY", "MZ", "NA", "NC", "NE", "NF", "NG", "NI", "NL", "NO", "NP", "NR", "NU", "NZ", "OM", "PA", "PE", "PF", "PG", "PH", "PK", "PL", "PM", "PN", "PR", "PS", "PT", "PW", "PY", "QA", "RE", "RO", "RS", "RU", "RW", "SA", "SB", "SC", "SD", "SE", "SG", "SH", "SI", "SJ", "SK", "SL", "SM", "SN", "SO", "SR", "SS", "ST", "SV", "SX", "SY", "SZ", "TC", "TD", "TF", "TG", "TH", "TJ", "TK", "TL", "TM", "TN", "TO", "TR", "TT", "TV", "TW", "TZ", "UA", "UG", "UM", "US", "UY", "UZ", "VA", "VC", "VE", "VG", "VI", "VN", "VU", "WF", "WS", "YE", "YT", "YU", "ZA", "ZM", "ZW"] 
2.2.3 :004 > FROZEN_ARRAY             = country_list.dup.freeze
 => ["AD", "AE", "AF", "AG", "AI", "AL", "AM", "AN", "AO", "AQ", "AR", "AS", "AT", "AU", "AW", "AX", "AZ", "BA", "BB", "BD", "BE", "BF", "BG", "BH", "BI", "BJ", "BL", "BM", "BN", "BO", "BQ", "BR", "BS", "BT", "BV", "BW", "BY", "BZ", "CA", "CC", "CD", "CF", "CG", "CH", "CI", "CK", "CL", "CM", "CN", "CO", "CR", "CS", "CU", "CV", "CW", "CX", "CY", "CZ", "DE", "DJ", "DK", "DM", "DO", "DZ", "EC", "EE", "EG", "EH", "ER", "ES", "ET", "FI", "FJ", "FK", "FM", "FO", "FR", "GA", "GB", "GD", "GE", "GF", "GG", "GH", "GI", "GL", "GM", "GN", "GP", "GQ", "GR", "GS", "GT", "GU", "GW", "GY", "HK", "HM", "HN", "HR", "HT", "HU", "ID", "IE", "IL", "IM", "IN", "IO", "IQ", "IR", "IS", "IT", "JE", "JM", "JO", "JP", "KE", "KG", "KH", "KI", "KM", "KN", "KP", "KR", "KW", "KY", "KZ", "LA", "LB", "LC", "LI", "LK", "LR", "LS", "LT", "LU", "LV", "LY", "MA", "MC", "MD", "ME", "MF", "MG", "MH", "MK", "ML", "MM", "MN", "MO", "MP", "MQ", "MR", "MS", "MT", "MU", "MV", "MW", "MX", "MY", "MZ", "NA", "NC", "NE", "NF", "NG", "NI", "NL", "NO", "NP", "NR", "NU", "NZ", "OM", "PA", "PE", "PF", "PG", "PH", "PK", "PL", "PM", "PN", "PR", "PS", "PT", "PW", "PY", "QA", "RE", "RO", "RS", "RU", "RW", "SA", "SB", "SC", "SD", "SE", "SG", "SH", "SI", "SJ", "SK", "SL", "SM", "SN", "SO", "SR", "SS", "ST", "SV", "SX", "SY", "SZ", "TC", "TD", "TF", "TG", "TH", "TJ", "TK", "TL", "TM", "TN", "TO", "TR", "TT", "TV", "TW", "TZ", "UA", "UG", "UM", "US", "UY", "UZ", "VA", "VC", "VE", "VG", "VI", "VN", "VU", "WF", "WS", "YE", "YT", "YU", "ZA", "ZM", "ZW"] 
2.2.3 :005 > puts FROZEN_ARRAY.size
252
 => nil 
2.2.3 :006 > FROZEN_ARRAY_AND_STRINGS = country_list.dup.map{|x| x.freeze}.freeze
 => ["AD", "AE", "AF", "AG", "AI", "AL", "AM", "AN", "AO", "AQ", "AR", "AS", "AT", "AU", "AW", "AX", "AZ", "BA", "BB", "BD", "BE", "BF", "BG", "BH", "BI", "BJ", "BL", "BM", "BN", "BO", "BQ", "BR", "BS", "BT", "BV", "BW", "BY", "BZ", "CA", "CC", "CD", "CF", "CG", "CH", "CI", "CK", "CL", "CM", "CN", "CO", "CR", "CS", "CU", "CV", "CW", "CX", "CY", "CZ", "DE", "DJ", "DK", "DM", "DO", "DZ", "EC", "EE", "EG", "EH", "ER", "ES", "ET", "FI", "FJ", "FK", "FM", "FO", "FR", "GA", "GB", "GD", "GE", "GF", "GG", "GH", "GI", "GL", "GM", "GN", "GP", "GQ", "GR", "GS", "GT", "GU", "GW", "GY", "HK", "HM", "HN", "HR", "HT", "HU", "ID", "IE", "IL", "IM", "IN", "IO", "IQ", "IR", "IS", "IT", "JE", "JM", "JO", "JP", "KE", "KG", "KH", "KI", "KM", "KN", "KP", "KR", "KW", "KY", "KZ", "LA", "LB", "LC", "LI", "LK", "LR", "LS", "LT", "LU", "LV", "LY", "MA", "MC", "MD", "ME", "MF", "MG", "MH", "MK", "ML", "MM", "MN", "MO", "MP", "MQ", "MR", "MS", "MT", "MU", "MV", "MW", "MX", "MY", "MZ", "NA", "NC", "NE", "NF", "NG", "NI", "NL", "NO", "NP", "NR", "NU", "NZ", "OM", "PA", "PE", "PF", "PG", "PH", "PK", "PL", "PM", "PN", "PR", "PS", "PT", "PW", "PY", "QA", "RE", "RO", "RS", "RU", "RW", "SA", "SB", "SC", "SD", "SE", "SG", "SH", "SI", "SJ", "SK", "SL", "SM", "SN", "SO", "SR", "SS", "ST", "SV", "SX", "SY", "SZ", "TC", "TD", "TF", "TG", "TH", "TJ", "TK", "TL", "TM", "TN", "TO", "TR", "TT", "TV", "TW", "TZ", "UA", "UG", "UM", "US", "UY", "UZ", "VA", "VC", "VE", "VG", "VI", "VN", "VU", "WF", "WS", "YE", "YT", "YU", "ZA", "ZM", "ZW"] 
2.2.3 :007 > FROZEN_ARRAY_AND_SYMBOLS = country_list.dup.map{|x| x.to_sym}.freeze
 => [:AD, :AE, :AF, :AG, :AI, :AL, :AM, :AN, :AO, :AQ, :AR, :AS, :AT, :AU, :AW, :AX, :AZ, :BA, :BB, :BD, :BE, :BF, :BG, :BH, :BI, :BJ, :BL, :BM, :BN, :BO, :BQ, :BR, :BS, :BT, :BV, :BW, :BY, :BZ, :CA, :CC, :CD, :CF, :CG, :CH, :CI, :CK, :CL, :CM, :CN, :CO, :CR, :CS, :CU, :CV, :CW, :CX, :CY, :CZ, :DE, :DJ, :DK, :DM, :DO, :DZ, :EC, :EE, :EG, :EH, :ER, :ES, :ET, :FI, :FJ, :FK, :FM, :FO, :FR, :GA, :GB, :GD, :GE, :GF, :GG, :GH, :GI, :GL, :GM, :GN, :GP, :GQ, :GR, :GS, :GT, :GU, :GW, :GY, :HK, :HM, :HN, :HR, :HT, :HU, :ID, :IE, :IL, :IM, :IN, :IO, :IQ, :IR, :IS, :IT, :JE, :JM, :JO, :JP, :KE, :KG, :KH, :KI, :KM, :KN, :KP, :KR, :KW, :KY, :KZ, :LA, :LB, :LC, :LI, :LK, :LR, :LS, :LT, :LU, :LV, :LY, :MA, :MC, :MD, :ME, :MF, :MG, :MH, :MK, :ML, :MM, :MN, :MO, :MP, :MQ, :MR, :MS, :MT, :MU, :MV, :MW, :MX, :MY, :MZ, :NA, :NC, :NE, :NF, :NG, :NI, :NL, :NO, :NP, :NR, :NU, :NZ, :OM, :PA, :PE, :PF, :PG, :PH, :PK, :PL, :PM, :PN, :PR, :PS, :PT, :PW, :PY, :QA, :RE, :RO, :RS, :RU, :RW, :SA, :SB, :SC, :SD, :SE, :SG, :SH, :SI, :SJ, :SK, :SL, :SM, :SN, :SO, :SR, :SS, :ST, :SV, :SX, :SY, :SZ, :TC, :TD, :TF, :TG, :TH, :TJ, :TK, :TL, :TM, :TN, :TO, :TR, :TT, :TV, :TW, :TZ, :UA, :UG, :UM, :US, :UY, :UZ, :VA, :VC, :VE, :VG, :VI, :VN, :VU, :WF, :WS, :YE, :YT, :YU, :ZA, :ZM, :ZW] 
2.2.3 :008 > comp_s                   = %w(AD AT BE CY EE FI FR DE ES GR IE IT LU LV MC ME MT NL PT SI SK SM VA)
 => ["AD", "AT", "BE", "CY", "EE", "FI", "FR", "DE", "ES", "GR", "IE", "IT", "LU", "LV", "MC", "ME", "MT", "NL", "PT", "SI", "SK", "SM", "VA"] 
2.2.3 :009 > comp_sym                 = %w(AD AT BE CY EE FI FR DE ES GR IE IT LU LV MC ME MT NL PT SI SK SM VA).map{|x| x.to_sym}
 => [:AD, :AT, :BE, :CY, :EE, :FI, :FR, :DE, :ES, :GR, :IE, :IT, :LU, :LV, :MC, :ME, :MT, :NL, :PT, :SI, :SK, :SM, :VA] 
2.2.3 :010 > 
2.2.3 :011 >   
2.2.3 :012 >   Benchmark.bm do |x|
2.2.3 :013 >       x.report("frozen string"  )  { 10000.times {|i| c = (FROZEN_ARRAY_AND_STRINGS & comp_s.dup) }}
2.2.3 :014?>     x.report("unfrozen string")  { 10000.times {|i| c = (FROZEN_ARRAY             & comp_s.dup) }}
2.2.3 :015?>     x.report("symbols"        )  { 10000.times {|i| c = (FROZEN_ARRAY_AND_SYMBOLS & comp_sym.dup) }}
2.2.3 :016?>   end
       user     system      total        real
frozen string  0.190000   0.000000   0.190000 (  0.194141)
unfrozen string  0.170000   0.010000   0.180000 (  0.174675)
symbols  0.080000   0.000000   0.080000 (  0.081507)
 => [#<Benchmark::Tms:0x007f810c3aca70 @label="frozen string", @real=0.1941408810671419, @cstime=0.0, @cutime=0.0, @stime=0.0, @utime=0.1899999999999995, @total=0.1899999999999995>, #<Benchmark::Tms:0x007f810c82b538 @label="unfrozen string", @real=0.1746752569451928, @cstime=0.0, @cutime=0.0, @stime=0.010000000000000009, @utime=0.16999999999999993, @total=0.17999999999999994>, #<Benchmark::Tms:0x007f810af2cfa0 @label="symbols", @real=0.08150708093307912, @cstime=0.0, @cutime=0.0, @stime=0.0, @utime=0.08000000000000007, @total=0.08000000000000007>] 
2.2.3 :017 > 
2.2.3 :018 >   
2.2.3 :019 >   Benchmark.bmbm do |x|
2.2.3 :020 >       x.report("frozen string"  )  { 10000.times {|i| c = (FROZEN_ARRAY_AND_STRINGS & comp_s.dup) }}
2.2.3 :021?>     x.report("unfrozen string")  { 10000.times {|i| c = (FROZEN_ARRAY             & comp_s.dup) }}
2.2.3 :022?>     x.report("symbols"        )  { 10000.times {|i| c = (FROZEN_ARRAY_AND_SYMBOLS & comp_sym.dup) }}
2.2.3 :023?>   end
Rehearsal ---------------------------------------------------
frozen string     0.180000   0.000000   0.180000 (  0.183846)
unfrozen string   0.200000   0.000000   0.200000 (  0.196311)
symbols           0.080000   0.000000   0.080000 (  0.082794)
------------------------------------------ total: 0.460000sec

                      user     system      total        real
frozen string     0.160000   0.000000   0.160000 (  0.167051)
unfrozen string   0.170000   0.000000   0.170000 (  0.171601)
symbols           0.080000   0.000000   0.080000 (  0.078746)
 => [#<Benchmark::Tms:0x007f811022a388 @label="frozen string", @real=0.1670510449912399, @cstime=0.0, @cutime=0.0, @stime=0.0, @utime=0.16000000000000014, @total=0.16000000000000014>, #<Benchmark::Tms:0x007f811022a4c8 @label="unfrozen string", @real=0.17160122003406286, @cstime=0.0, @cutime=0.0, @stime=0.0, @utime=0.16999999999999993, @total=0.16999999999999993>, #<Benchmark::Tms:0x007f8108eb1c58 @label="symbols", @real=0.07874645793344826, @cstime=0.0, @cutime=0.0, @stime=0.0, @utime=0.08000000000000007, @total=0.08000000000000007>] 
David Aldridge asked Nov 10 '22 01:11


1 Answer

Since you're dealing with a small number of values, and since the performance benefits of symbols are evident from your testing, just go with symbols.

BTW, you can use map(&:to_sym) instead of map {|x| x.to_sym}.
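To show that the two forms are interchangeable, here is a tiny check on a shortened list. The constant names are made up for the example.

```ruby
SAMPLE_STRINGS = %w(AD AT BE)

# Explicit block form, as in the question's script
LONG_FORM  = SAMPLE_STRINGS.map { |x| x.to_sym }
# Symbol-to-proc shorthand
SHORT_FORM = SAMPLE_STRINGS.map(&:to_sym)

p LONG_FORM                # => [:AD, :AT, :BE]
p LONG_FORM == SHORT_FORM  # => true
```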

Chris Jester-Young answered Nov 15 '22 03:11