How can I find unique elements in a string in the best way?
Sample string format is
myString = "34345667543"
o/p
['3','4','3','5'.....]
The number of unique values in a list is the number of elements excluding duplicates. For example, the number of unique values in [1, 1, 2, 2, 3] is 3.
Python3. If not present append the characters to empty string. Now the empty string consists of only unique characters, use len() method to display the length.
This is an interesting question, and since it returns so many almost similar results, I did a simple benchmark to decide which is actually the best solution:
require 'rubygems'
require 'benchmark'
require 'set'
puts "Do the test"
Benchmark.bm(40) do |x|
STRING_TEST = "26263636362626218118181111232112233"
x.report("do split and uniq") do
(1..1000000).each { STRING_TEST.split(//).uniq }
end
x.report("do chars to_a uniq") do
(1..1000000).each { STRING_TEST.chars.to_a.uniq }
end
x.report("using Set") do
(1..1000000).each { Set.new(STRING_TEST.split('')).to_a }
end
end
and the results of this test are, not entirely surprising (0n 1.8.7p352):
user system total real
do split and uniq 27.060000 0.000000 27.060000 ( 27.084629)
do chars to_a uniq 14.440000 0.000000 14.440000 ( 14.452377)
using Set 41.740000 0.000000 41.740000 ( 41.760313)
and on 1.9.2p180 :
user system total real
do split and uniq 19.260000 0.000000 19.260000 ( 19.242727)
do chars to_a uniq 8.980000 0.010000 8.990000 ( 8.983891)
using Set 28.220000 0.000000 28.220000 ( 28.186787)
The results for REE (1.8.7) are close to 1.9.2 :
user system total real
do split and uniq 19.120000 0.000000 19.120000 ( 19.126034)
do chars to_a uniq 14.740000 0.010000 14.750000 ( 14.766540)
using Set 32.770000 0.120000 32.890000 ( 32.921878)
For fun, I also tried on rubinius:
user system total real
do split and uniq 26.100000 0.000000 26.100000 ( 26.651468)
do chars to_a uniq 25.680000 0.000000 25.680000 ( 25.780944)
using Set 22.500000 0.000000 22.500000 ( 22.649291)
So while the split('\\').uniq
wins points for readability, the chars.to_a.uniq
is almost double as fast.
It is weird to notice that on rubinius the Set
solution is the fastest, but no where near as fast as the chars.to_a.uniq
on 1.9.2.
Use this short:
myString.split(//).uniq
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With