Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

scala duplicate elements in list

I need to duplicate every element in a list. Here is what I came up with for that:

List.range(1,5).map(i => List(i,i)).flatten

which outputs

List[Int] = List(1, 1, 2, 2, 3, 3, 4, 4)

I wonder whether this is the most efficient way to do that (ultimately it needs to run on a lot of data), given that for every element, a new list would be created.

(the above is on an int range, to keep the example simple)

Any suggestions?

like image 480
benroth Avatar asked Jun 17 '14 02:06

benroth


People also ask

How do I add to a list in Scala?

This is the first method we use to append Scala List using the operator “:+”. The syntax we use in this method is; first to declare the list name and then use the ':+' method rather than the new element that will be appended in the list. The syntax looks like “List name:+ new elements”.

How do I create a list in Scala?

Syntax for defining a Scala List. val variable_name: List[type] = List(item1, item2, item3) or val variable_name = List(item1, item2, item3) A list in Scala is mostly like a Scala array. However, the Scala List is immutable and represents a linked list data structure.


2 Answers

A more general solution would be something like:

def duplicate[T](list: List[T]): List[T] = list.flatMap(x => List[T](x, x))

Using immutable collections won't be all that efficient for very large data sets. A simple implementation using the mutable ListBuffer is already 10 times faster than the above (using a list with one million elements):

def duplicate[T](list: List[T]): List[T] = {

    val buffer = collection.mutable.ListBuffer[T]()

    list.foreach{ x =>
        buffer += x
        buffer += x
    }

    buffer.toList
}

This uses a general technique of appending to a ListBuffer for performance, then converting to the immutable List at the end.

like image 80
Michael Zajac Avatar answered Sep 29 '22 03:09

Michael Zajac


Do you really need lists? can you do better by being more generic? Lists are often over-used when other collections can be much better suited. Here's a method which takes any Seq and creates a Stream which duplicates the items, Streams are naturally lazy, You won't necessarily have the memory overhead of creating and throwing away a lot of little lists:

def dupe[A](as: Seq[A]): Stream[A] = as match { 
  case Seq(h, t @ _*) => h #:: h #:: dupe(t)
  case _ => Stream.empty 
}

We can see that it acts lazily:

scala> dupe(List(1,2,3,4))
res1: Stream[Int] = Stream(1, ?)

Lazily enough that it works in very large or even infinite input:

scala> dupe(Stream.range(1, Int.MaxValue)).take(10).force
res2: scala.collection.immutable.Stream[Int] = Stream(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)

scala> dupe(Stream.continually(1)).take(10).force
res3: scala.collection.immutable.Stream[Int] = Stream(1, 1, 1, 1, 1, 1, 1, 1, 1, 1)

if you really want a list:

scala> dupe(List(1,2,3,4)).toList
res5: List[Int] = List(1, 1, 2, 2, 3, 3, 4, 4)
like image 23
stew Avatar answered Sep 29 '22 05:09

stew