I saw the following example in Rich's video on sequences http://blip.tv/file/734409 about 33-36 minutes into it:
(first "abcd") => \a
Now, he say that this expands to (sort of):
(first "abcd") => (first (seq "abcd")) => (first '(\a \b \c \d))
So, it looks like an O(N)
operation because the full copy of the string is being made. First of all, if a String
is immutable, then why is it copied? (Edit: based on an answer, it probably is not; just looked that way when printed.) Secondly, suppose first
operated on something else in Java that is mutable, say a linked list of integers. Should first
act in a lazy manner (e.g. create a persistent sequence first)? Would not it make sense to evaluate it right away and save it? It would be some sort of a hack that would break the nice abstraction, but get the job done fast, I think. When you call (seq "abcd")
, you do not know how it will be used. When you call a first
on a seq
, you know what to do. But, when you call first
on "abcd"
, I think that performing a hacky and fast "grab and save it", approach is better than grab a sequence and then call first
.
Am I missing something? Did Rich Hickey skip some steps?
Let me know if I have questions. Thanks!
It doesn't follow that a full copy of the string is being made. (But this is a good question.)
In the source for clojure.lang.RT you'll notice that the runtime uses charsequence to create the sequence:
static ISeq seqFrom(Object coll){
if(coll instanceof Seqable)
return ((Seqable) coll).seq();
else if(coll == null)
return null;
else if(coll instanceof Iterable)
return IteratorSeq.create(((Iterable) coll).iterator());
else if(coll.getClass().isArray())
return ArraySeq.createFromObject(coll);
else if(coll instanceof CharSequence)
return StringSeq.create((CharSequence) coll);
.
.
.
So really, this is a java question, not a clojure question. I haven't checked, but I'm fairly certain that CharSequence does the "right thing".
The other answers are correct, but I'll use this chance to point out an interesting effect of Clojure's philosophy of immutability.
So, it looks like an O(N) operation because the full copy of the string is being made.
Strings, like other data structures in Clojure, are immutable. (They're a special case, being implemented in the JVM rather than Clojure, but that's immaterial to this point.)
Copies of immutable objects are free. That's right, free. Because you don't need to copy at all. The same allocated object in memory can simply be reused wholesale, because it's guaranteed to always be the same as when it was "copied."
So a function like seq
never has to actually copy anything. It just operates on whatever is passed, directly, and returns a (possibly lazy) sequence that provides an abstract interface to whatever you called seq
on.
And so, seq
is always O(1).
You might look at the seq
as if it creates a lazy sequence based on the string.
It doesn't actually create a lazy sequence, it actually short circuits to the java implementation of CharSequence, but that's an implementation detail.
From a complexity point of view, first
is O(1)
on a seq
, and you can create a seq
in constant time from anything is iterable.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With