
What exactly are strings in Nim?

From what I understand, strings in Nim are basically a mutable sequence of bytes and that they are copied on assignment.

Given that, I assumed that sizeof would tell me (like len) the number of bytes, but instead it always gives 8 on my 64-bit machine, so it seems to be holding a pointer.

Given that, I have the following questions...

  • What was the motivation behind copy on assignment? Is it because they're mutable?

  • Is there ever a time when it isn't copied when assigned? (I assume non-var function parameters don't copy. Anything else?)

  • Are they optimized such that they only actually get copied if/when they're mutated?

  • Is there any significant difference between a string and a sequence, or can the answers to the above questions be equally applied to all sequences?

  • Anything else in general worth noting?

Thank you!

Lye Fish asked Apr 01 '15 20:04


1 Answer

The definition of strings actually is in system.nim, just under another name:

type
  TGenericSeq {.compilerproc, pure, inheritable.} = object
    len, reserved: int
  PGenericSeq {.exportc.} = ptr TGenericSeq
  UncheckedCharArray {.unchecked.} = array[0..ArrayDummySize, char]
  # len and space without counting the terminating zero:
  NimStringDesc {.compilerproc, final.} = object of TGenericSeq
    data: UncheckedCharArray
  NimString = ptr NimStringDesc

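So a string is a raw pointer to an object with len, reserved and data fields, which is why sizeof reports the size of a pointer (8 on a 64-bit machine) rather than the number of bytes in the string. The procs for strings are defined in sysstr.nim.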
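To make the len versus sizeof distinction concrete, here is a minimal sketch. The variable name and the sample text are mine, not from the question, and the value printed by sizeof depends on the platform and compiler version, so the code does not assume a particular number for it.

```nim
let s = "hello, world"

# len counts the bytes stored in the string.
echo s.len

# sizeof measures the string handle itself, not the contents,
# so it stays the same no matter how long the string is.
echo sizeof(s)

let longer = s & s & s
assert longer.len == 3 * s.len
assert sizeof(longer) == sizeof(s)
```

If the goal is the number of bytes held by the string, len is the call to use.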

String assignment follows the same semantics as every other value type in Nim (anything that is not ref or ptr), so you can assume that an assignment creates a copy. When a copy is unnecessary, the compiler is allowed to leave it out, but I'm not sure how often that happens so far. Passing a string into a proc does not copy it. There is no copy-on-write optimization that defers the copy until the string is mutated. Sequences behave the same way.
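A small sketch of what those value semantics look like in practice. The variable names and the hypothetical helper proc firstChar are only for illustration; the asserts check the observable behaviour, not whether the compiler physically copied anything.

```nim
# Hypothetical helper: a non-var parameter is read-only inside the proc.
proc firstChar(text: string): char =
  text[0]

var a = "hello"
var b = a          # assignment gives b its own copy of the contents
b[0] = 'j'         # mutating b does not affect a
assert a == "hello"
assert b == "jello"
assert firstChar(a) == 'h'

var xs = @[1, 2, 3]
var ys = xs        # seqs follow the same value semantics
ys[0] = 99
assert xs == @[1, 2, 3]
assert ys == @[99, 2, 3]
```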

You can change the default assignment behaviour of strings and seqs by marking them as shallow; after that, no copy is made on assignment:

var s = "foo"
shallow s
def- answered Sep 19 '22 09:09