Answers to "What are the differences between Rust's `String` and `str`?" describe how &str and String relate to each other.
What is surprising is that a str is more limited than a fixed-size array, because it cannot be declared as a local variable. Compiling
let arr_owned = [0u8; 32];
let arr_slice = &arr_owned;
let str_slice = "apple";
let str_owned = *str_slice;
in Rust 1.32.0, I get
error[E0277]: the size for values of type `str` cannot be known at compilation time
--> src/lib.rs:6:9
which is confusing, because the size of "apple" can be known by the compiler; it is just not part of the str type.
Is there a language-design reason for the asymmetry between the Vec<T> <-> [T; N] and String <-> str owned types? Could a str[N] type, which would be shorthand for a [u8; N] that only contains provably valid UTF-8 encoded strings, replace str without breaking lots of existing code?
"asymmetry between Vec<T> <-> [T; N] and String <-> str"
That's because you confused something here. The relationships are rather like this:
Vec<T> ⇔ [T]
String ⇔ str
In all four of those types, the length information is stored at runtime, not at compile time. Fixed-size arrays ([T; N]) are different in that regard: their length is part of the type, so it is known at compile time and not stored at runtime!
And indeed, both [T] and str can't be stored on the stack, because they are both unsized.
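To make the runtime-versus-compile-time distinction concrete, here is a small sketch that inspects pointer and value sizes. The helper pointer_words is a name made up for this illustration, and the "two words" result reflects how current Rust compilers lay out references to slices and str rather than something the question states.

```rust
use std::mem::{size_of, size_of_val};

/// Hypothetical helper: how many machine words a `&T` occupies.
fn pointer_words<T: ?Sized>() -> usize {
    size_of::<&T>() / size_of::<usize>()
}

fn main() {
    let arr_owned = [0u8; 32];
    let arr_slice: &[u8] = &arr_owned;
    let str_slice: &str = "apple";

    // [u8; 32]: the length is part of the type, so the size is a constant.
    assert_eq!(size_of::<[u8; 32]>(), 32);

    // A reference to a fixed-size array is a plain (thin) pointer.
    assert_eq!(pointer_words::<[u8; 32]>(), 1);

    // References to [u8] and str carry the length next to the address.
    assert_eq!(pointer_words::<[u8]>(), 2);
    assert_eq!(pointer_words::<str>(), 2);

    // The size of the unsized value itself is only available at runtime.
    assert_eq!(size_of_val(arr_slice), 32);
    assert_eq!(size_of_val(str_slice), 5);
}
```

The extra word in &[u8] and &str is where the length lives, which is why the str type by itself says nothing about how many bytes "apple" has.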
"Could an str[N] type, which would be a shorthand to a [u8; N] that only contains provably valid UTF-8 encoded strings, replace str without breaking lots of existing code?"
It wouldn't replace str, but it could be an interesting addition indeed! There are probably reasons why it doesn't exist yet, though, e.g. because the length in bytes of a Unicode string is usually not really relevant. In particular, it usually doesn't make sense to "take a Unicode string with exactly three bytes".
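For illustration only, such a type can be approximated in user code. The sketch below assumes a compiler with const generics (stable since Rust 1.51, so newer than the 1.32.0 from the question); FixedStr and its methods are hypothetical names, not part of the standard library.

```rust
use std::str;

/// Hypothetical fixed-size string: exactly N bytes of valid UTF-8.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FixedStr<const N: usize> {
    bytes: [u8; N],
}

impl<const N: usize> FixedStr<N> {
    /// Returns None unless `s` is exactly N bytes long.
    fn new(s: &str) -> Option<Self> {
        if s.len() != N {
            return None;
        }
        let mut bytes = [0u8; N];
        bytes.copy_from_slice(s.as_bytes());
        Some(FixedStr { bytes })
    }

    /// Views the stored bytes as a string slice.
    fn as_str(&self) -> &str {
        // The bytes were copied from a &str, so they are valid UTF-8.
        str::from_utf8(&self.bytes).expect("constructed from valid UTF-8")
    }
}

fn main() {
    // Sized, so it can live in a local variable, unlike a bare str.
    let apple: FixedStr<5> = FixedStr::new("apple").unwrap();
    assert_eq!(apple.as_str(), "apple");

    // "é" is one character but two bytes, so N counts bytes, not characters.
    assert!(FixedStr::<1>::new("é").is_none());
    assert!(FixedStr::<2>::new("é").is_some());
}
```

The last two assertions show why a byte-count parameter is awkward for text: the N in the type has no direct relation to the number of characters a reader sees.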
"[T] and str can't be stored on the stack, because they are both unsized"
While this is true today, it may not be true in the future. RFC 1909 introduces unsized rvalues. One of the powers that this feature would give is variable-length arrays:
The RFC also describes an extension to the array literal syntax: [e; dyn n]. In the syntax, n isn't necessarily a constant expression. The array is dynamically allocated on the stack
No mention is made of whether a string will be directly possible, but one could always create a stack-allocated array of bytes to be used as storage for a string.
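That last idea already works on stable Rust with a fixed-size buffer. A minimal sketch, where write_into is a hypothetical helper and the 16-byte capacity is an arbitrary choice:

```rust
use std::str;

/// Hypothetical helper: copies `s` into the front of `buf` and returns a
/// string slice over the written bytes, or None if `s` does not fit.
fn write_into<'a>(buf: &'a mut [u8], s: &str) -> Option<&'a str> {
    let n = s.len();
    if n > buf.len() {
        return None;
    }
    buf[..n].copy_from_slice(s.as_bytes());
    // The copied bytes came from a &str, so this validation succeeds.
    str::from_utf8(&buf[..n]).ok()
}

fn main() {
    // The storage is a local array, so it lives on the stack.
    let mut storage = [0u8; 16];
    let s = write_into(&mut storage, "apple").expect("fits in 16 bytes");
    assert_eq!(s, "apple");
    assert_eq!(s.len(), 5);
}
```

The string data lives in the stack array, and the &str is only a view into it, so no heap allocation is involved.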