Why does str primarily exist in its borrowed form? [duplicate]

Tags: string, types, reference, rust, borrowing

This is how the str type is used:

let hello = "Hello, world!";

// with an explicit type annotation
let hello: &'static str = "Hello, world!";

Writing let hello: str = "Hello, world!"; instead leads to the error expected `str`, found `&str`

Why is the default type of a string literal not just str, the way a value of any primitive type, a vector, or a String has a non-reference type? Why is it a reference?


asked Apr 12 '20 16:04

QurakNerd

2 Answers

The design decision that strings and slices are only accessible via references has many advantages:

1. Strings can have any length, so a variable of type str cannot easily be managed on the stack: its size is not known at compile time. A &str, by contrast, always has a fixed size on the stack: a pointer plus a length, while the variable-length data lives elsewhere (in the program binary for string literals, or on the heap for data owned by a String). Note that all other primitive types have a fixed size, as does every reference (regardless of the data it points to) and every struct composed of fixed-size fields.
2. &str is an immutable reference. If you could define variables of type str, you would have to give semantics to let mut s: str = "str";. An immutable string on the stack is hard to manage; a string that could be appended to is even harder.
3. An owned str would mean that every move has to copy all the bytes, which costs performance. Copying only the reference and leaving the referenced data where it is costs far less. An owned str would therefore not really be a zero-cost abstraction.
4. str is not the only type that normally appears only behind a reference: the same holds for slices such as &[i8]. Changing how strings are handled would make that other behavior look odd (or it would have to be changed accordingly).
5. Assume that a function could manage variables of type str, and that you want to return a &str from it. This cannot work, because a reference lives at most as long as the value it points to (try this with any primitive type). Since the str would be a locally created value, it cannot outlive the function. The convention that a string literal is always a reference to a static string resolves this problem. Without it, you would have to write additional code to put your owned str into a static variable before you could return a &str. Since a static reference is the behavior I usually need, it is quite convenient that I can write it with so little overhead.
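
Points 1 and 5 can be sketched roughly as follows. This is only an illustration: the function names greeting and shout are made up here and do not come from any library.

```rust
use std::mem::size_of_val;

// Hypothetical helper: a string literal is stored in the program binary
// and lives for the whole run, so a reference to it can be returned.
fn greeting() -> &'static str {
    "Hello, world!"
}

// Hypothetical helper: text built inside the function is returned as an
// owned `String`. Returning a `&str` that points at the local value
// would be rejected by the borrow checker.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

fn main() {
    let short: &str = "a";
    let long: &str = "a much longer string literal";

    // Both references occupy the same space, whatever the text length.
    assert_eq!(size_of_val(&short), size_of_val(&long));

    println!("{}", greeting());
    println!("{}", shout("hello"));
}
```

The second helper returns String rather than &str on purpose: the uppercased text is created inside the function, so the caller has to take ownership of it.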


answered Oct 16 '22 01:10

CoronA

I will try to give a different perspective. In Rust there is a general convention: if you have a variable of some type T, it means that you own the data associated with T. If you have a variable of type &T, then you don't own the data.

Now let's consider a heap-allocated string. According to this convention, there should be a non-reference type that represents ownership of the allocation. And indeed such a type exists: String.

There is also a different kind of string: &'static str. These strings are not owned by anyone: exactly one instance of the string is placed inside the compiled binary file, and only pointers are passed around. There is no allocation and no deallocation, hence no ownership. In a sense, static strings are owned by the compiler, not by the programmer. This is why String cannot be used to represent a static string.
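
As a small sketch of that convention (the helper names consume and inspect are invented for this example): passing a String hands over the allocation, while passing a &'static str only copies a pointer and a length.

```rust
// Hypothetical helper: takes ownership of the heap allocation.
fn consume(s: String) -> usize {
    s.len()
} // `s` is dropped here and its buffer is freed

// Hypothetical helper: only receives a pointer and a length.
fn inspect(s: &'static str) -> usize {
    s.len()
}

fn main() {
    let owned: String = String::from("hello, world!");
    let n = consume(owned);
    // `owned` was moved into `consume`; using it here would not compile.

    let literal: &'static str = "hello, world!";
    let a = inspect(literal);
    let b = inspect(literal); // still usable: shared references are `Copy`

    assert_eq!((n, a, b), (13, 13, 13));
}
```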

Alright, so why not use &String to represent a static string? Imagine a world where the following code is valid Rust:

let s: &'static String = "hello, world!";

This might look fine, but implementation-wise, this is suboptimal:

1. String itself holds a pointer to the actual data, so &String would basically be a pointer to a pointer. This violates the zero-cost abstraction principle: why introduce an extra level of indirection when the compiler statically knows the address of "hello, world!"?
2. Even if the compiler were somehow smart enough to decide that the extra pointer is not needed here (which would lead to a bunch of other problems), String itself still contains three pointer-sized fields (8 bytes each on a 64-bit target):
- Data pointer;
- Data length;
- Allocation capacity, which tells us how much space is allocated in total and therefore how much free space there is after the data.
However, when we are talking about static strings, capacity makes zero sense: static strings are read-only.

So, in the end, when the compiler sees &'static String, we actually want it to store only a data pointer and a length. Otherwise we are paying for what we will never use, which goes against the zero-cost abstraction principle. This looks like arcane wizardry that we want from the compiler: the variable type is &String, but the variable itself is anything but a reference to a String.
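
The size argument can be checked with a short sketch. The helper sizes is made up for this example, and the three-word size of String reflects current standard library builds rather than a documented layout guarantee.

```rust
use std::mem::size_of;

// Hypothetical helper: sizes in bytes of `String`, `&String` and `&str`.
fn sizes() -> (usize, usize, usize) {
    (size_of::<String>(), size_of::<&String>(), size_of::<&str>())
}

fn main() {
    let word = size_of::<usize>();
    let (string, ref_string, ref_str) = sizes();

    // Pointer, length, capacity. This matches current standard library
    // builds, but the exact layout of `String` is not a documented guarantee.
    assert_eq!(string, 3 * word);
    // A thin pointer to the `String` struct, which then points to the data.
    assert_eq!(ref_string, word);
    // A fat pointer: data pointer plus length, no capacity.
    assert_eq!(ref_str, 2 * word);

    // Capacity only matters for a growable buffer.
    let mut buf = String::with_capacity(32);
    buf.push_str("hello");
    assert_eq!(buf.len(), 5);
    assert!(buf.capacity() >= 32);
}
```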

To make this work, we actually need a different type, not &String, that only holds a data pointer and length. And here it is: &str! It is better than &String in a number of ways:

1. Does not have an excessive level of indirection - only one pointer;
2. Does not store capacity, which would be meaningless in many contexts;
3. No black magic: we define str as a variable-sized type (the data itself), so &str is just a reference to the data.

Now you might wonder: why not introduce str instead of &str? Remember the convention: having str would imply that you own the data, which you don't. Hence &str.
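
A rough sketch of str as the variable-sized data itself, with made-up helper names byte_len and to_owned_str. It also shows, as an aside that goes slightly beyond the answer above, that an owned str can exist as long as it sits behind a pointer such as Box.

```rust
use std::mem::size_of_val;

// Hypothetical helper: size in bytes of the `str` value itself.
fn byte_len(s: &str) -> usize {
    size_of_val(s)
}

// Hypothetical helper: copies the text into a heap allocation that is owned.
fn to_owned_str(s: &str) -> Box<str> {
    Box::from(s)
}

fn main() {
    // The size of a `str` is only known at run time, via the length
    // stored in the reference.
    assert_eq!(byte_len("hello"), 5);
    assert_eq!(byte_len("hello, world!"), 13);

    // A `&String` converts to `&str` through deref coercion.
    let owned = String::from("hello");
    let borrowed: &str = &owned;
    assert_eq!(borrowed, "hello");

    // An owned `str` does exist, but only behind a pointer such as `Box`.
    let boxed: Box<str> = to_owned_str("hello");
    assert_eq!(&*boxed, "hello");
}
```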

answered Oct 16 '22 03:10

kreo
