The Rust Guide states that:
The semicolon turns any expression into a statement by throwing away its value and returning unit instead.
I thought I got this concept down until I ran an experiment:
fn print_number(x: i32, y: i32) -> i32 {
    if x + y > 20 { return x }
    x + y
}
This compiles fine. Then I added a semicolon at the end of the return line (return x;). From what I understand, this turns the line into a statement, returning the unit data type ().
Nonetheless, the end result is the same.
Normally, every branch in the if expression should have the same type. If the type of some branch is underspecified, the compiler tries to find a single common type:
fn print_number(x: int, y: int) {
    let v = if x + y > 20 {
        3 // this can be either 3u, 3i, 3u8 etc.
    } else {
        x + y // this is always int
    };
    println!("{}", v);
}
In this code, 3 is underspecified, but the else branch forces it to have the type int.
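The same unification can be observed in current Rust, where int no longer exists and fixed-width integers are used instead. The following is a minimal sketch under that assumption; the function name pick and the choice of i64 are illustrative and not part of the original example.

```rust
use std::mem::size_of_val;

// Hypothetical helper: the literal `3` carries no suffix, so its type is
// inferred from the `else` branch, which is `i64`.
fn pick(x: i64, y: i64) -> i64 {
    let v = if x + y > 20 {
        3 // underspecified literal, unified with the other branch
    } else {
        x + y // always i64
    };
    // Both branches ended up as i64, so `v` occupies 8 bytes.
    assert_eq!(size_of_val(&v), 8);
    v
}

fn main() {
    println!("{}", pick(15, 10));
    println!("{}", pick(2, 5));
}
```

Changing the parameter types to i16 would make the same literal an i16, since the else branch is the only source of type information for it.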
This sounds simple: there is a function that "unifies" two or more types into a common type, or gives you an error when that is not possible. But what if there were a fail! in the branch?
fn print_number(x: int, y: int) {
    let v = if x + y > 20 {
        fail!("x + y too large") // ???
    } else {
        x + y // this is always int
    };
    println!("{}", v); // uh wait, what's the type of `v`?
}
I'd want fail! not to affect the other branches; it is an exceptional case after all. Since this pattern is quite common in Rust, the concept of a diverging type has been introduced. There is no value whose type is diverging. (It is also called an "uninhabited type" or "void type" depending on the context. Not to be confused with the "unit type", which has a single value, ().) Since the diverging type is naturally a subset of any other type, the compiler concludes that v's type is just that of the else branch, int.
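For readers on a current compiler, here is a rough equivalent of the snippet above. It assumes today's syntax, in which fail! has been renamed to panic! and the diverging type is written !; the function name checked_sum and the use of i32 in place of int are illustrative choices.

```rust
// Sketch in current Rust syntax: `panic!` stands in for the older `fail!`,
// and `i32` stands in for `int`. The name `checked_sum` is hypothetical.
fn checked_sum(x: i32, y: i32) -> i32 {
    let v = if x + y > 20 {
        panic!("x + y too large") // diverging: contributes no type of its own
    } else {
        x + y // i32, so `v` is i32
    };
    v
}

fn main() {
    println!("{}", checked_sum(3, 4));
}
```

The diverging branch places no constraint on v, so the else branch alone decides its type.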
A return expression is no different from fail! for the purpose of type checking. It abruptly escapes from the current flow of execution just like fail! (but does not terminate the task, thankfully). Still, the diverging type does not propagate to the next statement:
fn print_number(x: int, y: int) {
    let v = if x + y > 20 {
        return; // this is diverging
        () // this is implied, even when you omit it
    } else {
        x + y // this is always int
    };
    println!("{}", v); // again, what's the type of `v`?
}
Note that the sole semicoloned statement x; is equivalent to the expression x; (). Normally a; b has the same type as b, so it would be quite strange for x; () to have the type () only when x is not diverging, and to diverge when x does diverge. That's why your original code didn't work.
It is tempting to add special cases like these:
Why not make x; () diverging when x diverges?
Why not assume uint for every underspecified integer literal when its type cannot be inferred? (Note: this was the case in the past.)
The truth is that designing a type system is not very hard, but verifying it is much harder, and we want to ensure that Rust's type system is future-proof and long-standing. Some of these may happen if they turn out to be really useful and are proved "correct" for our purposes, but not immediately.
I'm not 100% sure of what I'm saying but it kinda makes sense.
There's another concept coming into play: reachability analysis. The compiler knows that whatever follows a return expression statement is unreachable. For example, if we compile this function:
fn test() -> i32 {
    return 1;
    2
}
We get the following warning:
warning: unreachable expression
 --> src/main.rs:3:5
  |
3 |     2
  |     ^
  |
The compiler can ignore the "true" branch of the if expression if it ends with a return expression, and only consider the "false" branch when determining the type of the if expression.
You can also see this behavior with diverging functions. Diverging functions are functions that don't return normally (e.g. they always fail). Try replacing the return expression with the fail! macro (which expands to a call to a diverging function). In fact, return expressions are also considered to be diverging; this is the basis of the aforementioned reachability analysis.
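As a small illustration of the diverging-function case, here is a sketch in current Rust syntax. The helper too_large is hypothetical, and panic! is used as the present-day name for fail!.

```rust
// Hypothetical diverging function: the `!` return type says it never returns.
fn too_large(sum: i32) -> ! {
    panic!("sum {} is too large", sum)
}

fn print_number(x: i32, y: i32) -> i32 {
    if x + y > 20 {
        too_large(x + y); // a statement, yet the branch still diverges
    } else {
        x + y
    }
}

fn main() {
    println!("{}", print_number(3, 4));
}
```

The "true" branch ends with a semicolon and has no trailing expression, but the compiler still accepts the function because nothing after the call is reachable.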
However, if there's an actual () expression after the return statement, you'll get an error. This function:
fn print_number(x: i32, y: i32) -> i32 {
    if x + y > 20 {
        return x;
        ()
    } else {
        x + y
    }
}
gives the following error:
error[E0308]: mismatched types
 --> src/main.rs:4:9
  |
4 |         ()
  |         ^^ expected i32, found ()
  |
  = note: expected type `i32`
             found type `()`
In the end, it seems diverging expressions (which include return expressions) are handled differently by the compiler when they are followed by a semicolon: the statement is still diverging.
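To make that conclusion concrete, the two variants can be put side by side. This is only a sketch; the function names without_semicolon and with_semicolon are made up for the comparison.

```rust
// `return x` as the trailing expression of the branch.
fn without_semicolon(x: i32, y: i32) -> i32 {
    if x + y > 20 {
        return x
    } else {
        x + y
    }
}

// `return x;` as a statement: the branch has no trailing expression,
// but it still diverges, so the `if` takes its type from the `else` branch.
fn with_semicolon(x: i32, y: i32) -> i32 {
    if x + y > 20 {
        return x;
    } else {
        x + y
    }
}

fn main() {
    println!("{}", without_semicolon(15, 10));
    println!("{}", with_semicolon(15, 10));
}
```

Both functions compile and behave identically, which matches what the question observed.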