
What does "Stream did not contain valid UTF-8" mean?

Tags: stream, utf-8, rust

I'm creating a simple HTTP server. I need to read the requested image and send it to the browser. I'm using this code:

fn read_file(mut file_name: String) -> String {
    file_name = file_name.replace("/", "");
    if file_name.is_empty() {
        file_name = String::from("index.html");
    }

    let path = Path::new(&file_name);
    if !path.exists() {
        return String::from("Not Found!");
    }
    let mut file_content = String::new();
    let mut file = File::open(&file_name).expect("Unable to open file");
    let res = match file.read_to_string(&mut file_content) {
        Ok(content) => content,
        Err(why) => panic!("{}",why),
    };

    return file_content;
}

This works if the requested file is text based, but when I want to read an image I get the following message:

stream did not contain valid UTF-8

What does it mean, and how do I fix it?

Saeed M. asked Mar 30 '17 15:03



1 Answer

The documentation for String describes it as:

A UTF-8 encoded, growable string.

The Wikipedia definition of UTF-8 will give you a great deal of background on what that is. The short version is that computers use a unit called a byte to represent data. Unfortunately, these blobs of data represented with bytes have no intrinsic meaning; that has to be provided from outside. UTF-8 is one way of interpreting a sequence of bytes, as are file formats like JPEG.

UTF-8, like most text encodings, has specific requirements and sequences of bytes that are valid and invalid. Whatever image you have tried to load contains a sequence of bytes that cannot be interpreted as a UTF-8 string; this is what the error message is telling you.
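As a rough illustration of that failure, the sketch below checks a byte slice with `std::str::from_utf8` and reports where validation stops. The helper name `first_invalid_offset` is made up for this example. JPEG files begin with the bytes `FF D8 FF`, and `0xFF` can never appear in valid UTF-8, so an image like that is rejected at its very first byte.

```rust
use std::str;

/// Hypothetical helper: returns the offset of the first byte that breaks
/// UTF-8 validation, or `None` if the whole slice is valid UTF-8.
fn first_invalid_offset(bytes: &[u8]) -> Option<usize> {
    match str::from_utf8(bytes) {
        Ok(_) => None,
        // `valid_up_to` is the length of the valid prefix, which is also
        // the index where validation failed.
        Err(e) => Some(e.valid_up_to()),
    }
}
```

Calling it on `b"hello"` gives `None`, while calling it on the first bytes of a JPEG (`[0xFF, 0xD8, 0xFF, 0xE0]`) gives `Some(0)`. `read_to_string` performs the same kind of check and reports a failure as the error from the question.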


To fix it, you should not use a String to hold arbitrary collections of bytes. In Rust, that's better represented by a Vec<u8>:

use std::fs::File;
use std::io::Read;
use std::path::Path;

fn read_file(mut file_name: String) -> Vec<u8> {
    // Strip slashes so the request path maps to a file in the current directory.
    file_name = file_name.replace("/", "");
    if file_name.is_empty() {
        file_name = String::from("index.html");
    }

    let path = Path::new(&file_name);
    if !path.exists() {
        return String::from("Not Found!").into();
    }
    // Read raw bytes; unlike read_to_string, read_to_end does no UTF-8 validation.
    let mut file_content = Vec::new();
    let mut file = File::open(&file_name).expect("Unable to open file");
    file.read_to_end(&mut file_content).expect("Unable to read");
    file_content
}
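Once the file contents are plain bytes, sending them to the browser could look roughly like the sketch below. The helper `write_response` and its `content_type` parameter are assumptions made for illustration; they are not part of the question's server. The headers are text and can be formatted as a string, while the body goes out byte for byte.

```rust
use std::io::{self, Write};

/// Hypothetical helper: writes a minimal HTTP/1.1 response whose body is
/// raw bytes. The caller supplies the content type (e.g. "image/jpeg").
fn write_response<W: Write>(out: &mut W, content_type: &str, body: &[u8]) -> io::Result<()> {
    // The status line and headers are ASCII text.
    write!(
        out,
        "HTTP/1.1 200 OK\r\nContent-Type: {}\r\nContent-Length: {}\r\n\r\n",
        content_type,
        body.len()
    )?;
    // The body is written unchanged, with no UTF-8 interpretation.
    out.write_all(body)?;
    out.flush()
}
```

Because the function is generic over `Write`, it should accept a `TcpStream` in a server as well as a `Vec<u8>` in a test.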

To evangelize a bit, this is part of what makes Rust a nice language. Because there is a type that represents "a sequence of bytes that is guaranteed to be valid UTF-8", we can write safer programs, since we know this invariant always holds. We don't have to keep checking throughout our program to "make sure" it's still a string.
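One way to picture that invariant is to validate once, at the boundary where bytes enter the program, and let the types carry the result from then on. The `Contents` enum and `classify` function below are hypothetical names used only for this sketch.

```rust
/// Hypothetical type: what a buffer turned out to be after one validation step.
enum Contents {
    /// Guaranteed valid UTF-8 by the `String` type itself.
    Text(String),
    /// Anything else, kept as raw bytes.
    Binary(Vec<u8>),
}

/// Checks the bytes exactly once. Code that later receives `Contents::Text`
/// never needs to re-validate the string.
fn classify(bytes: Vec<u8>) -> Contents {
    match String::from_utf8(bytes) {
        Ok(text) => Contents::Text(text),
        // `into_bytes` hands back the original buffer without copying it.
        Err(e) => Contents::Binary(e.into_bytes()),
    }
}
```

A function that takes a `String` or `&str` can then rely on the contents being valid UTF-8, while image data stays in the `Binary` variant.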

Shepmaster answered Nov 21 '22 21:11
