
R: reading in .csv file removes leading zeros

I realize that reading a .csv file removes the leading zeros, but for some of my files, it maintains the leading zeros without my having to explicitly set colClasses in read.csv. On the other hand, what's confusing me is in other cases, it DOES remove the leading zeros. So my question is: in which cases does read.csv remove the leading zeros?

user3755880 asked Jul 14 '15 15:07


People also ask

Why do leading zeros disappear in CSV?

This is actually an Excel issue. The program automatically truncates all leading zeros from numbers in CSV files. The key is to change at least the columns where the leading zeros occur (i.e. ORG or Fund numbers) to "text." There are two ways to accomplish this.

How do you keep the leading zero in R?

If all the data in the column are of the same length, you can do paste0("0", NAME) . If variable length, try formatC like so: formatC(NAME, width = 2, format = "d", flag = "0") .
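If the zeros are already gone, the padding described above can be put back after the fact. The following is a minimal sketch, assuming a hypothetical integer vector `ids` and a target width of 3; both are made up for illustration.

```r
# Hypothetical values whose leading zeros were lost on import.
ids <- c(1L, 42L, 100L)

# formatC pads each value with zeros up to the requested width.
padded_formatc <- formatC(ids, width = 3, format = "d", flag = "0")

# sprintf gives the same result with a C-style format string.
padded_sprintf <- sprintf("%03d", ids)

print(padded_formatc)
print(padded_sprintf)
```

Both calls return character vectors, so the column stays text from that point on.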


1 Answer

read.csv, read.table, and related functions first read everything in as character strings. Then, depending on the arguments to the function (specifically colClasses, but also others) and on options, the function tries to "simplify" the columns. If enough of a column looks numeric and you have not told the function otherwise, it converts the column to numeric, which drops any leading 0's (and trailing 0's after the decimal point). If something in the column does not look like a number, the column is not converted to numeric; it is either kept as character or converted to a factor, and either way the leading 0's are kept. The function does not always look at the entire column to make the decision, so a column that is obviously non-numeric to you may still be converted.
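A small sketch of the two cases described above, using in-memory CSV text rather than real files. The column name `id` and the values are invented for illustration; `stringsAsFactors = FALSE` is set explicitly so the result is the same regardless of the R version's default.

```r
# Every value looks numeric, so the column is converted and the zeros are lost.
all_digits <- read.csv(text = "id\n001\n042\n100")

# One value ("A10") is not a number, so the column stays character.
has_letter <- read.csv(text = "id\n001\n042\nA10", stringsAsFactors = FALSE)

str(all_digits$id)  # an integer vector: 1, 42, 100
str(has_letter$id)  # a character vector with the leading zeros intact
```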

The safest approach (and quickest) is to specify colClasses so that R does not need to guess (and you do not need to guess what R is going to guess).
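For example, a minimal sketch of specifying colClasses, assuming a hypothetical file with an `id` column that should keep its zeros and an `amount` column that should be numeric (in-memory text stands in for the file here):

```r
# Stand-in for a CSV file; column names and values are made up.
csv_text <- "id,amount\n001,1.50\n042,2.00"

# A named colClasses vector sets the type per column, so R does not guess.
df <- read.csv(text = csv_text,
               colClasses = c(id = "character", amount = "numeric"))

print(df$id)      # leading zeros preserved
print(df$amount)  # still numeric
```

If every column should be left as text, `colClasses = "character"` applies to all of them at once.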

Greg Snow answered Sep 22 '22 22:09