Suppose I have the following data frame named DF. I would like to convert all the values in the Revenue column to the same unit.
Brands Revenue
A 50.1 bn
B 41.2 bn
C 32.5 Mn
D 15.1 bn
Please note that bn and Mn are part of the values stored in the Revenue column.
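For anyone who wants to try the answers, the example data can be rebuilt roughly as follows. This is a sketch that assumes Revenue is stored as character strings rather than a factor, which the question does not state explicitly.

```r
# Rebuild the example data frame from the question.
# stringsAsFactors = FALSE keeps Revenue as plain character strings
# (an assumption; the question does not say how the column is stored).
DF <- data.frame(
  Brands  = c("A", "B", "C", "D"),
  Revenue = c("50.1 bn", "41.2 bn", "32.5 Mn", "15.1 bn"),
  stringsAsFactors = FALSE
)

# Inspect the column types before converting anything
str(DF)
```

If Revenue turns out to be a factor in your data, wrapping it in as.character() first should make the answers behave the same way.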
1 billion = 1,000 million. So, to convert a number of millions into billions, multiply it by 0.001 (equivalently, divide by 1,000); to convert billions into millions, multiply by 1,000.
One billion equals 1,000,000,000, i.e. one thousand million; on the short scale this is written as 10^9 (ten to the ninth power).
For example, 84 billion is 84,000 million.
One billion can be abbreviated as b or bn. In the Indian numeral system, 1 billion is ten thousand lakhs, or equivalently 100 crores, i.e. 1 bn (1 b) = 1,000,000,000.
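The arithmetic above can be checked directly in R. The variable names here (million, billion, mn_per_bn) are made up for illustration.

```r
# Short-scale definitions
million <- 1e6
billion <- 1e9

# Number of millions in one billion
mn_per_bn <- billion / million

# 84 bn expressed in millions
bn_to_mn <- 84 * mn_per_bn

# 32.5 Mn expressed in billions
mn_to_bn <- 32.5 / mn_per_bn

print(c(mn_per_bn = mn_per_bn, bn_to_mn = bn_to_mn, mn_to_bn = mn_to_bn))
```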
One idea,
new <- ifelse(gsub('.*\\s+', '', DF$Revenue) == 'bn',
as.numeric(gsub('[A-Za-z]', '', DF$Revenue))*1000, DF$Revenue)
new[!grepl('Mn', new)] <- paste(new[!grepl('Mn', new)], 'Mn', sep = ' ')
DF$Revenue <- new
DF
# Brands Revenue
#1 A 50100 Mn
#2 B 41200 Mn
#3 C 32.5 Mn
#4 D 15100 Mn
To do the opposite then,
new <- ifelse(gsub('.*\\s+', '', DF$Revenue) == 'Mn',
as.numeric(gsub('[A-Za-z]', '', DF$Revenue))/1000, DF$Revenue)
new[!grepl('bn', new)] <- paste(new[!grepl('bn', new)], 'bn', sep = ' ')
DF$Revenue <- new
DF
# Brands Revenue
#1 A 50.1 bn
#2 B 41.2 bn
#3 C 0.0325 bn
#4 D 15.1 bn
Another method: separate the monetary value from the text using strsplit:
# split value and "level" in a list
temp <- strsplit(df$Revenue, split = " ")
# add separately to data.frame
df$Revenue <- sapply(temp, function(i) as.numeric(i[[1]]))
df$level <- sapply(temp, "[", 2)
df
Brands Revenue level
1 A 50.1 bn
2 B 41.2 bn
3 C 32.5 Mn
4 D 15.1 bn
Now convert to millions by subsetting the rows whose level is "bn":
df$Revenue[df$level == "bn"] <- df$Revenue[df$level == "bn"] * 1000
df$level <- "Mn"
This results in
df
Brands Revenue level
1 A 50100.0 Mn
2 B 41200.0 Mn
3 C 32.5 Mn
4 D 15100.0 Mn
To convert to billions instead (a similar procedure):
df$Revenue[df$level == "Mn"] <- df$Revenue[df$level == "Mn"] / 1000
df$level <- "bn"
This results in
df
Brands Revenue level
1 A 50.1000 bn
2 B 41.2000 bn
3 C 0.0325 bn
4 D 15.1000 bn
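A possible variation on the same idea, offered as a sketch: base R's read.table can do the splitting and the numeric conversion in one step when given the strings through its text argument. The names parts, amount, unit and Revenue_mn are hypothetical, and this assumes every value is exactly one number followed by one unit.

```r
df <- data.frame(
  Brands  = c("A", "B", "C", "D"),
  Revenue = c("50.1 bn", "41.2 bn", "32.5 Mn", "15.1 bn"),
  stringsAsFactors = FALSE
)

# Parse "number unit" strings into a numeric and a character column
parts <- read.table(text = df$Revenue,
                    col.names = c("amount", "unit"),
                    stringsAsFactors = FALSE)

df$Revenue <- parts$amount
df$level   <- parts$unit

# With a numeric column, everything can be expressed in millions
# and then summed or sorted like any other number
df$Revenue_mn <- ifelse(df$level == "bn", df$Revenue * 1000, df$Revenue)
total_mn <- sum(df$Revenue_mn)
df[order(-df$Revenue_mn), ]
```

Keeping the amount numeric and the unit in its own column means the values stay usable for sums, sorting and plotting, instead of being pasted back into text.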
To perhaps simplify the parsing compared with the previous solutions, I am using the awesome stringr library:
library(stringr)
dd$units <- word(dd$Revenue, 2, sep = " ")
dd$amounts <- word(dd$Revenue, 1, sep = " ")
# The following lines create an extra column in the dataframe,
# You can overwrite the original column if you so wish.
# Convert to billions
dd$convert_to_bn <- paste(as.numeric(dd$amounts) * ifelse(dd$units == "bn", 1 , 0.001), "bn")
# Convert to millions
dd$convert_to_mn <- paste(as.numeric(dd$amounts) * ifelse(dd$units == "Mn", 1 , 1000), "Mn")
This is a solution that replaces the "units" by appropriate factors and evaluates the resulting calculations.
The first step is to replace "bn" and "Mn" by a factor:
conversion <- c(Mn = 1/1000, bn = 1)
for (unit in names(conversion)) {
df$Revenue <- gsub(unit, paste0("*", conversion[unit]), df$Revenue)
}
df
## Brands Revenue
## 1 A 50.1 *1
## 2 B 41.2 *1
## 3 C 32.5 *0.001
## 4 D 15.1 *1
Then evaluate the expressions in Revenue and append "bn" again:
df$Revenue <- sapply(df$Revenue, function(x) eval(parse(text = x)))
df$Revenue <- paste(df$Revenue, "bn")
df
## Brands Revenue
## 1 A 50.1 bn
## 2 B 41.2 bn
## 3 C 0.0325 bn
## 4 D 15.1 bn
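Since eval(parse()) will run whatever code happens to be in the strings, a more defensive variant is to index the factor vector by unit and multiply directly. This is a rough sketch rather than part of the answer above; x, to_bn, amount, unit and result are illustrative names, and it assumes each value is a number, whitespace, then a unit.

```r
x <- c("50.1 bn", "41.2 bn", "32.5 Mn", "15.1 bn")

# Factor that converts each unit to billions
to_bn <- c(Mn = 1 / 1000, bn = 1)

# Everything before the first whitespace is the amount,
# everything after it is the unit
amount <- as.numeric(sub("\\s.*$", "", x))
unit   <- sub("^\\S+\\s+", "", x)

# Fail early on units the lookup does not know about
stopifnot(unit %in% names(to_bn))

# Indexing by name picks the right factor for each row
value_bn <- unname(amount * to_bn[unit])
result   <- paste(value_bn, "bn")
result
```

Adding another unit would then only require a new entry in to_bn.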