I am desperately trying to fill a matrix with values from a data frame. It is trade data, so the data frame looks something like this:
     country1 country2 value
1 Afghanistan  Albania    30
2 Afghanistan  Albania    81
3 Afghanistan    China     5
4     Albania  Germany     6
5       China  Germany     8
6       China   Turkey   900
7     Germany   Turkey    12
8     Germany      USA     3
9     Germany   Zambia   700
Using the unique and sort commands I have created a list of all countries that occur in the df (and converted it to a matrix):
countries_sorted
[1,] "Afghanistan"
[2,] "Albania"
[3,] "China"
[4,] "Germany"
[5,] "Turkey"
[6,] "USA"
[7,] "Zambia"
Using this "list", I have created an empty trade matrix (7x7):
            Afghanistan Albania China Germany Turkey USA Zambia
Afghanistan          NA      NA    NA      NA     NA  NA     NA
Albania              NA      NA    NA      NA     NA  NA     NA
China                NA      NA    NA      NA     NA  NA     NA
Germany              NA      NA    NA      NA     NA  NA     NA
Turkey               NA      NA    NA      NA     NA  NA     NA
USA                  NA      NA    NA      NA     NA  NA     NA
Zambia               NA      NA    NA      NA     NA  NA     NA
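For reference, a minimal sketch of how the sorted country list and the empty 7x7 matrix described above could be built. The inline df simply re-types the sample rows, and the name trade is made up for illustration.

```r
# Sample data re-typed from the table above
df <- data.frame(
  country1 = c("Afghanistan", "Afghanistan", "Afghanistan", "Albania",
               "China", "China", "Germany", "Germany", "Germany"),
  country2 = c("Albania", "Albania", "China", "Germany",
               "Germany", "Turkey", "Turkey", "USA", "Zambia"),
  value = c(30, 81, 5, 6, 8, 900, 12, 3, 700),
  stringsAsFactors = FALSE
)

# All countries that occur in either column, sorted alphabetically
countries_sorted <- sort(unique(c(df$country1, df$country2)))

# Empty (all-NA) square matrix with the countries as row and column names
# ("trade" is a hypothetical name, not from the original post)
trade <- matrix(NA_real_,
                nrow = length(countries_sorted),
                ncol = length(countries_sorted),
                dimnames = list(countries_sorted, countries_sorted))
```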
I am now hopelessly failing to fill this matrix with the numbers/sums from the value column of df. I have tried something like this:
a<-cast(df, country1~country2 , sum)
which works to a degree BUT the matrix does not retain its original 7x7 format, which is what I need to have a matrix where the diagonal is all 0s.
> a
     country1 Albania China Germany Turkey USA Zambia
1 Afghanistan     111     5       0      0   0      0
2     Albania       0     0       6      0   0      0
3       China       0     0       8    900   0      0
4     Germany       0     0       0     12   3    700
Please, anyone with a solution????
There are three ways of creating an empty matrix: specifying both the number of rows and the number of columns, specifying only the number of rows, or specifying only the number of columns.
Data frames with matrix columns are a very useful solution to this situation. The posterior stays in a matrix that has the same number of rows as the data frame. That matrix is recognized as only a single "column" in the data frame, however, and referring to that column using df$mat returns the matrix.
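A small sketch of a matrix column, assuming a toy data frame d with made-up contents; the column name mat follows the df$mat wording above.

```r
# Toy data frame with three rows (hypothetical example data)
d <- data.frame(id = 1:3)

# Assign a 3 x 2 matrix as one column; its row count must match nrow(d)
d$mat <- matrix(1:6, nrow = 3)

n_columns <- ncol(d)            # the whole matrix counts as one column
still_matrix <- is.matrix(d$mat)  # d$mat gives back the matrix itself
mat_dims <- dim(d$mat)
```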
A data frame contains a collection of "things" (rows), each with a set of properties (columns) of different types. Actually, this data is better thought of as a matrix. In a data frame the columns can contain different types of data, but in a matrix all the elements are the same type of data.
We use the rbind() function to add a row to an existing matrix. To learn more about rbind() in R, type ?rbind or help(rbind) in RStudio to open its help page.
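A minimal sketch of adding a row with rbind(), using a made-up 2 x 3 matrix; the names m, m2 and the row labels are only for illustration.

```r
# A 2 x 3 matrix with named rows (hypothetical example data)
m <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), NULL))

# Append a third row; the argument name "c" becomes the new row name
m2 <- rbind(m, c = c(7L, 8L, 9L))
```

The new row must have as many elements as the matrix has columns; shorter vectors are recycled, which is easy to miss.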
Starting with these 2 data sets:
#your data.frame (read from the table copied to the clipboard)
df <- read.table(header = TRUE, file = 'clipboard', stringsAsFactors = FALSE)
#the list of unique countries
countries <- unique(c(df$country1,df$country2))
You could do:
#create all the country combinations
newdf <- expand.grid(countries, countries)
#change names
colnames(newdf) <- c('country1', 'country2')
#add a value of 0 for the new combinations (won't affect outcome)
newdf$value <- 0
#row bind with original dataset
df2 <- rbind(df, newdf)
#and create the table using xtabs:
#the aggregate function will create the sum of the value for each combination
> xtabs(value ~ country1 + country2, aggregate(value~country1+country2,df2,sum))
             country2
country1      Afghanistan Albania China Germany Turkey USA Zambia
  Afghanistan           0     111     5       0      0   0      0
  Albania               0       0     0       6      0   0      0
  China                 0       0     0       8    900   0      0
  Germany               0       0     0       0     12   3    700
  Turkey                0       0     0       0      0   0      0
  USA                   0       0     0       0      0   0      0
  Zambia                0       0     0       0      0   0      0
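A possible alternative, sketched here under the assumption that the sample data looks as in the question: turn both country columns into factors that share the full set of countries as levels. xtabs keeps unused factor levels by default, so the result should stay square without adding the zero-valued combinations. The inline df and the name trade are illustrative only.

```r
# Sample data re-typed from the question
df <- data.frame(
  country1 = c("Afghanistan", "Afghanistan", "Afghanistan", "Albania",
               "China", "China", "Germany", "Germany", "Germany"),
  country2 = c("Albania", "Albania", "China", "Germany",
               "Germany", "Turkey", "Turkey", "USA", "Zambia"),
  value = c(30, 81, 5, 6, 8, 900, 12, 3, 700),
  stringsAsFactors = FALSE
)

# Full, sorted set of countries from both columns
countries <- sort(unique(c(df$country1, df$country2)))

# Give both columns the same factor levels so every country
# appears as a row and as a column
df$country1 <- factor(df$country1, levels = countries)
df$country2 <- factor(df$country2, levels = countries)

# xtabs sums value per combination; missing combinations become 0
trade <- xtabs(value ~ country1 + country2, data = df)
```

The result is an xtabs table rather than a plain matrix, but it can be indexed by name in the same way, for example trade["Afghanistan", "Albania"].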