mlogit gives error: the two indexes don't define unique observations

Tags: r, mlogit

My data frame, named longData, looks like this:

  ID Set Choice Apple Microsoft IBM Google Intel HewlettPackard Sony Dell Yahoo Nokia
1  1   1      0     1         0   0      0     0              0    0    0     0     0
2  1   2      0     0         1   0      0     0              0    0    0     0     0
3  1   3      0     0         0   1      0     0              0    0    0     0     0
4  1   4      1     0         0   0      1     0              0    0    0     0     0
5  1   5      0     0         0   0      0     0              0    0    0     0     1
6  1   6      0    -1         0   0      0     0              0    0    0     0     0

I am trying to run mlogit on it with:

logitModel = mlogit(Choice ~ Apple+Microsoft+IBM+Google+Intel+HewlettPackard+Sony+Dell+Yahoo+Nokia | 0, data = longData, shape = "long")

It gives the following error:

Error in dfidx::dfidx(data = data, dfa$idx, drop.index = dfa$drop.index,  : 
  the two indexes don't define unique observations

After looking for some time, I found that this error is raised by dfidx, as seen here:

z <- data[, c(posid1[1], posid2[1])]
if (nrow(z) != nrow(unique(z)))
    stop("the two indexes don't define unique observations")

But the following call runs without the error and reports the two indexes that should uniquely identify each row of the data frame:

dfidx(longData)$idx

This gives the expected output:

~~~ indexes ~~~~
   ID Set
1   1   1
2   1   2
3   1   3
4   1   4
5   1   5
6   1   6
7   1   7
8   1   8
9   1   9
10  1  10
indexes:  1, 2 

So what am I doing wrong? I looked at some related questions (1, 2) but couldn't find what I am missing.

Ankit Agrawal asked Nov 15 '22 06:11


1 Answer

It looks like your example comes from here: https://docs.displayr.com/wiki/MaxDiff_Analysis_Case_Study_Using_R

The code there seems outdated: I remember it working for me, but it no longer does.

The error message is valid: every (ID, Set) pair appears several times, once for each alternative, so those two columns alone do not identify a row.
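
The duplication can be checked directly in base R. The sketch below uses a small made-up data frame (`toy`, one respondent with two sets of three alternatives; it is not the asker's data) and applies the same comparison as the dfidx check quoted in the question.

```r
# Hypothetical toy data: 1 respondent, 2 sets, 3 alternatives per set
toy <- data.frame(
  ID  = rep(1, 6),
  Set = rep(1:2, each = 3)
)

# Same comparison as the dfidx check quoted in the question
z <- toy[, c("ID", "Set")]
has_duplicates <- nrow(z) != nrow(unique(z))
has_duplicates

# Count how often each (ID, Set) pair occurs
pair_counts <- as.data.frame(table(z))
pair_counts
```

If `has_duplicates` is TRUE for the first two columns of your own data, a third variable identifying the alternative is needed before mlogit can index the rows.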

However, this works:

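library(mlogit)

# Choice must be logical; otherwise there will be a complaint that it can't be coerced to logical
longData$Choice <- as.logical(longData$Choice)
# number of alternatives per set (5 in this example)
nAltsPerSet <- 5
# create the alternative number within each set: 1, 2, ..., nAltsPerSet, repeating
longData$Alternative <- 1 + (0:(nrow(longData) - 1)) %% nAltsPerSet
# define the dataset, naming the choice, alternative and individual variables
mdata <- mlogit.data(data = longData, shape = "long", choice = "Choice",
                     alt.var = "Alternative", id.var = "ID")
# fit the model (Apple is left out of the formula here, presumably as the reference brand)
logitModel <- mlogit(Choice ~ Microsoft + IBM + Google + Intel + HewlettPackard + Sony + Dell + Yahoo + Nokia | 0,
                     data = mdata)

summary(logitModel)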
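
To see what the `Alternative` line computes, here is a minimal base-R sketch on a made-up frame with two sets of five rows (`toy` and its values are illustrative only). It assumes, as the code above does, that the rows are ordered so that the alternatives of each set are contiguous.

```r
nAltsPerSet <- 5  # as in the example above

# Hypothetical toy data: 1 respondent, 2 sets, 5 rows per set
toy <- data.frame(
  ID  = rep(1, 10),
  Set = rep(1:2, each = nAltsPerSet)
)

# Position of each row within its set: 1, 2, ..., nAltsPerSet, repeating
toy$Alternative <- 1 + (0:(nrow(toy) - 1)) %% nAltsPerSet
toy$Alternative

# With the alternative number added, each row is uniquely identified
keys <- toy[, c("ID", "Set", "Alternative")]
all_unique <- nrow(keys) == nrow(unique(keys))
all_unique
```

If the rows were not sorted this way, the modulo trick would mislabel alternatives, so it may be worth sorting by ID and Set first.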

Maciej Witkowiak answered Jun 01 '23 12:06