 

Inter-annotator agreement when annotators assign more than one category per subject

I want to find the inter-annotator agreement for a few annotators. Each annotator assigns a few categories (out of 10 categories) to each subject.

For example, there are 3 annotators, 10 categories, and 100 subjects.

I am aware of http://en.wikipedia.org/wiki/Cohen's_kappa (for two annotators) and http://en.wikipedia.org/wiki/Fleiss%27_kappa (for more than two annotators) as inter-annotator agreement measures, but I realized that they may not work if an annotator assigns more than one category to a subject.

Does anyone have an idea for determining inter-annotator agreement in this scenario?

Thanks



2 Answers

I had to do this several years back. I can't recall exactly how I did it (I don't have the code anymore), but I have a worked example that I reported to my professor. I was dealing with annotation of comments and had 56 categories and 4 annotators.

Note: at the time I needed a way to detect where annotators disagreed most, so that after each annotation session they could focus on why they disagreed and set out reasonable rules to maximize this statistic. It worked well for that purpose.

Let's assume A-D are annotators and 1-5 are categories. This is a possible scenario.

     A      B      C    D     Probability of agreement
1    X      X      X    X        4/4
2    X      X      X             3/4
3    X      X                    2/4
4    X                           1/4
5 

A tags this comment with categories 1, 2, 3, 4; B with 1, 2, 3; and so forth.

For each category, the probability of agreement is calculated. The sum of these probabilities is then divided by the number of unique categories tagged for that particular comment.

Therefore, for the example comment, we have (4/4 + 3/4 + 2/4 + 1/4) / 4 = 10/16 as the annotators' agreement. This is a value between 0 and 1.
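A minimal sketch of this per-comment measure, assuming each annotator's labels for a subject are given as a set of category ids (the function name `subject_agreement` is made up for illustration):

    # Per-comment agreement as described above: average, over the unique
    # categories tagged, of the fraction of annotators who used each category.
    def subject_agreement(annotations):
        """annotations: list of sets, one set of category labels per annotator."""
        n_annotators = len(annotations)
        # All categories tagged by at least one annotator for this subject.
        tagged = set().union(*annotations)
        if not tagged:
            return 1.0  # nobody tagged anything: treat as full agreement
        # For each tagged category, the fraction of annotators who used it.
        per_category = [
            sum(cat in a for a in annotations) / n_annotators for cat in tagged
        ]
        # Divide by the number of unique categories tagged for this comment.
        return sum(per_category) / len(tagged)

    # The worked example: A -> {1,2,3,4}, B -> {1,2,3}, C -> {1,2}, D -> {1}
    print(subject_agreement([{1, 2, 3, 4}, {1, 2, 3}, {1, 2}, {1}]))  # 0.625 == 10/16

Averaging this value over all subjects gives a single agreement score for the annotation session.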

If this doesn't work for you, see http://www.mitpressjournals.org/doi/pdf/10.1162/coli.07-034-R2, p. 567, which is referenced by the case study on p. 587.



Compute agreement on a per-label basis. If you treat one of the annotators as the gold standard, you can compute recall and precision on label assignments. Another option is label overlap: for each category, the proportion of subjects where both annotators assigned it out of those where either annotator assigned it (intersection over union).
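A minimal sketch of both measures, assuming each annotator's output is a dict mapping subject id to a set of category labels (the names `gold`, `other`, `label_precision_recall`, and `label_overlap` are illustrative):

    def label_precision_recall(gold, other, label):
        """Precision/recall of `other` against `gold` for one category label."""
        gold_pos = {s for s, labels in gold.items() if label in labels}
        other_pos = {s for s, labels in other.items() if label in labels}
        tp = len(gold_pos & other_pos)
        precision = tp / len(other_pos) if other_pos else 1.0
        recall = tp / len(gold_pos) if gold_pos else 1.0
        return precision, recall

    def label_overlap(a, b, label):
        """Intersection over union: subjects where both assigned the label,
        divided by subjects where either did."""
        a_pos = {s for s, labels in a.items() if label in labels}
        b_pos = {s for s, labels in b.items() if label in labels}
        union = a_pos | b_pos
        return len(a_pos & b_pos) / len(union) if union else 1.0

    gold  = {"s1": {1, 2}, "s2": {3}}
    other = {"s1": {1},    "s2": {3}}
    print(label_precision_recall(gold, other, 2))  # (1.0, 0.0): `other` never assigned label 2
    print(label_overlap(gold, other, 1))           # 1.0: both assigned label 1 to exactly {"s1"}

Averaging either statistic across categories (and, for more than two annotators, across annotator pairs) gives an overall per-label agreement figure.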



