I have built a semantic text search engine. However, I cannot find a labeled dataset that I can use to evaluate my system's retrieval quality.
Is there any publicly available labeled text corpus? I need labeled documents to evaluate the retrieval results (recall, precision, F1, ...).
Thanks.
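For reference, the metrics named in the question can be computed per query once a labeled set is available. The following is a minimal sketch, assuming the engine returns a list of document IDs and the labels are a set of relevant document IDs for the same query; the function name and the document IDs are hypothetical and chosen only for illustration.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for a single query.

    retrieved: iterable of document IDs returned by the engine (hypothetical IDs)
    relevant:  iterable of document IDs labeled relevant for the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)

    # Guard against empty result lists and empty label sets.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative data only: 4 documents retrieved, 3 labeled relevant, 2 in common.
p, r, f1 = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

These are unranked, set-based metrics; averaging them over all labeled queries gives a single score for the system.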
I do research in this direction. In all my research, I have used the AOL dataset, which consists of ~20M web queries collected from ~650k users over three months (March 01, 2006 to May 31, 2006). The data is sorted by anonymous user ID and arranged sequentially.
The dataset includes the fields {AnonID, Query, QueryTime, ItemRank, ClickURL}. More details can be found in the link mentioned above. I would like to know how you implemented your engine and, if possible, to see its code. I am also interested in how your search engine performs on the AOL dataset.
You can find the dataset in my git repository. Thanks!
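One possible way to turn these fields into evaluation labels is to treat clicked URLs as a rough proxy for relevance. The sketch below rests on several assumptions that are not stated above: that the log files are tab-separated with a header row, that ItemRank and ClickURL are empty when the user did not click a result, and that a click implies relevance (which is only an approximation). The function name and the sample rows are hypothetical.

```python
import csv
from collections import defaultdict

# Field names as listed above; the tab-separated layout is an assumption.
FIELDS = ["AnonID", "Query", "QueryTime", "ItemRank", "ClickURL"]


def clicked_urls_by_query(lines):
    """Group clicked URLs by normalized query text.

    lines: iterable of tab-separated log lines (assumed format).
    Returns a dict mapping each query to the set of URLs clicked for it.
    """
    reader = csv.DictReader(
        lines, fieldnames=FIELDS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    clicks = defaultdict(set)
    for row in reader:
        if row["AnonID"] == "AnonID":
            continue  # skip the header row if present
        url = (row.get("ClickURL") or "").strip()
        if url:  # rows without a click carry no relevance signal
            clicks[row["Query"].strip().lower()].add(url)
    return dict(clicks)


# Made-up rows in the assumed format, for illustration only.
sample = [
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL",
    "1\tsemantic search\t2006-03-01 10:00:00\t1\thttp://example.com/a",
    "1\tsemantic search\t2006-03-02 11:00:00\t\t",
    "2\tSemantic Search\t2006-03-03 12:00:00\t3\thttp://example.org/b",
]
labels = clicked_urls_by_query(sample)
print(labels)
```

The resulting mapping can serve as the "relevant" set when computing precision, recall, and F1 for each query, keeping in mind that click data is noisy and only covers URLs the original search engine happened to show.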