
Suggest what a user could buy if he already has something in the cart

I am developing an e-shop that will sell food. I want a suggestion box that recommends what else a user could buy based on what he already has in the cart. If he has beer, I want it to suggest chips and other items, ordered by descending probability that he'll buy them too. I also want the algorithm to learn its suggestions from all users' previous purchases. Where should I start? I have a groceries table with user_id, item_id, date and similar columns. How can I build a suggestion box without brute-forcing, which is impossible?

good_evening asked Aug 27 '12 22:08


4 Answers

The thing you're describing is a recommendation engine; more specifically collaborative filtering. It's the heart of Amazon's "people who bought x also bought y" feature, and Netflix's recommendation engine.

It's a non-trivial undertaking. Getting anything even remotely useful could easily take more effort than building the e-commerce site in the first place.

For instance:

  • you don't want to recommend items that are already in the basket.
  • you don't want to recommend cheaper versions of the things that are already in the basket.
  • you don't want to recommend items that are out of stock.
  • you don't want to recommend items that are statistically valid, but make no sense ("hey, you bought nappies, why not buy beer?" - there is a story that in supermarkets, there is a statistical correlation because dads go out at night to buy nappies and pick up a six pack at the same time).
  • you do want to recommend items that are in a promotion right now
  • you don't want to recommend items that are similar to items in a promotion right now

When I tried a similar project, it was very hard to explain to non-technical people that the computer simply didn't understand that recommending beer alongside nappies wasn't appropriate. Once we got the basic solution working, building the exclusion and edge case logic took at least as long.
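A minimal sketch of such an exclusion layer, assuming the candidates arrive already scored by some earlier step. The field names (`in_stock`, `category`, `on_promotion`) and the blocked category pair are hypothetical, chosen only to mirror some of the rules listed above; the "cheaper version" and "similar to a promoted item" rules need product knowledge that is not modelled here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    item_id: str
    category: str
    score: float
    in_stock: bool = True
    on_promotion: bool = False

# Hypothetical (basket category, candidate category) pairs that should never be shown together.
BLOCKED_PAIRS = {("baby", "alcohol")}

def filter_candidates(candidates, basket_ids, basket_categories):
    """Drop candidates that break the business rules, then rank the rest."""
    kept = []
    for c in candidates:
        if c.item_id in basket_ids:
            continue  # already in the basket
        if not c.in_stock:
            continue  # cannot be bought right now
        if any((bc, c.category) in BLOCKED_PAIRS for bc in basket_categories):
            continue  # statistically valid, but inappropriate
        kept.append(c)
    # Promoted items first, then by descending score.
    return sorted(kept, key=lambda c: (not c.on_promotion, -c.score))

candidates = [
    Candidate("beer", "alcohol", 0.9),
    Candidate("wipes", "baby", 0.6, on_promotion=True),
    Candidate("nappies", "baby", 0.8),
    Candidate("formula", "baby", 0.7, in_stock=False),
    Candidate("juice", "drinks", 0.5),
]
result = filter_candidates(candidates, basket_ids={"nappies"}, basket_categories={"baby"})
result_ids = [c.item_id for c in result]
```

Keeping the rules in one function, separate from the scoring, makes it easier to add the inevitable edge cases later without touching the recommendation logic.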

Realistically, I think these are your options:

  • manually maintain the related products. Time consuming, but unlikely to lead to weirdness.
  • use an off-the-shelf solution - either SaaS, or a statistical environment like R, which supports this.
  • recommend (semi)random products. Have a set of products you want to recommend, and pick one at random - for instance, products on promotion, products which are in the "best seller" list, products which cost less than x. Exclude categories that could be problematic.

All those options are achievable in reasonable time; the problem with building a proper solution from scratch is that everyone will measure it against Amazon, and they've got a bit of a head start on you...
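The (semi)random option could look roughly like the sketch below. The pool structure, the field names `id` and `category`, and the function name `pick_random_suggestions` are assumptions made for illustration; how the pool is assembled (promotions, best sellers, items under a price limit) is left to the shop.

```python
import random

def pick_random_suggestions(pool, basket_ids, excluded_categories, k=3, rng=None):
    """Pick up to k random items from a curated pool, skipping basket items and excluded categories."""
    rng = rng or random.Random()
    eligible = [
        item for item in pool
        if item["id"] not in basket_ids and item["category"] not in excluded_categories
    ]
    return rng.sample(eligible, min(k, len(eligible)))

# Hypothetical pool: promotions, best sellers and cheap add-ons merged by an earlier step.
pool = [
    {"id": "chips", "category": "snacks"},
    {"id": "beer", "category": "alcohol"},
    {"id": "salsa", "category": "snacks"},
    {"id": "bananas", "category": "fruit"},
]
picks = pick_random_suggestions(
    pool,
    basket_ids={"chips"},
    excluded_categories={"alcohol"},
    k=2,
    rng=random.Random(0),  # fixed seed only to make the example repeatable
)
```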

Neville Kuyt answered Oct 14 '22 11:10


This is a common data mining problem, typically solved with the Apriori algorithm. You may need to create another table that maintains these statistics and then suggest items based on the preferred combinations.
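As a rough illustration of what such a statistics table could hold, the sketch below counts how often pairs of items appear in the same order and ranks suggestions by confidence, i.e. the share of orders containing item A that also contain item B. This is only the pair-counting step of Apriori-style mining, not the full algorithm, and the order data and the names `build_pair_stats` and `suggest` are made up for the example.

```python
from collections import defaultdict
from itertools import combinations

def build_pair_stats(orders):
    """Count item frequencies and pair co-occurrences across orders."""
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for order in orders:
        items = sorted(set(order))  # ignore duplicates within one order
        for item in items:
            item_count[item] += 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] += 1
            pair_count[(b, a)] += 1
    return item_count, pair_count

def suggest(cart, item_count, pair_count, top_n=3):
    """Rank items not in the cart by their best confidence against any cart item."""
    scores = {}
    for (a, b), n in pair_count.items():
        if a in cart and b not in cart:
            confidence = n / item_count[a]
            scores[b] = max(scores.get(b, 0.0), confidence)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]

# Hypothetical order history; in practice this would be read from the purchases table.
orders = [
    ["beer", "chips", "salsa"],
    ["beer", "chips"],
    ["beer", "nappies"],
    ["milk", "cereal"],
    ["beer", "chips", "peanuts"],
]
item_count, pair_count = build_pair_stats(orders)
suggestions = suggest({"beer"}, item_count, pair_count)
```

Because the counts depend only on past orders, they can be recomputed periodically and stored, so the suggestion box needs just a lookup at request time.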

Cid answered Oct 14 '22 11:10


Hmm... you are looking for a product recommendation engine, then. Well, they basically come in three flavours:

  • Collaborative filtering
  • Content-based filtering
  • Hybrid recommender systems

   The first one gathers and stores data on your users' activities, preferences, behavior, etc. This data is then fed into an engine that separates it into user channels. Each channel has certain characteristic likes and dislikes. So, when you have a new visitor, he or she will be classified and assigned a specific user profile. Then items will be displayed based on this profile's likes/dislikes.
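To make the collaborative idea concrete, here is a small sketch. It does not build explicit user channels; as a simpler stand-in it weights every other user by how much their purchase history overlaps with the current user's (Jaccard similarity) and recommends what those similar users bought. The purchase data and the function names are invented for the example.

```python
def jaccard(a, b):
    """Overlap between two sets: |a & b| / |a | b| (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_for(user, purchases, top_n=3):
    """Score items the user has not bought by the similarity of the users who did."""
    own = purchases[user]
    scores = {}
    for other, items in purchases.items():
        if other == user:
            continue
        sim = jaccard(own, items)
        if sim == 0.0:
            continue  # nothing in common, so this user tells us nothing
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + sim
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    return ranked[:top_n]

# Hypothetical purchase history per user.
purchases = {
    "alice": {"beer", "chips"},
    "bob": {"beer", "chips", "salsa"},
    "carol": {"beer", "peanuts", "milk"},
    "dave": {"milk", "cereal"},
}
recs = recommend_for("alice", purchases)
```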

   Now, content-based filtering uses a different approach - a less social one - by taking into account ONLY your user's previous browsing history, his preferences and activities. Essentially, this will create recommendations based on what this user has previously liked/purchased.

   But why choose just one of them, right? Hybrid recommender systems use a bit of both to provide a personalized yet social recommendation. These are usually more accurate when it comes to providing recommendations.

   I think that collaborative filtering is a great option when you have a big influx of users - it's kinda hard to build good channels with only 42 users/month accessing your website. The second option, based on content, is better for a small site with plenty of products. However, IMHO, the third one is the one for you: build something that will get users going from the start, and gather all the data they generate so that, in the future, you can offer an Amazon-like recommendation experience!

   Building one of these is no easy task, as I'm sure you already know... but I strongly recommend this book (using a personal-history filtering!), which has really come through for me in the past: http://www.amazon.com/Algorithms-Intelligent-Web-Haralambos-Marmanis/dp/1933988665

Good luck and good learning!

lleite answered Oct 14 '22 12:10


I think the best approach is to categorize your items and use that information to make the choice.

I did this on a grocery website and the results worked quite well. The idea is to cross group items into a number of categories.

For example, let's take a banana. It's a fruit, but it is also commonly used with cornflakes or cereal for breakfast. Cereals are also a breakfast food, but certain ones might be considered health foods while others are sugary treats.

With this sort of approach, you can quickly start making a table like this:

Item         | Category
-------------+------------
Banana       | Breakfast
Banana       | Quick
Banana       | Fruit
Banana       | Healthy
Muesli       | Breakfast
Muesli       | Healthy
Sugar Puffs  | Breakfast
Sugar Puffs  | Treat
Kiwi Fruit   | Fruit
Kiwi Fruit   | Healthy
Kiwi Fruit   | Dessert
Milk         | Breakfast

With a simple lookup like this, you can easily find good items to suggest based on these groupings.

Let's say someone's basket contains a Banana, Muesli and Sugar Puffs.

That's three breakfast items, two healthy, one not so much.

Suggest Milk, as it matches all three. No impulse buy? Try again and throw in a Kiwi Fruit, and so on.

The idea here is to match items across many different categories (especially ones that may not be directly apparent) and use these counts to suggest the best items for your customer.
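One possible way to turn the lookup into a ranking is sketched below, using the rows from the table above. It scores each candidate by the number of basket items it shares at least one category with; that scoring rule and the function names are an assumption for illustration, not necessarily how the original site counted matches.

```python
from collections import defaultdict

# The item/category table from above, as (item, category) rows.
ROWS = [
    ("Banana", "Breakfast"), ("Banana", "Quick"), ("Banana", "Fruit"), ("Banana", "Healthy"),
    ("Muesli", "Breakfast"), ("Muesli", "Healthy"),
    ("Sugar Puffs", "Breakfast"), ("Sugar Puffs", "Treat"),
    ("Kiwi Fruit", "Fruit"), ("Kiwi Fruit", "Healthy"), ("Kiwi Fruit", "Dessert"),
    ("Milk", "Breakfast"),
]

def build_index(rows):
    """Map each item to the set of categories it belongs to."""
    categories = defaultdict(set)
    for item, category in rows:
        categories[item].add(category)
    return categories

def suggest_by_category(basket, categories):
    """Rank non-basket items by how many basket items share a category with them."""
    scores = {}
    for candidate, cats in categories.items():
        if candidate in basket:
            continue
        matched = sum(1 for item in basket if cats & categories.get(item, set()))
        if matched:
            scores[candidate] = matched
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

categories = build_index(ROWS)
ranking = suggest_by_category(["Banana", "Muesli", "Sugar Puffs"], categories)
```

With this basket, Milk shares Breakfast with all three items, while Kiwi Fruit only overlaps with the Banana and the Muesli, so Milk is offered first and Kiwi Fruit is the fallback.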

Fluffeh answered Oct 14 '22 11:10