Friday, October 31, 2014

Summary of Recommendation System Algorithms

In lecture 7, Rosanna introduced recommendation systems in detail. After class, I browsed the Internet and summarized the main categories of mainstream algorithms.

First, item-based and user-based collaborative filtering, both of which Rosanna taught in the lecture, are the core algorithms used by large websites such as Amazon.com and JD.com. As shown in Figure 1, collaborative filtering (CF) consists of two main steps: forecasting and recommending. The disadvantage of item-based CF is a lack of diversity, because the recommended items are similar to one another. The drawback of user-based CF is that the group of similar users changes easily, so the user-similarity matrix must be recalculated frequently, which makes the computation expensive.

Figure 1: The process of collaborative filtering
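To make the two steps more concrete, here is a minimal sketch of item-based CF in Python. The toy ratings, the user and item names, and the choice of cosine similarity are all my own assumptions for illustration; they do not come from the lecture or from any real site.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"book": 5, "pen": 3, "lamp": 4},
    "bob":   {"book": 4, "pen": 2, "lamp": 5},
    "carol": {"book": 5, "pen": 3},
}

def item_vector(item):
    """Collect every user's rating for one item."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(a, b):
    """Cosine similarity over the users who rated both items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)

def predict(user, item):
    """Forecasting step: similarity-weighted average of the user's own ratings."""
    target = item_vector(item)
    num = den = 0.0
    for other_item, rating in ratings[user].items():
        sim = cosine(target, item_vector(other_item))
        num += sim * rating
        den += abs(sim)
    return num / den if den else None

def recommend(user, n=1):
    """Recommending step: rank the items the user has not rated yet."""
    all_items = {i for r in ratings.values() for i in r}
    unseen = all_items - set(ratings[user])
    scored = [(predict(user, i), i) for i in unseen]
    scored = [(s, i) for s, i in scored if s is not None]
    return [i for s, i in sorted(scored, reverse=True)[:n]]

print(recommend("carol"))
```

A user-based version would swap the roles of users and items: it would compare users by the items they have both rated, which is why its similarity matrix has to be refreshed whenever user behavior changes.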

Second, the content-based algorithm relies mainly on text mining. For example, we can extract the keywords from the two texts we want to compare and count the frequency of each keyword. After that, we can calculate the similarity between the texts. The strength of this algorithm is that it does not suffer from data sparsity, because it works on item content rather than on user ratings.
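The keyword steps above can be sketched roughly as follows. The tiny stop-word list and the use of cosine similarity are my assumptions; a real system would probably use a proper tokenizer and a weighting scheme such as TF-IDF.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical, deliberately tiny stop-word list
STOPWORDS = {"the", "a", "an", "of", "and", "is", "to", "in"}

def keyword_freq(text):
    """Extract lowercase words and count how often each keyword appears."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine_similarity(f1, f2):
    """Cosine similarity between two keyword-frequency vectors."""
    dot = sum(f1[w] * f2[w] for w in f1.keys() & f2.keys())
    norm1 = sqrt(sum(c * c for c in f1.values()))
    norm2 = sqrt(sum(c * c for c in f2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0

doc1 = keyword_freq("The camera has a fast lens and a large sensor")
doc2 = keyword_freq("A large sensor and a fast lens make a good camera")
print(cosine_similarity(doc1, doc2))
```

Two texts that share no keywords score 0, and two texts with the same keyword frequencies score 1, so the value can be used directly to rank similar items.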

The third algorithm I want to introduce is K-Nearest Neighbors (K-NN). K-NN is a type of instance-based learning and is among the simplest of all machine learning algorithms. In both classification and regression, the contributions of the neighbors can be weighted so that nearer neighbors contribute more to the average than more distant ones.
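Here is a small sketch of the weighted idea for regression. Using inverse distance as the weight is one common choice and is my assumption here; the function name and the sample data are made up. It needs Python 3.8 or later for `math.dist`.

```python
from math import dist

def knn_predict(train, query, k=3):
    """Distance-weighted K-NN regression.

    train is a list of (feature_tuple, value) pairs.
    """
    # Keep the k training points closest to the query
    nearest = sorted((dist(x, query), y) for x, y in train)[:k]
    # An exact match would give an infinite weight, so return it directly
    for d, y in nearest:
        if d == 0:
            return y
    # Nearer neighbors get larger weights (1 / distance)
    weights = [1.0 / d for d, _ in nearest]
    total = sum(w * y for w, (_, y) in zip(weights, nearest))
    return total / sum(weights)

# Hypothetical one-dimensional data
train = [((0.0,), 1.0), ((1.0,), 2.0), ((10.0,), 100.0)]
print(knn_predict(train, (0.5,), k=2))
```

With `k=2` the far-away point at 10.0 is ignored, and the two remaining neighbors are equally distant from the query, so the prediction is their plain average.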

In addition, Slope One is another algorithm that I want to introduce. Slope One, in my opinion, is an easy method for filling in the blank entries in the user-item rating matrix. For example, in Figure 2, User X and User Y have rated both item1 and item2, while User A has rated only item1. So what rating would User A give to item2? Using Slope One, the rating should be 4 - ((5-3) + (4-3))/2 = 2.5. Because the algorithm is so simple, its shortcoming is that the items the system recommends tend to be popular ones rather than personalized ones.

Figure 2: An example of Slope One
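The calculation in this example can be reproduced with a few lines of Python. The ratings below are the ones implied by the formula in the text (X rated item1 as 5 and item2 as 3, Y rated them 4 and 3, and A rated item1 as 4); the function name is my own, and the sketch uses the weighted variant of Slope One, which gives the same answer here because only one known item is involved.

```python
# Ratings inferred from the worked formula in the text
ratings = {
    "X": {"item1": 5, "item2": 3},
    "Y": {"item1": 4, "item2": 3},
    "A": {"item1": 4},
}

def slope_one_predict(ratings, user, target):
    """Predict the user's rating for target with weighted Slope One."""
    num = 0.0
    count = 0
    for known, rating in ratings[user].items():
        # Rating differences (target - known) from users who rated both items
        diffs = [r[target] - r[known]
                 for r in ratings.values()
                 if target in r and known in r]
        if diffs:
            avg_diff = sum(diffs) / len(diffs)
            # Weight each estimate by the number of users supporting it
            num += (rating + avg_diff) * len(diffs)
            count += len(diffs)
    return num / count if count else None

print(slope_one_predict(ratings, "A", "item2"))  # 2.5
```

On average, item2 is rated 1.5 lower than item1, so the prediction for User A is 4 - 1.5 = 2.5, which matches the result above.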

References:
1. http://blog.csdn.net/huagong_adu/article/details/7362908
2. http://zh.wikipedia.org/wiki/Slope_one
3. http://blog.csdn.net/pi9nc/article/details/9068437

6 comments:

  1. After reading this blog, I got a better understanding of recommendation systems. The algorithm is the most important part. I learned a lot about item-based and user-based collaborative filtering from your blog, and the picture is good for understanding.

  2. Your blog was well written, especially the part of collaborative filtering. The description and explanation that you talked about were brief and to the point. I indeed gained a lot by reading your blog.

  3. There are a lot of ways to implement a recommender system. Thanks a lot for sharing your summary of recommendation algorithms.

  4. Nice blog! I am now more familiar with recommendation systems. May I know some practical uses in social media nowadays? Thanks!

  5. After the abuse of cookies, I think fundamental data mining is more and more important. I now choose to turn off cookies in my browser. Getting user behavior without cookies is a hard problem to handle.

  6. Thanks for sharing! The content is very informative and clear. A recommender system is a very important part of an e-commerce platform. A recommender system that "understands" customers better can really help improve sales. :D
