Wednesday, April 27, 2011

Architecture of Recommendation Systems

About?
I recently read a paper that describes the high-level architecture of recommendation systems. It treats the topic generically, without tying it to any specific domain, and it does not cover details such as how the database interaction happens or how the collected statistics are represented in the database. Still, it does a good job of laying out the components that any recommendation system needs. The architecture is very clean: each component has a single purpose and does not interfere with the others, and that's the reason I am discussing it here.

What is it?
A recommendation system is one that identifies patterns in user behavior and makes predictions based on them. For example, how does Facebook know who is likely to be your friend? How does Amazon know which product you are likely to buy? And how does Google know which ad to show you? All of these are recommendation systems that try to understand your behavior. Sometimes the rules of a recommendation system are hard-coded and therefore static; sometimes the system is more intelligent and changes its rules dynamically.

Architecture of a Recommendation System
The topmost layer of a recommendation system is always the user interface, along with a privacy/security layer, and the bottom layer is the database that persists your findings. What matters is the layer in between. If you look closely, you can pick out 3 different components working together.

1. Watcher: This component takes care of collecting patterns from user behavior. It might do so visibly, by posting surveys and asking for feedback, or it might stay hidden and silently do its job by tracking the page hit count, saving the links the user clicks, and so on. The Watcher should come up with an efficient way to represent the data in the database. It need not hit the DB all the time, since the loss of some data is tolerable in such systems. So it can cache the data on its local machine and update the central database periodically.

2. Learner: This guy polls the database for updates from the Watcher and interprets the data. Remember that although the Watcher captures the user behavior, the Learner is the one that makes sense of it. A Learner can be quite complex: it can combine multiple heuristics and rank or weigh each of them differently. It can give more weight to the most recent data. It can also estimate the user's expertise from their behavior and assign different weights to different levels of expertise.

3. Advisor: This component is visible to the user, most likely as a GUI rendered in HTML or Swing, and it shows all the recommendations that the Learner has predicted. The Advisor need not be a dumb display; it can also collect details on the quality of the recommendations, such as the number of times the user accepted a recommendation and the number of times the user said a recommendation was useless (again, this collection can be either visible or invisible to the user). It can pass this input back to the Learner, which in turn can adjust the weight it puts on each heuristic and so keep learning.
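
To make the three components a bit more concrete, here is a minimal sketch in Python. None of this comes from the paper: the class names mirror the components above, but every method, the two heuristics (popularity and recency), the half-life decay, the fixed weight step, and the plain list standing in for the central database are my own assumptions for illustration. In particular, the Advisor here is told which heuristic a piece of feedback applies to, which a real system would have to work out for itself.

```python
from collections import defaultdict


class Watcher:
    """Buffers events locally and writes them to the store in batches."""

    def __init__(self, store, flush_every=3):
        self.store = store              # plain list standing in for the central DB
        self.flush_every = flush_every  # hypothetical batch size
        self.buffer = []

    def record(self, user, item, timestamp):
        self.buffer.append((user, item, timestamp))
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        self.store.extend(self.buffer)
        self.buffer.clear()


def popularity(events, user, now):
    """Heuristic (assumed): one point per click on an item, by any user."""
    scores = defaultdict(float)
    for _, item, _ in events:
        scores[item] += 1.0
    return scores


def recency(events, user, now, half_life=10.0):
    """Heuristic (assumed): this user's clicks, halved in value every half_life."""
    scores = defaultdict(float)
    for u, item, ts in events:
        if u == user:
            scores[item] += 0.5 ** ((now - ts) / half_life)
    return scores


class Learner:
    """Combines several heuristics, each with its own adjustable weight."""

    def __init__(self, store, heuristics):
        self.store = store
        self.heuristics = heuristics
        self.weights = {name: 1.0 for name in heuristics}

    def recommend(self, user, now, top_n=2):
        total = defaultdict(float)
        for name, fn in self.heuristics.items():
            for item, score in fn(self.store, user, now).items():
                total[item] += self.weights[name] * score
        # Highest score first; ties broken alphabetically for stable output.
        return sorted(total, key=lambda item: (-total[item], item))[:top_n]

    def feedback(self, name, accepted, step=0.1):
        # Nudge one heuristic's weight up or down, never below zero.
        delta = step if accepted else -step
        self.weights[name] = max(0.0, self.weights[name] + delta)


class Advisor:
    """Shows recommendations and passes acceptance feedback to the Learner."""

    def __init__(self, learner):
        self.learner = learner
        self.accepted = 0
        self.rejected = 0

    def show(self, user, now):
        return self.learner.recommend(user, now)

    def rate(self, heuristic_name, accepted):
        if accepted:
            self.accepted += 1
        else:
            self.rejected += 1
        self.learner.feedback(heuristic_name, accepted)


store = []
watcher = Watcher(store, flush_every=2)
watcher.record("ann", "book", 0)    # stays in the local buffer
watcher.record("bob", "lamp", 5)    # buffer is full, so both events are flushed
watcher.record("ann", "lamp", 9)
watcher.flush()                     # the periodic update, triggered by hand here

learner = Learner(store, {"popularity": popularity, "recency": recency})
advisor = Advisor(learner)
print(advisor.show("ann", now=10))       # ['lamp', 'book']
advisor.rate("recency", accepted=False)  # lowers the weight of "recency"
```

Note that the Watcher only touches the store when its buffer fills up or when flush() is called, so whatever is still sitting in the buffer is lost if the process dies. That is the tolerable data loss mentioned for the Watcher above.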
