
Wednesday, April 27, 2011

Architecture of Recommendation Systems

About?
I recently read a paper that describes the high-level architecture of recommendation systems. The topic was discussed generically, without tying it to any specific domain; details such as how the database interaction happens or how the collected statistics are represented in the database were not covered. Still, the paper did a good job of laying out the components of any recommendation system. As expected, the architecture was clean: each component had a single purpose and did not interfere with the others, which is why I am discussing it here.

What is it?
A recommendation system is one that identifies patterns in user behavior and makes predictions based on them. For example, how does Facebook know who is likely to be your friend? How does Amazon know which product you are likely to buy? And how does Google know which ad to show you? All of these are recommendation systems trying to understand your behavior. Sometimes the rules of a recommendation system are hard-coded and hence static; sometimes the systems are intelligent enough to change the rules dynamically themselves.

Architecture of a Recommendation System
The topmost layer of a recommendation system is always the user interface along with a privacy/security layer, and the bottom layer is the database that persists your findings. What matters is what sits in between. If you look closely, you can identify three different components working together.

1. Watcher: This component takes care of collecting patterns from user behavior. It might do so visibly, by posting surveys and collecting feedback, or it might hide its identity and silently do its job by tracking page hit counts, saving the links the user clicks, and so on. The Watcher should come up with an efficient way to represent this data in the database. It need not hit the DB all the time, since the loss of some data is tolerable in such systems; it can cache the data on its local machine and update the central database periodically (a batching sketch follows this list).

2. Learner: This component polls the database for updates from the Watcher and interprets the data. Note that although the user behavior is captured by the Watcher, the Learner is the one that makes sense of it. Learners can be quite complex: they can combine multiple heuristics and rank or weight each of them differently, give more weight to the most recent data, or even estimate the user's level of expertise from their behavior and weight the data accordingly.

3. Advisor: This component is visible to the user, typically a GUI built with HTML or Swing, and it shows the recommendations the Learner has predicted. The Advisor need not be a dumb display; it can in turn collect details on the quality of the recommendations, such as the number of times the user accepted one and the number of times the user marked one as useless (again, either visibly or invisibly). It can feed this input back to the Learner, which can then adjust its weight for each heuristic and keep learning. Both loops are sketched below.
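To make the Watcher's batching concrete, here is a minimal Python sketch. Everything in it is illustrative: the `db` client and its `bulk_insert` method are hypothetical stand-ins for whatever central store the system uses, and the paper prescribes no particular interface.

```python
import time

class Watcher:
    """Collects behavioral events and flushes them to the central
    database in batches; losing a partial batch is tolerable."""

    def __init__(self, db, flush_interval=60):
        self.db = db                   # hypothetical central-store client
        self.flush_interval = flush_interval
        self.buffer = []               # local cache of unsent events
        self.last_flush = time.time()

    def record(self, user_id, event, payload=None):
        # Invisible tracking: e.g. a page hit or a clicked link.
        self.buffer.append((time.time(), user_id, event, payload))
        if time.time() - self.last_flush >= self.flush_interval:
            self.flush()

    def flush(self):
        # One periodic bulk write instead of one DB hit per event.
        self.db.bulk_insert("user_events", self.buffer)  # assumed API
        self.buffer.clear()
        self.last_flush = time.time()
```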
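And here is a minimal sketch of the Learner/Advisor feedback loop, under the simplifying assumption that the Advisor can attribute each accepted or rejected recommendation to a single heuristic; the multiplicative update rule and the heuristic interface are illustrative choices, not something the paper mandates.

```python
class Learner:
    """Combines several scoring heuristics, each with a weight that
    is adjusted from Advisor feedback (accepted vs. rejected)."""

    def __init__(self, heuristics):
        # heuristics: name -> function(user, item) returning a score
        self.heuristics = heuristics
        self.weights = {name: 1.0 for name in heuristics}

    def score(self, user, item):
        # Weighted combination of all heuristics.
        return sum(w * self.heuristics[name](user, item)
                   for name, w in self.weights.items())

    def feedback(self, name, accepted, rate=0.1):
        # The Advisor reports back: reinforce or dampen the heuristic
        # that produced the recommendation.
        self.weights[name] *= (1 + rate) if accepted else (1 - rate)
```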

Tuesday, April 12, 2011

Slab allocator


Situation:
Traditional memory allocators have always seen memory as a sequence of bytes available for use. From a layering point of view, memory allocators sit below the 'object' layer, so they have no idea what kind of object is going to use the memory. In a way this is good, as it keeps the allocator's implementation simple. However, it also has a drawback. Some objects have a basic structure that does not change over the course of execution, so they can be reused instead of going through the creation/destruction cycle repeatedly. And when an object's initialization is very costly, it should be reused as much as possible.

Object caching:
It makes sense for a memory allocator to understand the concept of objects so that memory can be allocated in a better way. So let us introduce a new layer above the usual memory-allocator layer: an object cache. Each type of object has its own cache. When the client asks a cache to allocate an object, the cache checks whether there are any free objects; if there are none, it asks the VM for memory, constructs an object, and returns it. From then on, whenever that object becomes free (i.e. when the client releases it), the creation/initialization step is skipped and the same object is reused from the cache. If the memory manager asks the cache to free up memory for other subsystems, the cache deallocates some objects and returns their memory. On a cautious note, the object-cache layer should not hoard memory; it should return any unneeded memory to the underlying memory manager on demand.
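A minimal Python sketch of such an object cache, assuming per-type constructor/destructor callbacks; in a real allocator, alloc would also carve fresh memory out of the VM rather than rely on the language runtime:

```python
class ObjectCache:
    """Per-type object cache: reuses released objects instead of
    constructing new ones, and gives memory back on demand."""

    def __init__(self, constructor, destructor=None):
        self.constructor = constructor
        self.destructor = destructor
        self.free_objects = []

    def alloc(self):
        # Reuse a cached object if one is free; construct otherwise.
        if self.free_objects:
            return self.free_objects.pop()
        return self.constructor()   # costly initialization happens once

    def release(self, obj):
        # The object keeps its initialized state for the next alloc().
        self.free_objects.append(obj)

    def reap(self, n):
        # Memory-manager pressure: destroy up to n cached objects.
        for _ in range(min(n, len(self.free_objects))):
            obj = self.free_objects.pop()
            if self.destructor:
                self.destructor(obj)
```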

Client and the central allocator:
Let us call the object-caching pool the central allocator. We can split the system into a client layer, which understands the object completely (its name, size, how it should be constructed and destructed, its alignment, etc.) and asks the cache to create objects, and the central allocator, which takes care of the underlying memory management and object caching, thereby keeping the interface simple for clients.
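Continuing the sketch above, the client's view might look like this; the connection object and its fields are purely hypothetical, but the shape mirrors the split just described: the client supplies the constructor and destructor once, and the central allocator handles everything else.

```python
# Client view: describe the object once, then just alloc/release.
conn_cache = ObjectCache(
    constructor=lambda: {"socket": None, "buffer": bytearray(4096)},
    destructor=lambda conn: conn["buffer"].clear(),
)

conn = conn_cache.alloc()    # the cache handles memory and reuse internally
# ... use the connection object ...
conn_cache.release(conn)     # returned fully constructed, ready for reuse
```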

Slab allocator:
The central allocator grows or shrinks one slab at a time. A slab can be visualized as a contiguous region of memory split into equal-size chunks, with a reference count of how many of those chunks are currently in use. To allocate or free memory, just update the reference count and return memory from a slab that has space to spare. If a slab's reference count drops to zero, the cache can shrink itself by one slab. The slab allocator also reduces internal and external fragmentation, since it already knows the size of the objects it will always be allocating.

Architecture:
A cache always holds objects of a single type, and it grows or shrinks by the size of a slab. The cache maintains a circular doubly linked list to reach all the slabs it currently manages (each node, called a kmem_slab, also carries a reference count so the slab can be deallocated later). Internally a slab is contiguous memory whose size might span multiple pages. To know which objects are currently free, each slab maintains an additional free list (each node, called a kmem_bufctl, holds the buffer address and a back pointer to its kmem_slab node). When the object size is very small, maintaining such a free list becomes an overhead, as it occupies a lot of memory relative to the objects; in that case the kmem_slab and kmem_bufctl data are kept at the end of the slab memory itself, and the kmem_bufctl is no longer a linked list but just a bit vector indicating whether each object in the slab is in use.
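Here is a rough Python mock of those two structures, with integer offsets standing in for real buffer addresses; the small-object bit-vector variant is omitted for brevity.

```python
class KmemBufctl:
    """Free-list node for one buffer inside a slab."""
    def __init__(self, addr, slab):
        self.addr = addr        # offset standing in for a buffer address
        self.slab = slab        # back pointer to the owning kmem_slab
        self.next = None

class KmemSlab:
    """One slab: contiguous memory split into equal-size buffers."""
    def __init__(self, base, obj_size, nobjs):
        self.refcount = 0       # buffers currently handed out
        self.freelist = None    # head of the kmem_bufctl chain
        self.prev = self.next = None  # links in the cache's circular DLL
        for i in range(nobjs):        # carve the slab into buffers
            b = KmemBufctl(base + i * obj_size, self)
            b.next, self.freelist = self.freelist, b
```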

Working of a Slab allocator:
The slabs are kept in a particular order based on usage: all completely free slabs sit at the end of the cache's doubly linked list, preceded by the partially used slabs. The cache keeps a free-list pointer to the first slab with free buffers, from which it starts allocating objects whenever a client places a request (remember, it also updates that slab's reference count). That slab then returns an object from its buffer.
When the client returns an object, if the reference count of the corresponding slab drops to zero, the slab is appended to the tail of the cache's list. When the system runs low on memory, it asks the slab allocator to release some; the allocator in turn releases some of its free slabs that have not been used recently. It does not release all of its free slabs, to avoid thrashing.
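Building on the structures above, here is a sketch of the allocation, free, and reclaim paths; a plain Python list stands in for the circular doubly linked list, and growing the cache by a fresh slab is left out.

```python
class SlabCache:
    """Allocation path over the slab list (see KmemSlab/KmemBufctl)."""

    def __init__(self, slabs):
        self.slabs = slabs   # partially used first, fully free at the tail

    def alloc(self):
        # First slab that still has free buffers; a real cache would
        # grow by one fresh slab from the VM if none is found.
        slab = next(s for s in self.slabs if s.freelist)
        buf, slab.freelist = slab.freelist, slab.freelist.next
        slab.refcount += 1
        return buf   # a real allocator hands out buf.addr instead

    def free(self, buf):
        slab = buf.slab                 # back pointer from the bufctl
        buf.next, slab.freelist = slab.freelist, buf
        slab.refcount -= 1
        if slab.refcount == 0:          # fully free again: move to the tail
            self.slabs.remove(slab)
            self.slabs.append(slab)

    def reap(self, keep=2):
        # Memory pressure: release free slabs back to the VM, but keep
        # a few around to avoid thrashing between grow and shrink.
        free = [s for s in self.slabs if s.refcount == 0]
        for s in free[keep:]:
            self.slabs.remove(s)
```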

Usage:
Slab allocators are used by many operating systems, including Linux and Solaris, as they have proven to work very efficiently. Memcached also uses this technique to avoid memory fragmentation.

Saturday, April 2, 2011

Distributed Hash tables

Hash tables
Hash tables are efficient data structures that trade memory for speed: all their operations are designed to run in constant time. However, as everyone knows, the internal implementation decides how efficiently a hash table actually works. A poorly designed hash function causes more collisions, raising the cost of both inserts and lookups. A bad hash function can also waste a lot of memory by not distributing the keys evenly across the slots. For a clear treatment of how to choose hash functions, see "Introduction to Algorithms", which studies this topic in detail.
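As a toy illustration (not from the book), compare a hash function that clusters keys with one that mixes every character; the second is the well-known djb2 scheme.

```python
def bad_hash(key, nslots):
    # Uses only the first character: similar keys pile into the same
    # slots, causing many collisions and wasted slots elsewhere.
    return ord(key[0]) % nslots

def better_hash(key, nslots):
    # Mixes every character (djb2), spreading keys far more evenly.
    h = 5381
    for c in key:
        h = (h * 33 + ord(c)) & 0xFFFFFFFF
    return h % nslots
```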

Problems with Hash table
I don't know of a data structure that works for all problems in all situations; every data structure has its limitations, and the hash table is no exception. Unlike other structures such as lists, stacks, queues, and trees, the size of a hash table is not determined by the amount of data being stored. And if you want to grow a hash table, you have no option but to rehash all the keys into the new table. Also, a machine's RAM has physical limits set by its addressing capabilities, so when the hash table is maintained in memory (as it is in most cases), its size has a fixed upper bound. What if you want to break that hard limit and grow beyond it?

DHT - Solution for a larger Hash table
There are many reasons why you might want a large hash table. As you might know, hash tables are very good data structures for caching results, since they allow very fast lookups. But there is no way to go beyond the size limits of a single machine, so if you want a larger hash table, the only option is a distributed hash table: you distribute your data across multiple hash tables on multiple machines. Scaling up is often not a good solution; it means replacing the machine with a higher-end one with a more powerful CPU, more memory, and so on. Scaling up is always costly and prone to single points of failure. Scaling out is a better choice: instead of one machine, you maintain a bunch of commodity machines. They are cost-efficient, but since you are using cheap commodity hardware, your architecture must accommodate fault tolerance.

Architecture
Given that you go for a distributed hash table, how do you hash a key? The first step is to identify where the key should go, i.e. the machine the key should be hashed to. The second step, on that machine, is to store the key using the normal hashing technique. So instead of applying a single hash, you apply two levels of hashing: the first to identify the machine, the second to store or look up the key on the machine chosen in the first step. The architecture follows naturally. The first component's duty is to identify which machine to use. The second component is a set of independent machines, each a simple hash-table implementation, ignorant of the fact that outside itself there are multiple cousins doing the same job. A sketch of this two-level scheme follows.
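A minimal Python sketch of the two-level scheme; the per-machine clients and their put/get interface are hypothetical, and the naive modulo routing is the simplest possible first-level hash (real deployments typically use consistent hashing so that adding a machine does not remap every key).

```python
import zlib

class DistributedHashTable:
    """Level 1 picks the machine; level 2 is that machine's own
    ordinary hash table, unaware of its cousins."""

    def __init__(self, machines):
        self.machines = machines   # hypothetical per-machine clients

    def _machine_for(self, key):
        # First hash: crc32 is deterministic across processes (unlike
        # Python's built-in hash()), so every client routes a given
        # key to the same machine.
        return self.machines[zlib.crc32(key.encode()) % len(self.machines)]

    def put(self, key, value):
        # Second hash happens inside the machine's local table.
        self._machine_for(key).put(key, value)

    def get(self, key):
        return self._machine_for(key).get(key)
```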

Practical application
Memcached is a stunning implementation of the technique described above. It is a cache layer in front of the database, meant to reduce the number of database lookups, and it is used by a number of very busy websites. Facebook is one such site; it uses Memcached and reports a hit rate of around 99%.