Caching is widely used in many processes outside WAN optimization too. Network devices and computer processors cache important data so that it need not be retrieved from main memory every time. If you are reading this page in a web browser and return to it after some time, there is a good chance that the page will be reloaded from your browser cache, without a request being made to the original server where the content is stored. Some browsers, however, first check with the server whether the content has changed before loading it from their cache.
In a big organization, many files and templates are frequently accessed by a large share of the employees. So, when a company has a centrally consolidated data centre, a large number of such requests may go to the servers in the data centre. For example, all the users might be accessing the corporate website. Every time there is a request, the web page has to be downloaded from the servers in the data centre, which takes more time and bandwidth. WAN acceleration devices sit at the edge of such enterprise networks and analyse the requests that leave the network. If a device finds that a large number of users are requesting the same content (or that parts of the requested content are the same), it stores that content (or parts of it, such as a logo image) in its local memory so that it can be delivered to users on the LAN whenever they request it again. This process is called object caching. It is also called proxy caching and is useful for accelerating access to HTTP, HTTPS and FTP content. Of course, if the requested object is not in the cache, the request is forwarded to the server. One huge advantage of object caching, in addition to the time and bandwidth saved, is server offloading: fewer requests reach the server.
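The lookup-then-forward behaviour described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the names `ObjectCache` and `fake_origin` and the URL are made up for the example, and the cache stores every object on first request instead of waiting until many users have asked for it.

```python
from typing import Callable, Dict


class ObjectCache:
    """Toy object (proxy) cache keyed by URL. Illustrative only."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # hypothetical call across the WAN
        self._store: Dict[str, bytes] = {}   # URL -> cached object
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Cache hit: served locally, nothing crosses the WAN.
            self.hits += 1
            return self._store[url]
        # Cache miss: forward the request to the server and keep a copy.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body


origin_requests = []  # records every request that reaches the "server"


def fake_origin(url: str) -> bytes:
    """Stand-in for the data centre server (an assumption for this demo)."""
    origin_requests.append(url)
    return ("content of " + url).encode()


cache = ObjectCache(fake_origin)
for _ in range(3):  # three users ask for the same page
    page = cache.get("http://intranet.example/home")

print(cache.hits, cache.misses, len(origin_requests))
```

Only the first of the three requests reaches the stand-in server; the other two are answered from the local store, which is the server offloading effect mentioned above.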
Byte caching is another method of caching that usually has a pair of caching appliances working together, one at the branch and another at the data centre. Data streams are stored at both ends, and whenever a data segment being sent or received matches a segment that is already present in the memory of both appliances, that segment is removed and replaced with a small tag, which the other appliance recognises and replaces with the original data. This method has some advantages. It works at the transport layer of the OSI model and hence is application independent: it just looks for data that repeats, in full or in part, irrespective of the application or protocol. It also works in both directions for WAN traffic optimization, when users download data from the servers and when they upload data to them.
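As a rough illustration of the tag substitution described above, here is a minimal Python sketch. It is not how any particular appliance works: the fixed 8-byte chunks, the SHA-256 digest used as the tag, and the names `ByteCache`, `encode` and `decode` are all assumptions made for the example. The chunks are kept tiny for readability, so here the tag is actually larger than the chunk it replaces; a real saving needs chunks much larger than the tag.

```python
import hashlib
from typing import Dict, List, Tuple

CHUNK_SIZE = 8  # assumed chunk size, far too small for real use

Token = Tuple[str, bytes]  # ("raw", chunk bytes) or ("ref", tag)


class ByteCache:
    """One end of a pair of byte-caching appliances. Illustrative only."""

    def __init__(self) -> None:
        self.chunks: Dict[bytes, bytes] = {}  # tag -> chunk seen earlier

    def encode(self, data: bytes) -> List[Token]:
        """Sender side: replace chunks already seen with a tag."""
        tokens: List[Token] = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            tag = hashlib.sha256(chunk).digest()
            if tag in self.chunks:
                tokens.append(("ref", tag))    # the peer already holds this chunk
            else:
                self.chunks[tag] = chunk
                tokens.append(("raw", chunk))  # first sighting: send the data
        return tokens

    def decode(self, tokens: List[Token]) -> bytes:
        """Receiver side: swap tags back for the original chunks."""
        out = bytearray()
        for kind, payload in tokens:
            if kind == "ref":
                out += self.chunks[payload]
            else:
                self.chunks[hashlib.sha256(payload).digest()] = payload
                out += payload
        return bytes(out)


branch, datacentre = ByteCache(), ByteCache()
message = b"HEADER__payload1HEADER__payload2"

first = datacentre.encode(message)    # repeated "HEADER__" becomes a tag
restored_first = branch.decode(first)

second = datacentre.encode(message)   # the same data again: tags only
restored_second = branch.decode(second)
```

Because the sketch looks only at bytes, it never needs to know which application or protocol produced them, and either object can act as sender or receiver.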
A combination of the above two types of caching could also be effective.
Challenges facing caching mechanisms:
The obvious problem with object caching is that website content is updated frequently. The home pages of most websites keep changing over time, and even individual pages might change after a while. So, a proxy may need to check with the server whether there are any updates for a web page requested by a user. If there is a change, the new content has to be downloaded into the cache and then given to the user. If there is no change, the previously stored content can be sent to the user, so the content itself is served from within the LAN. This speeds up the loading of web pages and reduces the bandwidth needed, but only for pages that users request frequently. Fortunately, an organization typically has many web pages that are requested frequently by many of its users.
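The check-before-serving behaviour can be sketched as follows. HTTP supports this kind of check through conditional requests (for example an `ETag` validator sent back in `If-None-Match`, answered with `304 Not Modified`), but the sketch below does not speak HTTP: the `Origin` and `RevalidatingProxy` classes and the integer version numbers are stand-ins invented for the illustration.

```python
from typing import Dict, Optional, Tuple


class Origin:
    """Stand-in for the data centre server (hypothetical)."""

    def __init__(self) -> None:
        self.pages: Dict[str, Tuple[int, bytes]] = {}  # url -> (version, body)
        self.bytes_sent = 0                            # body bytes sent over the WAN

    def publish(self, url: str, body: bytes) -> None:
        version = self.pages.get(url, (0, b""))[0] + 1
        self.pages[url] = (version, body)

    def fetch_if_changed(
        self, url: str, known_version: Optional[int]
    ) -> Tuple[int, Optional[bytes]]:
        version, body = self.pages[url]
        if version == known_version:
            return version, None       # "not modified": no body is sent
        self.bytes_sent += len(body)
        return version, body


class RevalidatingProxy:
    """Proxy that asks the server about updates before serving a cached page."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.cache: Dict[str, Tuple[int, bytes]] = {}

    def get(self, url: str) -> bytes:
        cached = self.cache.get(url)
        known = cached[0] if cached else None
        version, body = self.origin.fetch_if_changed(url, known)
        if body is None:
            return cached[1]           # unchanged: serve the stored copy
        self.cache[url] = (version, body)
        return body                    # changed or new: store it, then serve it


origin = Origin()
origin.publish("/home", b"v1 of the home page")
proxy = RevalidatingProxy(origin)

a = proxy.get("/home")                 # first request: full download
b = proxy.get("/home")                 # unchanged: only the check crosses the WAN
sent_before_update = origin.bytes_sent

origin.publish("/home", b"v2 of the home page!")
c = proxy.get("/home")                 # changed: new content is downloaded
```

In this sketch the second request costs only the small version check, while the third pays for a full download because the page changed in between.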
Caching appliances also use another approach: they check with the servers frequently (even when no user is requesting any content) for updates to the content they have stored. While this makes access faster for the users, the frequent update checks themselves consume a lot of bandwidth and put additional strain on resources.
The storage available for caching such frequently required data (in addition to web pages, files, logos, images etc. are also frequently accessed from the central data centres) is also quite limited. So, there is always a limit to how much content can be stored in the caching appliances. Content that is not accessed frequently, or that has been stored for a very long time, might be discarded automatically. Algorithms determine which content to store and which to discard based on frequency of access, age etc. The type of system used to index the stored data also matters. Some caching appliances use file systems, while others access objects through a hash table in RAM, which allows any object to be obtained in a single read.
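To make the storage limit concrete, here is a minimal Python sketch of a size-bounded cache. The text above does not name a specific discard algorithm, so the sketch assumes least-recently-used (LRU) eviction, one common policy; the class name `BoundedCache` and the 10-byte capacity are invented for the example. Python's `OrderedDict` is a hash table held in memory, so each lookup goes directly to the object by key.

```python
from collections import OrderedDict
from typing import Optional


class BoundedCache:
    """Size-limited cache with LRU eviction (an assumed policy)."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity = capacity_bytes
        self.used = 0
        self._items = OrderedDict()  # key -> bytes, oldest use first

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        if len(value) > self.capacity:
            return                    # too large to cache at all
        if key in self._items:
            self.used -= len(self._items.pop(key))
        self._items[key] = value
        self.used += len(value)
        # Discard the least recently used objects until the data fits.
        while self.used > self.capacity:
            _, evicted = self._items.popitem(last=False)
            self.used -= len(evicted)


cache = BoundedCache(capacity_bytes=10)
cache.put("logo", b"1234")
cache.put("template", b"5678")
cache.get("logo")                  # "logo" is now the most recently used
cache.put("report", b"abcd")       # 12 bytes would exceed 10: "template" is discarded
```

A policy based on access frequency instead of recency would keep a hit counter per object and discard the object with the lowest count; the bookkeeping differs but the capacity check is the same.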