Application Server Caching with SQL

On a high-volume web site, database access can become a bottleneck to overall web site performance. Because of the EJB structure, the database access required by the business logic of the application is increased (perhaps substantially) by the database access required to support entity bean/database synchronization. If the web site implements heavy personalization of its user interaction (i.e., if a high percentage of its pages are dynamically generated based on the profile of the particular user who is viewing them), then the database access load can be even higher. At the extreme, every click on a highly personalized web site could require retrieval of user-profile data from the database to drive page generation. Finally, user interaction with a web site happens in real time, and is affected by peak-load activity. The average rate of click processing is less important than peak-load activity in determining whether users perceive the site as fast or sluggish.

The World Wide Web has already shown an effective architecture for dealing with these types of peak-load Internet volume demands: web page caching and horizontal scaling. With caching, copies of heavily accessed web pages are pulled forward in the network and replicated. As a result, the total network capacity for serving web pages is increased, and the amount of network traffic associated with those page hits is reduced. With horizontal scaling, web site content is replicated across two or more web servers (up to dozens or even hundreds of servers), whose aggregate capacity for serving pages is much greater than that of any single server.

Similar caching and horizontal scaling architectures are used to increase the capacity of application servers. Most commercial application servers today implement bean caching, where copies of frequently used entity beans are kept in the application server’s memory. In addition, application servers are often deployed in banks or clusters, with each application server providing identical business logic and application processing capability. In fact, many commercial application servers use horizontal scaling within a single server to take advantage of symmetric multiprocessing (SMP) configurations. It’s typical for an eight-processor application server to be running up to eight independent copies of the application server software, operating in parallel. Figure 22-6 shows a typical application server configuration with three four-processor servers.

Unfortunately, horizontal scaling and caching tend to work against one another when dealing with stateful data such as that stored in an entity bean or a database. Without special cache synchronization logic, updates made to a bean stored in the cache of one server instance will not automatically appear in the other caches, with the potential to cause incorrect results. Consider, for example, what happens to quantity-on-hand data if three or four separate caches contain copies of an entity bean for a single product and the business logic of the application server updates those values. The caches will very quickly contain different values for quantity on hand, none of which is accurate. The cache synchronization logic required to detect and prevent such a situation unfortunately carries with it a great deal of overhead. Absolute synchronization requires a full two-phase commit protocol (described in Chapter 23) among the caches.
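The quantity-on-hand problem can be illustrated with a minimal Python sketch. It is not how any particular application server implements bean caching; the `Database` and `BeanCache` classes, the `sell` method, and the "widget" product are hypothetical names introduced only for this illustration. Two unsynchronized caches each load the same value, each applies its own update, and the results diverge.

```python
class Database:
    """Stand-in for the shared DBMS: one authoritative row per product."""

    def __init__(self):
        self.rows = {"widget": 100}


class BeanCache:
    """Hypothetical per-instance bean cache with no peer synchronization."""

    def __init__(self, db):
        self.db = db
        self.beans = {}  # product name -> cached quantity on hand

    def quantity_on_hand(self, product):
        # Load from the database only on a cache miss.
        if product not in self.beans:
            self.beans[product] = self.db.rows[product]
        return self.beans[product]

    def sell(self, product, units):
        # The update is computed from the cached copy, which may be stale.
        new_qty = self.quantity_on_hand(product) - units
        self.beans[product] = new_qty
        self.db.rows[product] = new_qty  # write the (possibly wrong) value back


db = Database()
a = BeanCache(db)  # cache in application server instance A
b = BeanCache(db)  # cache in application server instance B

# Both instances cache the starting quantity of 100.
a.quantity_on_hand("widget")
b.quantity_on_hand("widget")

a.sell("widget", 10)  # A now believes 90 and writes 90
b.sell("widget", 5)   # B still started from 100, so it writes 95

print(a.quantity_on_hand("widget"), b.quantity_on_hand("widget"), db.rows["widget"])
```

Fifteen units were sold, so the correct quantity is 85, yet instance A reports 90, instance B reports 95, and the database row holds 95 because the second write overwrote the first.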

Database caches can address the problems of multiple bean caches within a single SMP server, as shown in Figure 22-7. By caching at the database level instead of the bean level, one database cache provides consistency across all of the application server instances on a single server. Synchronization across multiple physical servers is still required, however. If the ratio of database reads to database updates is high (as, for example, in a highly personalized web site), the overhead of cache synchronization will remain relatively low and the benefits of horizontal scaling can be significant.
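A database-level cache shared by all instances on one server might look like the following sketch. It uses Python's standard `sqlite3` module as a stand-in for the DBMS; the `SharedDbCache` and `AppServerInstance` classes, the `products` table, and the invalidate-on-write policy are assumptions made for illustration, not a description of any vendor's product. Because every instance reads and writes through the same cache, updates are applied in the database itself and the stale cached row is discarded.

```python
import sqlite3


class SharedDbCache:
    """Hypothetical read-through cache shared by all instances on a server."""

    def __init__(self, conn):
        self.conn = conn
        self.rows = {}     # product name -> cached quantity
        self.db_reads = 0  # counts real database reads

    def get_quantity(self, product):
        if product not in self.rows:
            self.db_reads += 1
            cur = self.conn.execute(
                "SELECT qty FROM products WHERE name = ?", (product,)
            )
            self.rows[product] = cur.fetchone()[0]
        return self.rows[product]

    def sell(self, product, units):
        # Let the database compute the new value, then drop the cached row.
        with self.conn:  # commits on success
            self.conn.execute(
                "UPDATE products SET qty = qty - ? WHERE name = ?",
                (units, product),
            )
        self.rows.pop(product, None)


class AppServerInstance:
    """One application server process; it holds no cache of its own."""

    def __init__(self, cache):
        self.cache = cache

    def quantity_on_hand(self, product):
        return self.cache.get_quantity(product)

    def sell(self, product, units):
        self.cache.sell(product, units)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT PRIMARY KEY, qty INTEGER)")
conn.execute("INSERT INTO products VALUES ('widget', 100)")
conn.commit()

cache = SharedDbCache(conn)
instance_a = AppServerInstance(cache)
instance_b = AppServerInstance(cache)

instance_a.quantity_on_hand("widget")  # miss: one database read
instance_b.quantity_on_hand("widget")  # hit: served from the shared cache
instance_a.sell("widget", 10)
instance_b.sell("widget", 5)
final_qty = instance_a.quantity_on_hand("widget")  # miss after invalidation

print(final_qty, cache.db_reads)
```

Both instances now see 85, and only two of the reads reached the database. The sketch covers a single server only; caches on separate physical servers would still need the synchronization described above, and each update costs an invalidation, which is why the approach pays off mainly when reads far outnumber updates.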

Oracle has used database caching within its own Oracle Application Server, and has attempted to use caching as a competitive advantage. IBM is naturally positioned to offer integrated database caching for its DB2 DBMS, but has not introduced such a capability at this writing. Several third-party products have been introduced as database caches for application servers, including products from several of the object-oriented database vendors and from in-memory database vendors. Whether database caching will substantially impact the application server market is still an open question.

Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.
