SQL Database Access from Application Servers

The convergence of the application server market around the J2EE specification effectively standardized the external interface between the application server and a DBMS around JDBC. Conceptually, an application server can automatically access any database product that offers a JDBC-compliant API, thus achieving DBMS independence. In practice, subtle differences between the DBMS systems in areas like SQL dialects and database naming still require some tweaking and testing, and manifest themselves in subtle dependencies within the code deployed on the application server. However, these differences tend to be minor, and adjusting for them is relatively straightforward.

The approach to data management for the application code running on the application server is a slightly more complicated story. While the application server does provide uniform services for data management, it provides these in several different architectures, using the different types of EJBs in the J2EE specification. The application designer must choose among these approaches, and in some cases, will mix and match them to achieve the requirements of the application. Here are some of the decisions that must be made:

Will the application logic do direct database access from within a session bean, or will database contents be represented as entity beans, with database access logic encapsulated within them?
If direct access from session beans is used, can the session bean remain stateless (which simplifies the coding of the bean and its management by the application server), or does the logic of database access require the bean to be stateful, preserving a context from one invocation to another?
If entity beans are used to represent database contents, can the application rely on the container-managed persistence provided by the application server to manage database interaction, or does the application’s logic require that the entity bean provide its own database access logic through bean-managed persistence?
If entity beans are used to model database contents, do the beans correspond on a one-to-one basis to the tables in the underlying database (fine-grained modeling), or is it more appropriate for the beans to present a higher-level, more object-oriented view of the data, with the data within each bean drawn from multiple database tables (coarse-grained modeling)?

The trade-offs represented by these design questions provide an excellent perspective on the challenge of matching SQL and relational database technology to the demands of the World Wide Web and its stateless architecture, and the demands of application servers and object-oriented programming. The next several sections describe the basics of EJBs and the trade-offs among the different data access architectures they can support.

1. EJB Types

Within a J2EE-compliant application server, the user-developed Java applications code that implements the specific business logic is packaged and executes as a collection of EJBs. An EJB has a well-defined set of external interfaces (methods) that it must provide, and is written with an explicit set of class-specific public methods that define the external interface to the bean. The work done within the bean, and any private data variables that it maintains for its own use, can be encapsulated and hidden from other beans and from developers who do not need to know these internal details and should not write code that depends on them.

The EJBs execute on the application server within a container, which provides both a runtime environment for the beans and services for them. These range from general services, such as managing memory for the beans and scheduling their execution, to specific services like network access and database access (via JDBC). The container also provides persistence services, preserving the state of beans across activations.

EJBs come in two major types that are of interest from a data management perspective, illustrated in Figure 22-3:

Session beans. These beans represent individual user sessions with the application server. Conceptually, there is a one-to-one association between each session bean and a current user. In the figure, users Mary, Joe, and Sam are each represented by their own session bean. If there are internal instance variables within the bean, these variable values represent the current state associated with the user during this particular session.
Entity beans. These beans represent business objects, and logically correspond to individual rows of a database table. For example, for entity beans representing sales offices, there is a one-to-one association between each entity bean and a particular office, which is also represented in our sample database by a single row in the OFFICES table. If there are internal instance variables within the bean, these variable values represent the current state associated with the office, which is also represented by the column values in this row of the OFFICES table. This state is independent of any particular user session.

Either type of bean may access a database, but they will typically do it in quite different ways.

2. Session Bean Database Access

A session bean will typically access a database in a series of one or more JDBC calls on behalf of the user represented by the bean. An application server classifies session beans into two categories, depending on how the bean manages state:

Stateless session bean. This type of bean does not maintain any status information across method invocations. It carries out its actions on behalf of one user at a time, and one request at a time. Each request to the bean is independent of the last. With this restriction, every invocation of the bean must carry with it (in the form of the parameters passed with the invocation) all of the information needed to carry out the request.
Stateful session bean. This type of bean maintains status information across method invocations. The bean needs to “remember” information from its previous invocations (its state) to carry out the tasks requested by later invocations. It uses private instance variables to hold the information.

The next two sections show examples of application tasks that are most easily implemented as each type of session bean. You specify whether a session bean is stateless or stateful in the deployment descriptor for the bean, which contains information supplied to the application server on which the bean is deployed.

An application server on a busy web site can easily have more session beans and other EJBs in use than it has main memory available to store them. In this situation, the application server will keep a limited number of session bean instances active in its main memory. If a user associated with a currently inactive session bean becomes active (i.e., one of his or her web site clicks must be processed), the application server chooses another instance of the same bean class and passivates it—that is, it saves the values of any instance variables defined for the bean and then reuses the bean to serve the user session needing activation.

Whether a session bean is stateful or stateless has a significant impact on this passivation and activation. Since a stateless session bean does not need its status preserved across method invocations, the application server does not need to save its instance variable values when it passivates the bean, and does not need to restore instance variable values when it reactivates the bean. But for a stateful session bean, the application server needs to copy its instance variable values to persistent storage (a disk file or a database) when it passivates the bean, and then restore those values when it reactivates the bean. Thus, stateful session beans can significantly diminish the performance and throughput of an application server on a busy site. Stateless beans are preferable for performance, but many applications are difficult or impossible to implement without using stateful beans.
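The save-and-restore cycle for a stateful bean can be sketched with standard Java serialization. This is only an illustration under stated assumptions: CartState and PassivationSketch are hypothetical names, and a byte array stands in for the disk file or database that a real application server would use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stateful bean: its instance variables are the state
// that must survive passivation.
class CartState implements Serializable {
    private static final long serialVersionUID = 1L;
    final String userName;
    final List<String> items = new ArrayList<>();

    CartState(String userName) {
        this.userName = userName;
    }
}

// Hypothetical stand-in for the container's passivation machinery.
class PassivationSketch {
    // Passivate: copy the instance variable values to a byte array
    // (a real container would write to a disk file or a database).
    static byte[] passivate(CartState bean) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(bean);
        }
        return bytes.toByteArray();
    }

    // Activate: rebuild a bean instance from the saved values.
    static CartState activate(byte[] saved) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(saved))) {
            return (CartState) in.readObject();
        }
    }
}
```

A stateless bean would skip both steps, which is where its performance advantage comes from.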

3. Using JDBC from a Stateless Session Bean

Figure 22-4 shows a simple example of an application that can easily be handled with stateless session bean database access. A page on a web site displays the current price of a company’s stock when the page is displayed. The page can’t be static, since the displayed price will change minute by minute. So when the user’s browser requests the page, the web server hands off the request to an application server, which eventually invokes a method of a session bean. The session bean can use JDBC to send a SQL SELECT statement to a database of current stock prices, and receives back the answer as one line of query results. The session bean reformats the stock quote as a fragment of a web page, and passes it back to the web server for display to the user.

Stateless session beans can perform more complex functions as well. Suppose the same company has a page on its web site where a user can request a product catalog by filling in the contents of a small form. When the form is filled in and the user clicks the Send button, the browser sends the data from the form to the web server, which again hands off the request to an application server. This time, a different method of the session bean is invoked, and receives the data from the form as parameters. The session bean can use JDBC to send a SQL INSERT statement to a database table holding pending catalog requests.

In each of these examples, all of the information that the session bean needs to carry out its task is passed to it with the method invocation. When the bean has completed its task, the information is not needed anymore. The next invocation again receives everything it needs as parameters, so there is no need to carry over status information. Even more important, the database activity on each invocation is completely independent from every other invocation. No database transaction spans multiple method invocations.
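The two stateless examples might look roughly like the following JDBC sketch. The class name, the STOCK_PRICES and CATALOG_REQUESTS tables, and their columns are all assumptions made for illustration; the text does not give a schema. A real session bean would also implement the EJB interfaces and obtain its connection from the container rather than receive it as a parameter.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical stateless bean logic: no instance variables, so nothing
// has to survive from one invocation to the next.
class StatelessSiteLogic {
    // Assumed table and column names; the text does not give a schema.
    static final String QUOTE_SQL =
        "SELECT PRICE FROM STOCK_PRICES WHERE SYMBOL = ?";
    static final String REQUEST_SQL =
        "INSERT INTO CATALOG_REQUESTS (NAME, ADDRESS) VALUES (?, ?)";

    // Stock quote page: one SELECT, one row back, reformatted for the page.
    static String quoteFragment(Connection conn, String symbol) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(QUOTE_SQL)) {
            stmt.setString(1, symbol);
            try (ResultSet rs = stmt.executeQuery()) {
                BigDecimal price = rs.next() ? rs.getBigDecimal(1) : null;
                return format(symbol, price);
            }
        }
    }

    // Catalog request form: the form fields arrive as parameters.
    static void requestCatalog(Connection conn, String name, String address)
            throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(REQUEST_SQL)) {
            stmt.setString(1, name);
            stmt.setString(2, address);
            stmt.executeUpdate();
        }
    }

    // Build the page fragment; the symbol is assumed to need no HTML escaping.
    static String format(String symbol, BigDecimal price) {
        String shown = (price == null) ? "n/a" : price.toPlainString();
        return "<span class=\"quote\">" + symbol + ": " + shown + "</span>";
    }
}
```

Each method opens and closes its own statement, and no method leaves anything behind for the next call.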

4. Using JDBC from a Stateful Session Bean

Many web interactions can’t live with the limitations imposed by stateless session beans. Consider a more complex web-based form that spans four pages. As the user fills out each page and sends it to the web site, the session bean must accumulate the information and retain it across the four page clicks until all of the data is ready to be captured into a database. The need to retain information across method invocations calls for a stateful session bean.

Another example in which a stateful session bean is appropriate is a commercial web site where a user shops online and accumulates a list of items to be purchased in an online shopping cart. After 40 or 50 clicks through the web site, the user may have accumulated five or six items in the shopping cart. If the user then clicks a button requesting display of the current shopping cart contents, those contents are probably most easily maintained as session bean state.

In both of these examples, the session bean requires continuity of database access to effectively accomplish its tasks. Figure 22-5 shows the pattern, and the contrast to the pattern of interactions in Figure 22-4. Even if the bean can be implemented without instance variables (for example, by storing all of its state information in a back-end database), it needs one continuous database session to carry out its database access. The client-side API for the DBMS maintains this session, and the API itself will need to maintain session-state information across session bean method invocations.
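The four-page form can be sketched as follows. The class name and the FORM_DATA table with its columns are assumed for illustration, and the connection is passed in only to keep the sketch self-contained; a deployed stateful session bean would rely on the container for both its lifecycle and its database connection.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stateful bean logic for a form that spans four pages.
class MultiPageFormLogic {
    static final int PAGE_COUNT = 4;

    // Instance variable: the state that must survive between invocations.
    private final Map<Integer, String> pages = new LinkedHashMap<>();

    // Called once per page click; it only accumulates data, with no
    // database work yet.
    void submitPage(int pageNumber, String data) {
        if (pageNumber < 1 || pageNumber > PAGE_COUNT) {
            throw new IllegalArgumentException("no such page: " + pageNumber);
        }
        pages.put(pageNumber, data);
    }

    boolean isComplete() {
        return pages.size() == PAGE_COUNT;
    }

    // Called after the last page: capture everything in one INSERT.
    // FORM_DATA and its columns are assumed names.
    void save(Connection conn) throws SQLException {
        if (!isComplete()) {
            throw new IllegalStateException("form is not complete");
        }
        String sql =
            "INSERT INTO FORM_DATA (PAGE1, PAGE2, PAGE3, PAGE4) VALUES (?, ?, ?, ?)";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            for (int page = 1; page <= PAGE_COUNT; page++) {
                stmt.setString(page, pages.get(page));
            }
            stmt.executeUpdate();
        }
    }
}
```

The pages map is exactly the kind of instance variable the application server has to save and restore if it passivates the bean between clicks.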

5. Entity Bean Database Access

It’s possible to implement complete, sophisticated web site applications using session beans deployed on a J2EE application server. However, programming an application using session beans tends to produce more procedural, and less object-oriented code. The object-oriented philosophy is to have object classes (in this case, EJB classes) represent real-world entities, such as customers or offices, and object instances represent individual customers or offices. But session beans don’t represent any of those entities; they represent currently active user sessions. When database interaction is handled directly by session beans, the representation of real-world entities is basically left in the database; it doesn’t have an object counterpart.

Entity beans provide the object counterpart for real-world entities and the rows in a database that represent them. Entity bean classes embody customers and offices; individual entity bean instances represent individual customers and individual offices. Other objects (such as session beans) within the application server can interact with customers and offices using object-oriented techniques, by invoking the methods of the entity beans that represent them.

To maintain this object-oriented model, there must be very close cooperation between the entity-bean representations of entities and their database representations.

If a session bean invokes a customer entity bean method that changes a customer’s credit limit, that change must be reflected in the database, so that an order-processing application using the database will use the new limit. Similarly, if an inventory management application adds to the quantity on hand for a particular product in the database, that product’s entity bean in the application server must be updated.

Just as an application server will passivate and reactivate session beans as necessary, it will passivate and reactivate entity beans repeatedly in response to a heavy workload. Before the application server passivates an entity bean, the bean’s state must be saved in a persistent way, by updating the database. Similarly, when the application server reactivates an entity bean, its instance variables must be set to their values just before it was passivated, by reloading those values from the database. The entity bean class defines callback methods that an entity bean must provide to implement this synchronization.

There is close correspondence between actions carried out on entity beans and database actions, as shown in Table 22-1. The J2EE specification provides two alternative ways to manage this coordination:

Bean-managed persistence. The entity bean itself is responsible for maintaining synchronization with the database. The application programmer who develops the entity bean and codes its implementation methods must use JDBC to read and write data in the database when necessary. The application server container notifies the bean when it takes actions that require database interaction.
Container-managed persistence. The EJB container provided by the application server is responsible for maintaining synchronization with the database. The container monitors interaction with the entity bean, and automatically uses JDBC to read and write data in the database and update the instance variables within the bean when needed. The application programmer who develops the entity bean and codes its implementation methods can focus on the business logic in the bean, and assumes that its instance variables will accurately represent the state of the data in the database.

Note that entity beans are always stateful—the distinction between these two bean types is not the difference between stateless and stateful beans, but rather, the difference between who is responsible for maintaining proper state. The next two sections discuss the practical issues associated with each type of entity bean, and the trade-offs between them.

5.1. Using Container-Managed Persistence

An entity bean’s deployment descriptor specifies that the bean requires container-managed persistence. It also specifies the mapping between instance variables of the bean and columns in the underlying database, and identifies the primary key that uniquely identifies the bean and the corresponding database row. The primary key value is used in the database operations that store and retrieve variable values.

With container-managed persistence, the EJB container is responsible for maintaining synchronization between the entity bean and the database row. The container calls JDBC to store instance variable values into the database, to restore those values, to insert a new row into the database, and to delete a row—all as required by actions on the bean. The container will call the bean’s ejbStore() callback method before it stores values in the database, to notify the bean that it must get its variable values into a consistent state. Similarly, the container will call the bean’s ejbLoad() callback method after loading values from the database, to allow the bean to do appropriate post processing (for example, calculating a value that was not itself persisted, based on values that were). In the same way, the bean’s ejbRemove() method will be called before the container deletes the row from the database, and ejbCreate() and ejbPostCreate() are called in conjunction with inserting a new row. For many entity beans, these callback methods will be empty, since the container handles the actual database operations.
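The ordering of the ejbLoad() and ejbStore() callbacks relative to the container's own database work can be shown with a toy simulation. Nothing here is a real container API: ToyContainer and OfficeCmpBean are hypothetical, a Map stands in for the database row, and the TARGET and SALES column names are assumed.

```java
import java.util.Map;

// Hypothetical entity bean for one OFFICES row (column names are assumed).
class OfficeCmpBean {
    // Persistent fields, mapped by the toy container below.
    double target;
    double sales;
    // Derived value that is not itself persisted.
    double percentOfTarget;

    // Called after the container has loaded the persistent fields.
    void ejbLoad() {
        percentOfTarget = (target == 0.0) ? 0.0 : 100.0 * sales / target;
    }

    // Called before the container stores the persistent fields.
    // Empty here, as the text says is common.
    void ejbStore() {
    }
}

// Toy stand-in for the container; a Map plays the part of the database row.
class ToyContainer {
    static void load(OfficeCmpBean bean, Map<String, Double> row) {
        bean.target = row.get("TARGET");
        bean.sales = row.get("SALES");
        bean.ejbLoad();   // the callback comes after the load
    }

    static void store(OfficeCmpBean bean, Map<String, Double> row) {
        bean.ejbStore();  // the callback comes before the store
        row.put("TARGET", bean.target);
        row.put("SALES", bean.sales);
    }
}
```

The bean contains no database code at all; its only persistence-related logic is the post-processing in ejbLoad().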

5.2. Using Bean-Managed Persistence

If an entity bean’s deployment descriptor specifies bean-managed persistence, the container assumes that the entity bean will handle its own database interaction. When a new entity bean is first created, the container calls the bean’s ejbCreate() and ejbPostCreate() methods. The bean is responsible for processing the corresponding INSERT statement for the database. Similarly, when an entity bean is to be removed, the container calls the bean’s ejbRemove() method. The bean is responsible for processing the corresponding DELETE statement for the database, and when the bean returns from the ejbRemove() method, the container is free to actually remove the bean itself and reuse its storage.

Bean loading is similarly handled by a container call to ejbLoad(), and storing by a call by the container to ejbStore(). The bean is similarly notified of passivation and activation by callbacks from the container. Of course, nothing limits the entity bean’s database interaction to these callback methods. If the bean needs to access the database during the execution of one of its methods, the bean can make whatever JDBC calls it needs. The JDBC calls within the callback methods are strictly focused on managing bean persistence.
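A bean-managed entity bean for the OFFICES table might carry JDBC code along these lines. This is a simplified sketch, not the actual EJB contract: the OFFICE and CITY column names are assumed, the real callbacks take no connection argument (the bean would obtain one from the container), and a real ejbCreate() returns the primary key.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical bean-managed entity bean for one OFFICES row.
class OfficeBmpBean {
    int office;     // primary key
    String city;

    // Creation: the bean issues the INSERT itself.
    void ejbCreate(Connection conn, int office, String city) throws SQLException {
        this.office = office;
        this.city = city;
        update(conn, "INSERT INTO OFFICES (OFFICE, CITY) VALUES (?, ?)", office, city);
    }

    // Loading: refresh the instance variables from the row.
    void ejbLoad(Connection conn) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(
                "SELECT CITY FROM OFFICES WHERE OFFICE = ?")) {
            stmt.setInt(1, office);
            try (ResultSet rs = stmt.executeQuery()) {
                if (!rs.next()) {
                    throw new SQLException("no row for office " + office);
                }
                city = rs.getString(1);
            }
        }
    }

    // Storing: write the instance variables back to the row.
    void ejbStore(Connection conn) throws SQLException {
        update(conn, "UPDATE OFFICES SET CITY = ? WHERE OFFICE = ?", city, office);
    }

    // Removal: the bean issues the DELETE itself.
    void ejbRemove(Connection conn) throws SQLException {
        update(conn, "DELETE FROM OFFICES WHERE OFFICE = ?", office);
    }

    // Shared helper: bind the parameters in order and run the statement.
    private static void update(Connection conn, String sql, Object... params)
            throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            for (int i = 0; i < params.length; i++) {
                stmt.setObject(i + 1, params[i]);
            }
            stmt.executeUpdate();
        }
    }
}
```

Each callback maps to one SQL statement, which is the correspondence between bean actions and database actions that the container would otherwise manage.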

5.3. Container-Managed and Bean-Managed Trade-Offs

You might naturally ask why you would ever want to use bean-managed persistence when container-managed persistence eliminates the need to worry about synchronizing with the database. The answer is that container-managed persistence has some limitations:

Multiple databases. For most application servers, entity beans must be mapped into a single database server. If entity bean data comes from multiple databases, then bean-managed persistence may be the only way to handle database synchronization.
Multiple tables per bean. Container-managed persistence works well when all of the instance variables for an entity bean come from a single row of a single table—i.e., when there is a one-to-one correspondence between bean instances and table rows. If an entity bean needs to model a more complex object, such as an order header and individual line items of an order, which come from two different, related tables, bean-managed persistence is usually required, because the bean’s own code must provide the intelligence to map to and from the database.
Performance optimizations. With container-managed persistence, a container must make an all-or-nothing assumption about persisting instance variables. Every time the variables must be stored or loaded, all of the variables must be handled. In many applications, the entity bean may be able to determine that depending on its particular state, only a few of the variables need to be processed. If the entity bean holds a lot of data, the performance difference can be significant.
Database optimizations. If the methods of an entity bean that implement its business logic involve heavy database access (queries and updates), then some of the database operations that the container will carry out in a container-managed persistence scheme may be redundant. If bean-managed persistence is used instead, the bean may be able to determine exactly when database operations are required for synchronization and when the database is already up to date.
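The performance argument can be made concrete with a small helper that a bean-managed entity bean might use to store only the instance variables that have changed. DirtyColumns is a hypothetical class, not part of any EJB API, and the table and column names passed to it are up to the caller.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: remembers which columns changed since the last
// load or store, so ejbStore() can skip the rest.
class DirtyColumns {
    private final Map<String, Object> changed = new LinkedHashMap<>();

    // Record a changed column and its new value.
    void markChanged(String column, Object newValue) {
        changed.put(column, newValue);
    }

    // True when the database is already up to date.
    boolean isClean() {
        return changed.isEmpty();
    }

    // Build an UPDATE that names only the changed columns.
    String updateSql(String table, String keyColumn) {
        if (changed.isEmpty()) {
            throw new IllegalStateException("nothing to store");
        }
        List<String> assignments = new ArrayList<>();
        for (String column : changed.keySet()) {
            assignments.add(column + " = ?");
        }
        return "UPDATE " + table + " SET " + String.join(", ", assignments)
            + " WHERE " + keyColumn + " = ?";
    }

    // Parameter values in the order the UPDATE expects them.
    List<Object> parameterValues(Object keyValue) {
        List<Object> values = new ArrayList<>(changed.values());
        values.add(keyValue);
        return values;
    }

    // Call after a successful store or a fresh load.
    void clear() {
        changed.clear();
    }
}
```

An ejbStore() built on this helper would return immediately when isClean() is true, avoiding the redundant database operation described above.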

In practice, these limitations often prevent the use of container-managed persistence in today’s deployments. Enhancements in newer versions of the EJB specification are designed to address many of these shortcomings. However, bean-managed persistence remains a very important technique with the currently available application servers.

6. EJB 2.0 Enhancements

EJB 2.0 represents a major revision to the EJB specification. Many of the enhancements in EJB 2.0 were incompatible with the corresponding capabilities in EJB 1.x. To avoid breaking EJB 1.x-compatible beans, EJB 2.0 provides complementary capabilities in these areas, allowing side-by-side coexistence of EJB 1.x and EJB 2.0 beans. A complete description of the differences between EJB 1.x and EJB 2.0 is well beyond the scope of this book. However, several of the differences were motivated by difficulties in using container-managed persistence under the EJB 1.x specification, and those changes directly affect database processing within EJBs.

One difficulty with EJB 1.x has already been mentioned: the difficulty of modeling complex objects that draw their data from multiple database tables or that contain nonrelational structures like arrays and hierarchical data. With EJB 1.x, you could model a complex object as a family of inter-related entity beans, each drawn from one table. This approach allowed the use of container-managed persistence, but the relationships between pieces of the object had to be implemented in applications code within the bean. Ideally, these internal details within the complex object should be hidden from applications code. Alternatively, with EJB 1.x, you could model a complex object as a single entity bean, with data in the bean’s instance variables drawn from multiple related tables. This achieves the desired application code transparency, but container-managed persistence could not be used when an entity bean draws its data from multiple tables.

EJB 2.0 addresses this issue through the use of abstract accessor methods, which are used to set and retrieve every persistent instance variable within an entity bean. The container actually maintains the storage for the variables and the variable values. The bean explicitly calls a get() accessor method to retrieve an instance variable value and a set() accessor method to set its value. Similarly, there are get() and set() abstract accessor methods for every relationship that links the rows in the database that contribute data to the entity bean. Many-to-many relationships are easily handled by mapping them into Java collection variables.

With these new features, the container has complete knowledge of all the instance variables used by a bean, and every access that code within the bean makes to the instance variables. The entity bean can represent a complex object that draws data from multiple database tables, hiding the details from the applications code. But container-managed persistence can now be used, because the container “knows” all about the various parts of the object and the relationships among the parts.
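The abstract accessor idea can be sketched in plain Java. OfficeBean follows the EJB 2.0 style of declaring persistent fields only as abstract get and set methods; GeneratedOfficeBean is a hand-written stand-in for the concrete subclass that a container would generate at deployment, and both names, like the CITY and SALES fields, are assumptions for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical EJB 2.0-style bean class: persistent fields exist only as
// abstract accessors, so the bean declares no storage of its own.
abstract class OfficeBean {
    public abstract String getCity();
    public abstract void setCity(String city);
    public abstract double getSales();
    public abstract void setSales(double sales);

    // Business logic reaches persistent state only through the accessors.
    public void recordSale(double amount) {
        setSales(getSales() + amount);
    }
}

// Stand-in for the subclass a container would generate. Because every
// access goes through these methods, the "container" sees each change.
class GeneratedOfficeBean extends OfficeBean {
    private String city;
    private double sales;
    final Set<String> modified = new LinkedHashSet<>();

    @Override public String getCity() { return city; }

    @Override public void setCity(String city) {
        this.city = city;
        modified.add("CITY");
    }

    @Override public double getSales() { return sales; }

    @Override public void setSales(double sales) {
        this.sales = sales;
        modified.add("SALES");
    }
}
```

Since the subclass owns the storage and intercepts every set call, it can tell which fields need to be written back, which is the knowledge the EJB 1.x container lacked.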

Another problem with the EJB 1.x specification is that while database interactions were standardized, the finder methods that are used to search the active entity beans were not. The finder methods implement capabilities like searching for a particular entity bean by primary key, or searching for the set of beans that match a particular criterion. Without this standardization, portability across application servers was compromised, and searches of entity beans often required recourse to searching the underlying database.

EJB 2.0 addresses the searching limitations through the use of abstract select methods that search entity beans. The select methods use a newly defined EJB 2.0 query language (EJB QL). While the query language is based on SQL-92, it includes constructs such as path expressions that are decidedly nonrelational.

Finally, EJB 2.0 was designed to align with the SQL:1999 standard and its abstract data types. Support for these types somewhat simplifies the interaction between entity beans and the database for DBMS products that support abstract types. At this time, few DBMS products support them.

Source: Liang Y. Daniel (2013), Introduction to programming with SQL, Pearson; 3rd edition.