SQL Server Replication: An Overview

Generally, replication is based on one of two different concepts:

Using transaction logs
Using triggers

As already stated in Chapter 16, the Database Engine keeps all values of modified rows (“before” as well as “after” values) in system files called transaction logs. If selected rows need to be replicated, the system starts a new process that reads the data from the transaction log and sends it to one or more target databases.

The other method is based on triggers. The modification of a table that contains data to be replicated fires the corresponding trigger, which in turn creates a new table with the data and starts a replication process.

Both concepts have their benefits and disadvantages. The log-based replication is characterized by improved performance, because the process that reads data from the transaction log runs asynchronously and has little effect on the performance of the overall system. On the other hand, the implementation of log-based replication is very complex for companies, because the Database Engine not only has to manage additional processes and buffers but also has to solve the concurrency problems between system and replication processes that access the transaction log.

NOTE The Database Engine uses both concepts: the transaction log method for transactional replication and triggers for merge replication. (Transactional and merge replications are described in detail later in this chapter.)

1. Publishers, Distributors, and Subscribers

The Database Engine replication is based on the so-called publisher-subscriber metaphor. This metaphor describes the different roles servers can play in a replication process. One or more servers publish data that other servers can subscribe to. In between there exists a distributor that stores the changes and forwards them further (to the subscribers). Hence, a node can have one (or more) different roles in a replication scenario:

Publisher (or publishing server) Maintains its source databases, makes data available for replication, and sends the modified data to the distributor
Distributor (or distribution server) Receives all changes to the replicated data from the publisher and stores and forwards them to the appropriate subscribers
Subscriber (or subscription server) Receives and maintains published data

A database server can play one or more roles in a replication process. For example, a server can act as the publisher and the distributor at the same time. This scenario is appropriate for a process with few replications and few subscribers. If there are a lot of subscribers for the publishing information, the distributor can be located on its own server. Figure 18-1 shows a simple scenario in which one instance is both the publishing and distribution server and three other instances are subscription servers. (The section “Replication Models” later in this chapter discusses in detail possible replication scenarios.)

NOTE You can replicate only user-defined databases.

2. Publications and Articles

The unit of data to be published is called a publication. An article contains data from a table and/or one or more stored procedures. A table article can be a single table or a subset of data in a table. A stored procedure article can contain one or more stored procedures that exist at the publication time in the database. A publication contains one or more articles. Each publication can contain data only from one database.

NOTE A publication is the basis of a subscription. This means that you cannot subscribe directly to an article, because an article is always part of a publication.

A filter is the process that restricts information, producing a subset. Therefore, a publication contains one or more of the following items that specify types of table articles:

Table
Vertical filter
Horizontal filter
A combination of vertical and horizontal filters

A vertical filter contains a subset of the columns in a table. A horizontal filter contains a subset of rows in a table.

Publications are tightly connected to subscriptions. A subscription can be initiated in two different ways:

Using a push subscription
Using a pull subscription

With a push subscription, all the administration of setting up subscriptions is performed on the publisher during the definition of a publication. (Besides the publisher, the distributor also creates and manages push subscriptions.) Push subscriptions simplify and centralize administration, because the usual replication scenario contains one publisher and many subscribers. The benefit of a push subscription is higher security, because the initialization process is managed at one place. On the other hand, the performance of the distributor can suffer because the overall distribution of subscriptions runs at once.

With a pull subscription, the subscriber initiates and manages the subscription. The pull subscription is more selective than the push subscription, because the subscriber can select publications to subscribe to. In contrast to the push subscription, the pull subscription should be used for publications with a high number of subscribers.

NOTE The downloading of data using the Internet is a typical form of pull subscription.

There is a special type of pull subscription called anonymous subscription. Generally, information concerning subscribers is kept on the distribution server. If the workload on this server should be reduced (because of too many subscribers, for instance), it is possible to allow subscribers to initiate their own (“anonymous”) subscriptions.

3. Agents

During the data replication process, the Database Engine uses several agents to manage different tasks. The system supports, among others, the following agents:

Snapshot agent
Log Reader agent
Distribution agent
Merge agent

These agent types are discussed following a brief introduction to the distribution database.

4. The distribution Database

The distribution database is a system database that is installed on the distribution server when the replication process is initiated. This database holds all replicated transactions from the publisher that need to be forwarded to the subscribers.

In many cases, a single distribution database is sufficient. However, if multiple publishing servers communicate with a single distribution server, you can create a distribution database for each publishing server. Doing so ensures that the data flowing through each distribution database is distinct.

Snapshot Agent

The Snapshot agent generates the schema and data of the published tables and stores them on the distribution server. The schema of a table and the corresponding data file build the synchronization set that represents the snapshot of the table at a particular time. (A snapshot is essentially what it sounds like: a snapshot of the data to be replicated.) The status of the synchronization of that set is recorded in the distribution database. Whether the Snapshot agent creates new snapshot files each time it runs depends on the type of replication and options chosen.

4.1. Log Reader Agent

If the transaction log of the system is used to replicate data, all transactions that contain the data to be replicated are marked for replication. A component called the Log Reader agent searches for marked transactions and copies them from the transaction log on the publisher to the distribution server. These transactions are stored in the distribution database. Each database that uses the transaction log for replication has its own Log Reader agent running on the distribution server.

4.2. Distribution Agent

After the transactions and snapshots are stored in the distribution database, they have to be moved to the subscribers. This task is handled by the Distribution agent, which moves transactions and snapshots to subscribers, where they are applied to the target tables in the subscription databases.

The task of the Distribution agent is different for pull and push subscriptions. For push subscriptions, the agent pushes out the changes to the subscriber. For pull subscriptions, the agent pulls the transactions from the distribution server. (All actions that change data on the publisher are applied to the subscriber in chronological order.)

4.3. Merge Agent

As you already know, the Snapshot agent prepares files containing the table schema and data and stores them at the distributor site. If both the publisher and subscribers can update replicated data, then a synchronization job is necessary that sends all changed data to the other sites. This job is performed by the Merge agent. In other words, the Merge agent can send replicated data to the subscribers and to the publisher. Before the send process is started, the Merge agent also stores the appropriate information that is used to track possible conflicts.

5. Replication Types

The Database Engine supports several replication types, which are discussed in the following subsections:

Transactional
Peer-to-peer
Snapshot
Merge

5.1. Transactional Replication

In transactional replication, the transaction log of the system is used to replicate data. All transactions that contain the data to be replicated are marked for replication. The Log Reader agent searches for marked transactions and copies them from the transaction log on the publisher to the distribution database. The Distribution agent moves transactions to subscribers, where they are applied to the target tables in the subscription databases.

NOTE All tables published using transactional replication must explicitly contain a primary key. The primary key is required to uniquely identify the rows of the published table, because a row is the transfer unit in transactional replication.

Transactional replication can replicate tables (or parts of tables) and one or more stored procedures. The use of stored procedures by transactional replication increases performance, because the amount of data to be sent over a network is usually significantly smaller. Instead of replicated data, only the stored procedure is sent to the subscribers, where it is executed.

You can configure the delay of synchronization time between the publisher on one side and subscribers on the other during a transactional replication. (All these changes are propagated by the Log Reader and Distribution agents.)

NOTE Before transactional replication can begin, a copy of the entire database must be transferred to each subscriber; this is performed by executing a snapshot.

A special form of transactional replication is peer-to-peer transactional replication, discussed next.

5.2. Peer-to-Peer Transactional Replication

Peer-to-peer is another form of transactional replication, in which each server is at the same time a publisher, distributor, and subscriber for the same data. In other words, all servers contain the same data, but each server is responsible for the modification of its own partition of data. (Note that data partitions on different servers can intersect.)

Peer-to-peer transactional replication is best explained through an example. Suppose that a company has several branch offices in different cities and that each office server has the same data set as all other servers. On the other hand, the entire data set is partitioned in subsets, and each office server can update only its own subset of data. When data is modified on one of the office servers, the changes are replicated to all other servers (subscribers) in the peer-to-peer network. (Users in each office can read data without any restrictions.)

The benefits of this replication form are

The entire system scales well.
The entire system provides high availability.

A system that supports peer-to-peer transactional replication scales well because each server serves only local users. (Users can update only the data partition that belongs to their local server. For read operations, all data is stored locally, too.)

The high availability is based on the fact that if one or more servers go offline, all other servers can continue to operate, because all data they need for read and write operations is stored locally. When an offline server is online again, the replication process restarts and the server receives all data modifications that have happened at the other sites.

Conflict Detection in Peer-to-Peer Replication With peer-to-peer replication, you can change data at any node. Therefore, data changes at different nodes could conflict with each other. (If a row is modified at more than one node, it can cause a conflict.)

The Database Engine supports the option to enable conflict detection across a configured topology. With this option enabled, a conflicting change is treated as a critical error that causes the failure of the Distribution agent. In the event of a conflict, the scenario remains in an inconsistent state until the conflict is resolved and the data is made consistent on all participating servers.

NOTE You can enable conflict detection using the system procedures sp_addpublication and sp_configure_peerconflictdetection.

Conflicts in peer-to-peer transactional replication are detected by the stored procedures that apply changes to each node, based on a hidden column in each published table. This hidden column stores an identifier that combines a unique ID that you specify for each node and the version of the row. The procedures are executed by the Distribution agent and they apply insert, update, and delete operations from other peers. If one of the procedures detects a conflict when it reads the hidden column value, it raises an error.

NOTE The hidden column can be accessed only by a user that is logged in through the dedicated administrator connection (DAC). For the description of DAC, see Chapter 15.

When a conflict occurs in peer-to-peer transactional replication, the “Peer-to-peer conflict detection alert” is raised. You should configure this alert so that you are notified when a conflict occurs. (The previous chapter explains how alerts can be configured and discusses the ways to notify operators.) Microsoft Docs describes several approaches for handling the conflicts that occur.

NOTE You should try to avoid conflicts in a peer-to-peer replication, even if conflict detection is enabled.

5.3. Snapshot Replication

The simplest type of replication, snapshot replication, copies the data to be published from the publisher to all subscribers. (The difference between snapshot replication and transactional replication is that the former sends all the published data to the subscribers and the latter sends only the changes of data to the subscribers.)

NOTE Transactional and snapshot replications are one-way replications, meaning the only changes to the replicated data are made at the publishing server. Therefore, the data at all subscription servers is read-only, except for the changes made by replication processes.

In contrast to transactional replication, snapshot replication requires no primary key for tables. The reason is obvious: the unit of transfer in snapshot replication is a snapshot file and not a row of a table. Another difference between these two replication types concerns a delay in time: snapshot replication is replicated periodically, which means the delay is significant because all published data (changed and unchanged) is transferred from the publisher to the subscribers.

NOTE Snapshot replication does not use the distribution database directly. However, the distribution database contains status information and other details that are used by snapshot replication.

5.4. Merge Replication

In transactional and snapshot replication, the publisher sends the data, and a subscriber receives it. (There is no possibility that a subscriber sends replicated data to the publisher.) Merge replication allows the publisher and subscribers to update data to be replicated. Because of that, conflicts can arise during a replication process.

When you use the merge replication scenario, the system makes three important changes to the schema of the publication database:

It identifies a unique column for each replicated row.
It adds several system tables.
It creates triggers for tables in which data is replicated.

The Database Engine creates or identifies a unique column in the table with the replicated data. If the base table already contains a column with the UNIQUEIDENTIFIER data type and the ROWGUIDCOL property, the system uses that column to identify each replicated row. If there is no such column in the table, the system adds the column rowguid of the UNIQUEIDENTIFIER data type with the ROWGUIDCOL property.

NOTE UNIQUEIDENTIFIER columns may contain multiple occurrences of a value. The ROWGUIDCOL property additionally indicates that the values of the column of the UNIQUEIDENTIFIER data type uniquely identify rows in the table. Therefore, a column of the data type UNIQUEIDENTIFIER with the ROWGUIDCOL property contains unique values for each row across all networked computers in the world and thus guarantees the uniqueness of replicated rows across multiple copies of the table on the publisher and subscribers.

The addition of new system tables provides the way to detect and resolve any update conflict. The Database Engine stores all changes concerning the replicated data in the merge system tables msmerge_contents and msmerge_tombstone and joins them with the table that contains replicated data to resolve the conflict.

The Database Engine creates triggers on tables that contain replicated data on all sites to track changes to the data in each replicated row. These triggers determine the changes made to the table, and they record them in the msmerge_contents and msmerge_tombstone system tables. Conflict detection is done by the Merge agent using the column lineage of the msmerge_contents system table when a conflict is detected. The resolution of it can be either priority based or custom based.

Priority-based resolution means that any conflict between new and old values in the replicated row is resolved automatically based on assigned priorities. (The special case of the priority-based method specifies the “first wins” method, where the timely first change of the replicated row is the winner.) The priority-based method is the default. The custom-based method uses customized triggers based on business rules defined by the database administrator to resolve conflicts.

6. Replication Models

The previous section introduced different replication types that the Database Engine uses to distribute data between different nodes. The replication types (transactional, snapshot, peer- to-peer, and merge) provide the functionality for maintaining replicated data. Replication models are used by a company to design its own data replication. Each replication model can be implemented using one or more existing replication types. Both the replication type and replication model are usually specified at the same time.

Depending on requirements, several replication models can be used. The basic ones are as follows:

Central publisher with distributor
Central publisher with a remote distributor
Central subscriber with multiple publishers
Multiple publishers with multiple subscribers

The following sections describe these models.

6.1. Central Publisher with Distributor

In the central publisher with distributor model, there is one publisher and usually one distributor, which are hosted on one instance of the Database Engine (see Figure 18-1 at the beginning of this chapter). The publisher creates publications that are distributed by the distributor to several subscribers. The publications designed by this model and received at a subscriber are usually read-only.

The advantage of this model is its simplicity. For this reason, the model is usually used to create a copy of a database, which is then used for interactive queries and simple report generation. (Another situation for using this model is to maintain a remote copy of a database, which could be used by remote systems in the case of communication breakdown.)

For this model, you can use the transactional replication type.

6.2. Central Publisher with a Remote Distributor

If the amount of publishing data is not very large, the publisher and distributor can reside on one server. Otherwise, if you have a heavy load, using two separate servers for publishing and distribution is recommended because of performance issues. (If there is a heavy load of data to be published, the distributor is usually the bottleneck.) Figure 18-2 shows the replication model with the central publisher and a separate distributor.

NOTE This scenario can be used as a starting point to increase a number of publishing servers and/or subscribing servers.

6.3. Central Subscriber with Multiple Publishers

The scenario described at the beginning of this chapter of the traveling salesperson who transmits data to headquarters is a typical example of the central subscriber with multiple publishers. The data is gathered at a centralized subscriber, and several publishers send their data.

For this model, you can use either the peer-to-peer transactional replication type or the merge replication type, depending on the use of replicated data. If publishers publish (and therefore update) the same data to the subscriber, merge replication should be used. If each publisher has its own data to publish, peer-to-peer transactional replication should be used. (In this case, published tables will be filtered horizontally, and each publisher will be the owner of a particular table fragment.)

6.4. Multiple Publishers with Multiple Subscribers

The replication model in which some or all of the servers participating in data replication play the role of the publisher and the subscriber is known as multiple publishers with multiple subscribers. In most cases, this model includes several distributors that are usually placed at each publisher (see Figure 18-3).

This model can be implemented using merge replication only, because publications are modified at each publishing server. (The only other way to implement this model is to use the distributed transactions with two-phase commit.)

Source: Petkovic Dusan (2020), Microsoft SQL Server 2019: A Beginner’s Guide, Seventh Edition-McGraw-Hill Education.