Replication: Microsoft SQL Server 2000 Tool

All replication operations are based on the principle that the changing of data on one data source is followed by the reproduction of that new data on one or more additional distributed locations. This may be done to reduce network congestion, to partition data according to geographic or product lines, or to provide a failsafe duplication of servers for critical data.

The roles in a replication scenario are those of published, distributor, and subscriber. As we will see, one server can in some circumstances play all three roles, but must play at least one role. The publisher contains the articles or objects that are to be published. The distributor contains the data that is to be changed and promulgated. The distributor may or may not be the same server as the publisher. When it is NOT the same, it must receive changed data from the publisher before it forwards that data to the subscriber. It is the distribution database that does most of the work in a replication.

The subscriber receives the changed data from the distributor. There may be multiple subscribers. It is possible to have a configuration where the subscribers also can make changes to the data. This is known as an updating subscriber. The changes made by the updating subscriber are then sent to the publisher, onto the distributor, and then onto the other subscribers.

The items subscribed to are known as publications, which can be tablesm wueries, stored procedures, indexes, and virtually any SQ: Server object. You can also export filtered sets of data, unique to each subscriber. These data may be filtered vertically, in which case only certain rows are sent to certain subscribers, or horizontally, in which only certain columns are sent. Or you can filter in both directions, and you can extend the filtering to include joins of other tables.

Scenarios

There are 6 basic configurations possible in any replication scenario, and the choice of them is a function of the requirements of the business. They are:

The subscriptions can be either push (managed by the publication server) win which articles are pushed to the subscribers, or pull (managed by the individual subscriber machines, in which articles are requested by the subscribers. A special case of anonymous subscriptions are common on the Internet is where anyone can subscribe and request articles. Publications are managed on whatever is the distribution server in the distribution database, which will contain all the transactions and all the information necessary to make the data transfers, publishers, articles, agents, subscribers, etc in special tables set up to monitor the process.

Agents

Replication is accomplished through SQL Server Agents which wake up and perform their tasks at programmed times. These can be seen and manipulated in the Enterprise Manager Console (Management|SQLServerAgent|Jobs)

Replication Implementation

Some implementations require immediate replications, such as an airline reservation system. This is called immediate transaction consistency. To accomplish this a two phase commit is used to assure that the transactions are implemented on all servers. If they are not, then the transaction is rolled back. This type of processing requires a high speed LAN.

Other applications require a less real time synchronization. This is called latent transactional consistency, and it provides for synchronization on all the servers, though not an immediate synchronization. The degree of latency is dictated by the goodness of the connections between the servers, and the business requirements of the application. Site autonomy is also a consideration in implementing these types of replication configurations. The degree to which the individual sites are programmed to recognize each other and accept changes is also important to the application as a whole.

There are at least 6 different models of data distribution. Each type implies a different amount of latency and site autonomy.

 

Snapshot replication is simply a copy of each table in a publication. That copy then is sent in its entirety to the subscribers. A schema and datafile are created and transmitted. It takes relatively a lot of bandwidth, but is simple to implement.

Transactional replication captures transactions from the transaction log and forwards it to the subscribers. It is an efficient method because only the minimum amount of data is moved. Updates are instantaneous, or at programmed intervals.

Merged replication allows updates at all sites, then the updated and added data is merged at the publishing site and to all subscribers. Because of multiple update possibilities, there are some times when conflicts will have to be resolved. This can be accomplished in a stored procedure that contains the logical workflow of the business rules. Transactional and merge replications are subject to row size limitations of 6000 and 8000 bytes.

In order to resolve conflicts and assure proper merging, timestamp information in each replicated table is advised. Additionally, unique identifier fields, and GUID columns are frequently used.

Methodology

  1. Replication scripts are all setup the same. Create a distributor to do the publishing. (It may be the same machine as the publisher)
  2. Setup Publishing
  3. Create publishing articles for distribution.
  4. Define subscribers, and subscribe.

The way to do these things is extremely easy. The scripts are created and managed from the Enterprise Manager under the Tools|Replication|Create and Manage Publications. There are wizards there that will walk you through the process.

After you have set up the replication scripts, you must then test and debug them, monitor them to ensure that the scripts you wrote do what you expect them to do. Again, the Enterprise Manager is the best tool. Look under the Replication branch in the console tree for Publishers, Agents, and Replication alerts. In addition to the Enterprise Manager, there are some system stored procedures which allow you to check up on various aspects of the replication process. Among the most important are these:

sp_helppublication ,
sp_helparticle,
sp_helpdistributor,
sp_helpsubscriberinfo,
sp_helpsubscription,
sp_helpreplcounters, and
sp_publicationvalidation.

Finally, the NT Performance Monitor will show activity witfrom the SQL Server objects added there on SQL Server installation. These objects are:

SQLServer:Replication Agents,
SQLServer:Replication Dist,
SQLServer:Replication Logreader,
SQLServer:Replication Merge, and
SQLServer:Replication Snapshot.

 

Resources

 

This resource discusses replication project at length.

This resource discusses two way replication.

Metro NY / NJ SQL Server Consultants

We specialize is custom database software. Call us for a free consultation (973) 635 0080 or email us at paladn.com

 

Database programming: Database Development
Computer Programming: Web Site Services
Computer Programming: Custom Software
Computer Consulting: IT Consulting
Paladin Consultants, Inc. Home Page
Computer Consulting: Contact Us

web design Web Design - Web Design Companies
Home
Services
Resources
Portfolio
About Us
Careers
Contact US
Paladin Consultants - Computer Consultants