  
Day 20
Database Replication
In the 1970s, the mainframe computer was the main instrument used in the delivery
of data to the enterprise. Databases were centralized, and clients were merely dumb
terminals. This paradigm, however, met its partial demise because it was expensive
and unfriendly to the user.
In the 1980s, the local area network (LAN) came into being, and data was distributed
among groups of users tied into a common network. This reduced development costs
for some, but fragmented the data into smaller databases. Organizational data was
spread out over multiple locations, which meant much data entry effort was duplicated
and groups did not communicate efficiently.
The 1990s has brought the need for organizations to communicate on a much larger
scale. Wide area networks (WANs) provide a means for communicating among individuals.
The speed and reliability of WANs, however, are not generally fast enough to allow
for constant connection to databases located in other cities or countries. It is
necessary to have databases located locally that communicate with one another.
The 90s have also brought about the widespread use of laptop PCs. More and more
workers are performing their daily chores off-line. These individuals want access
to data contained on WANs, but are unable to attach economically from cars, airplanes,
hotels, and client offices.
The purpose of this chapter is to show you how to facilitate the environment in
which we now work on a daily basis. You learn about creating databases that can be
copied to other sites. You then learn how to coordinate the changes made to these
databases among users at different sites. You learn about database replication.
What Is Database
Replication?
When we refer to database replication, we are talking about the act of creating
copies of a database and coordinating the changes made to the data among all copies.
The original database is referred to as the Design Master. Each copy of the database
is referred to as a replica. The combination of the Design Master and all the replicas
of the Design Master is referred to as the replica set. The act of creating the components
of the replica set, and keeping the data contained in it synchronized, is referred
to as database replication.
By performing database replication, you permit users to work on the data that
is most convenient for them to use. This is important in large organizations with
offices in multiple sites, or among organizations with a significant population of
remote or mobile users.
The Microsoft Jet engine allows for several ways to perform database replication.
This includes the use of the Windows 95 Briefcase and the Microsoft Access Replication
Manager, and through programming using Data Access Objects (DAO). The lesson today
focuses on the use of DAO to perform replication.
Why Use Database
Replication?
There are numerous reasons why you may want to consider using database replication
in your Visual Basic 5 database application. If you work in a large organization,
you may need to deploy your application over a wide area network environment. This
typically requires you to keep the main copy of the database, the Design Master,
at the central office, and create replica sets across all the other offices.
You may also need to build an application for use by remote users. An example
of this might be a customer contact management system for your sales staff. Each
salesperson could have a replica of the Design Master to review and update while
visiting clients. All the salespeople could then update all the changes they make
to the Design Master. In turn, each salesperson could receive all changes made by
all other members of the sales force to the Design Master. This is referred to as
synchronizing the data.
Generally, to back up a database, the data files must be closed to all users.
This is sometimes not practical, however, or even possible. Database replication
can be used in this situation to make a replica of the original database, without
having to close any files or hinder user access to the data contained in the database.
You might also want to use database replication to create a static database for
reporting. In many applications, such as financial applications, data changes constantly.
Mass confusion reigns if users create reports that differ each time they are generated.
By using replication, you can create an unchanging copy of the data to a separate
database that users can then use for reporting and analysis.
When Is Database
Replication Not a Good Idea?
Though database replication can be an invaluable tool, there are scenarios where
it should not be deployed. For example, you may not want to deploy replication when
you are delivering data in an intranet environment. Before deploying a typical Visual
Basic 5 database application in a large organization (for example, an application
with a front-end located on a user workstation and the data on a separate server),
you may want to test the performance of a database application that uses a Web browser
as the front end. This can greatly reduce the maintenance required for the application
and the deployment time to individuals.
You do not want to use replication in applications that are heavily transaction-oriented.
For example, an airline would not want to use replication for a reservation system.
It makes little sense for users to work with a copy of a database that is unreliable,
and therefore unusable, the second after the data is replicated. (Many cynical travelers
believe, however, that airlines do use two reservation systems--one for passengers
and one for baggage.)
You also do not want to use replication in a system where data accuracy is extremely
important, such as emergency response systems. In databases used by law enforcement
or fire departments, for example, you might not be able to replicate data fast enough
to be of value to the user. If, for example, a bank is robbed in Columbus, Ohio,
and the criminal is fleeing towards Cincinnati, you may not have the time to perform
the replication so that the police force in Cincinnati has a description of the criminals.
Additionally, the mode of data transfer used in the synchronization may not be operating
due to circumstances beyond your control.
Making a Database
Replicable
The focus of this exercise is to turn an existing database into a Design Master.
We use the REPLMAST.MDB database that shipped on the CD that came with this
text as our original database. Please locate this database in the \\TYSDBVB5\SOURCE\DATA
directory now and place it into the directory you want to use for this project.
Before we begin, let's open the REPLMAST.MDB file using the Visual Data
Manager (Visdata). This can be done by selecting Add Ins | Visual Data Manager from
the Visual Basic 5 menu. When Visdata loads, select File | Open Database | Microsoft
Access and locate REPLMAST.MDB. Your screen should resemble Figure 20.1.
Figure
20.1. The REPLMAST.MDB database
before it becomes a Design Master.
Note that there are nine tables in this database. You may also recognize this as
a copy of the BOOKS5.MDB database that we used in previous lessons.
Now select Utility | Preferences | Include System Tables. This displays all of
the system tables for this database in the Database window. Your screen should look
like Figure 20.2.
Select the Authors table and open the Fields property. Notice that there are five
fields defined for this table. Open the same property for the BookSales table. Use
Figure 20.3 as a reference.
Finally, open the Properties object in the Database window. Take a look at the
properties that currently exist for this database. Your screen should look similar
to Figure 20.4.
The purpose of this quick exercise was to show you what tables and fields exist
within the database. You now create a Visual Basic project that turns the REPLMAST.MDB
database into a Design Master. After that, you return to Visdata and view the changes
made to this database as a result of becoming a Design Master.
Figure
20.2. The REPLMAST.MDB database
and system tables.
Figure
20.3. The fields of the Authors and BookSales
tables.
Figure
20.4. Database properties before the Design
Master is created.
Creating the Design Master
Start Visual Basic 5 and begin a Standard EXE project. Add a command button to
a form. Set its name property to cmdCreateMaster and its Caption property to &Create
Master. Your form should look similar to Figure 20.5.
Figure
20.5. The main form of REPLDEMO.VBP.
Save the form as REPLDEMO.FRM and the project as REPLDEMO.VBP.
NOTE: Make sure that you have set the Microsoft
DAO 3.5 object library before performing the exercises in this chapter. This can
be done by selecting Project from the main menu, then choosing Preferences. Find
the option for the object library in the dialog that appears and then press OK.
Now, double-click the command button and enter the code from Listing 20.1 in its
Click event.
Listing 20.1. Visual
Basic code for the Create Master command button.
Private Sub cmdCreateMaster_Click()
Dim dbMaster As Database
Dim repProperty As Property
`Open the database in exclusive mode
Set dbMaster = OpenDatabase("c:\tysdbvb5\source\data\replmast.mdb", True)
`Create and set the replicable property
Set repProperty = dbMaster.CreateProperty("Replicable", dbText, "T")
dbMaster.Properties.Append repProperty
dbMaster.Properties("Replicable") = "T"
`Display a message box
MsgBox "You have created a Design Master!"
End Sub
This code opens the REPLMAST.MDB exclusively, creates the Replicable property
and appends it to the database, and then sets the Replicable property to T.
Please note that you must first create this property because it does not exist in
a standard database.
NOTE: Always make a backup copy of your database
before converting it into a Design Master. Once the Design Master is created and
data changes are made, destroy the copy. Later today you will see that making and
using backup copies of the Design Master is dangerous business.
Add a second command button and name it cmdExit, and use E&xit as the caption.
Enter the code from Listing 20.2 into the Click event of this project.
Listing 20.2. The
cmdExit_Click event.
Private Sub cmdExit_Click()
Unload Me
End Sub
Run the project and click the Create Master button. You should see a message box
when the Design Master is created. See Figure 20.6.
Figure
20.6. Confirmation that the Design Master
has been created.
You have created the Design Master. You did not create a new file; rather, you modified
the existing file. Don't try to perform this operation on this same file a second
time. A file can be made a Design Master only once.
Select the Exit button to close the project.
What Happens to
a Database When You Make It Replicable?
The simple routine you wrote and executed in the preceding example made quite
a few changes to the REPLMASTER.MDB database. This section explores these
changes in detail.
Fields Added to
a Replicated Database
Open the Visual Data Manager (Visdata) and load the REPLMAST.MDB database.
Open the BookSales table and then expand the fields. Your screen should look like
Figure 20.7. Compare Figure 20.7 and Figure 20.3 to find the fields that were added.
The following three fields are added to each table when the Design Master is created:
- s_Generation
- s_GUID
- s_Lineage
Figure
20.7. Fields added when the Design Master
is created.
The s_Generation field identifies records that have been changed. All records start
out with a number 1 in this field. This number changes to 0 when the record is modified
in any way. During synchronization between members of the replica set (discussed
later in this chapter) only the records with a 0 in this field are transferred. This
speeds the synchronization process by requiring the transmission of only the records
that were actually changed.
The s_GUID field is a 16-bit GUID field that serves as a unique identifier for
each record. This number remains the same for each record across all members of the
replica set.
The s_Lineage field contains the name of the last replica member to update a record.
This field is not readable by the users of the database.
We discuss these fields as we make changes to the database.
System Tables Added
to a Replicated Database
With the REPLMAST.MDB database still open, let's take a look at the system
tables that now exist. For comparison, refer back to Figure 20.2 to see a listing
of the tables that existed before the creation of the Design Master.
As you can see, many new tables have been added to the REPLMAST.MDB database.
The purpose of these tables is to keep track of synchronization activities to ensure
that members of the replica set are updated properly. For a complete description
of the tables added, look at Visual Basic Books Online and search using the phrase
"Replication System Tables." Then choose the "Changes to Your Database"
topic.
NOTE: You cannot change the information contained
in most of the system tables that are added when a Design Master is created. The
Microsoft Jet engine makes most necessary changes during the synchronization process.
At this point, let's just explore the MSysReplicas table by opening it. This table
contains information on each member of the replica set. At this point, there is only
one member in this set (see Figure 20.8). In the exercise on creating replicas later
in this chapter, this table gains a record for each replica of the Design Master
that is created.
Figure
20.8. The MSysReplicas table when the
Design Master is first created.
Properties Added to the Replicated
Database
The creation of the Design Master added properties to the database. Open the Properties
object in the Database window. Your screen should look similar to Figure 20.9.
Notice that a property named Replicable now exists and has a value of T.
This means that replicas can now be made of this database.
Also note that a property called ReplicaID was added. As you might expect, this
is the unique identifier for this database. Each replica receives its own ReplicaID
as it is created.
Figure
20.9. Database Properties after the Design
Master is created.
A property called DesignMasterID was also created. This property identifies the Design
Master of the replica set. Notice that the DesignMasterID and the ReplicaID for this
database are the same.
For Microsoft Jet version 3.5, the ReplicableBool property is new. This property
performs the same function as the Replicable property, but uses a Boolean data type
where the Replicable property uses a TEXT data type. Note, that the value of the
property is set to True.
The final property added to the database was LastUpdated. This field stores the
ID of the last member of the replica set to update the database.
Properties Added
to a Replicated Table
Open the table properties for any table in the REPLMAST.MDB database.
Notice that fields were added to each table during the creation of the Design Master.
See Figure 20.10.
The Replicable and ReplicableBool properties serve the same function for the table
as for database properties. When these values are set to T, it indicates
that the table can be replicated.
Figure
20.10. Table properties after the Design
Master is created.
Physical Size Changes Made to a
Database When It Becomes Replicable
If you're thinking that the addition of these tables, fields, and properties to
the Design Master will increase the size of your database, you're correct. Approximately
28 bytes are added to each record contained to allow for the replication feature.
This is not much in itself, but when you consider all the tables in a typical application,
and all the records in each table, it can add up to something significant.
Let's perform some mathematical calculations. Say that you have a database with
five tables--a main table and four validation tables. Let's say there are 100,000
records in the main table, and 1,000 records in each of the four validation tables.
Adding replication functionality adds 2,912,000 bytes ([100,000 + 4,000] x 28) to
the total size of each member of your replica set. As you can see, the numbers can
add up quickly!
In addition to the increase for each record, replication adds many new tables,
each of which takes up hard drive space. The space requirements of these tables vary
dramatically depending on the frequency of synchronization, the number of members
in the replica set, and the number of conflicts and errors encountered during the
synchronization process.
In addition to the physical hard drive space you consume, remember that you are
using up fields in each table to track replication information. The Microsoft Jet
engine allows for 255 fields in a table, including the replication fields. Although
it is extremely rare to have tables with 255 fields, it is possible.
NOTE: If you have a table in your database that
is approaching 255 fields in size, you should probably be more concerned about database
normalization than you are with the number of fields consumed by replication. Please
refer to Chapter 16, "Database Normalization," for a complete discussion
of database normalization issues.
The Effect of Database
Replication on AutoNumber Fields
A typical AutoNumber field is incremented by 1 each time a record is added. When
a database is made replicable, these fields become random numbers. Let's look at
a quick example.
Open the database AUTONUMB.MDB found in the \\TYSDBVB5\SOURCE\DATA
directory on the CD that shipped with this book. Now open the tblSupervisors table
as a Dynaset. Your screen should look similar to Figure 20.11.
Figure
20.11. The AUTONUMB.MDB file
before it becomes replicable.
Insert a new record and watch how the ID field increments by 1. Now you can return
to the Visual Basic 5 project REPLDEMO.VBP and modify the cmdCreateMaster
Click event by substituting AUTONUMB.MDB for REPLMAST.MDB.
Run the project and make the AUTONUMB.MDB database replicable.
Now open the database AUTONUMB.MDB in Visdata. Open the tblSupervisors
table and notice what happens to the AutoNumber field when you add a new record.
A random number has been inserted in the AutoNumber field. (See Figure 20.12.)
Figure
20.12. The AutoNumber field becomes random
after the Design Master is created.
NOTE: The effects of database replication are
not the only reason not to use AutoNumber fields in your application. The use of
an AutoNumber, or Counter, field as a primary key in a data table should raise a
red flag for the developer, indicating that the database is not properly constructed
or normalized. AutoNumber fields should be used sparingly, if at all.
Creating Replicas
Copies of the Design Master are referred to as replicas. We now modify the REPLDEMO.VBP
project to create a copy of the REPLMAST.MDB file.
If you need to, start Visual Basic 5 and load the REPLDEMO.VBP project.
Add another command button to your form and name it cmdMakeReplica; insert the caption
&Make Replica.
Next, insert the code from Listing 20.3 into the cmdMakeReplica_Click
event.
Listing 20.3. The
Visual Basic code to make a replica.
Private Sub cmdMakeReplica_Click()
Dim dbMaster As Database
`Open the database in exclusive mode
Set dbMaster = OpenDatabase("c:\tysdbvb5\source\data\replmast.mdb", True)
dbMaster.MakeReplica "c:\tysdbvb5\source\data\copy.mdb", "Replica of " & "dbMaster"
dbMaster.Close
MsgBox "You have created a copy of your database"
End Sub
This code first opens the database REPLMAST.MDB (our Design Master), and
then uses the MakeReplica method to create a new member of the replica set
named COPY.MDB.
NOTE: Create the COPY.MDB file only
once. Trying to create another replica named COPY.MDB causes the program
to fail.
NOTE: Always make a backup copy of a database
before you create a replica. This should be done whether you are creating a copy
of the Design Master, or another replica.
Save your project and execute it. Select the Make Replica button to create the
new database.
NOTE: You can't depend on the traditional backup
and restore methodology to safeguard a Design Master. Changes occur to the Design
Master during the synchronization process. Restoring a backup from a tape drive might
insert a database that is out of synch, and that might not be able to perform synchronization
with other members of the replica set. It is a far better practice to use replication
to create a backup copy that can be made the Design Master in case the original is
corrupted.
Select Exit when COPY.MDB is created. Open your new replica in Visdata.
Explore the properties of the new replica. Notice that you have all of the same tables.
Now open the MSysReplicas table. When we first looked at this table in the Design
Master, there was only one entry. Now there are two. Also note that the Description
field for the new record is the same description you added in the MakeReplica
method you executed earlier. (See Figure 12.13.)
Figure
20.13. The MSysReplicas table after creation
of a replica.
As you can see, it is quite easy to make a replica. A replica can be made out of
any member of the replica set. For example, you could now create a third member of
the set from either REPLMAST.MDB or COPY.MDB.
Synchronizing Members
of a Replica Set
The act of making data in all members of the replica set identical is referred
to as synchronizing data. In this exercise we make data changes to the Design Master
and the replica you created in the previous exercise, and then perform a synchronization
to apply the data changes to the other member of the replica set.
Open COPY.MDB in Visdata. Next, open the Authors table. Add a few records
to this table (make them up). Take note of how the s_Generation field resets to zero
when you add a record. The zero tells the Jet engine that the record is ready to
be copied during the next synchronization.
Also make a change to any existing record in this table. Notice how the 1 in the
s_Generation field also changes to zero. Again, this record is marked to be synchronized.
Your screen should look similar to Figure 20.14.
Figure
20.14. Changes to records cause the s_Generation
field to be set to 0.
Open the REPLMAST.MDB database in Visdata, and open the BookSales table.
Make a change to the first record. When we perform the synchronization, notice how
changes get updated in both members of the replica set.
Now close Visdata and open the REPLDEMO.VBP project in Visual Basic 5.
Add one more command button to the form. Name this button cmdSynch, and set its caption
to &Synchronize. Enter the code from Listing 20.4 into the cmdSynch_Click
event.
Listing 20.4. Code
to perform a bidirectional synchronization.
Private Sub cmdSynch_Click()
Dim dbMaster As Database
`Open the database
Set dbMaster = OpenDatabase("c:\tysdbvb5\source\data\replmast.mdb")
dbMaster.Synchronize "c:\tysdbvb5\source\data\copy.mdb"
MsgBox "The synchronization is complete."
End Sub
This code uses the Synchronize method to copy changes from REPLMAST.MDB
to COPY.MDB, and vice versa.
Run the project and click the Synchronize button. You receive a dialog box notifying
you when the synchronization is complete. Stop the program by selecting Exit.
NOTE: It is a good practice to compact your database
(repair it first, if necessary) before you perform a synchronization. This ensures
that you are not replicating potentially damaged records that might propagate throughout
the entire replica set.
Now open Visdata once more and load the COPY.MDB database. Look first
at the BookSales table and notice that it now reflects the data change you made previously
in the REPLMAST.MDB database. Open the Authors table. Notice how the s_Generation
field has been updated for the new and the changed records. This is illustrated in
Figure 20.15.
Figure
20.15. Data after synchronization. Notice
that the s_Generation field has a new value.
The s_Generation field is incremented by 1 each time a record is changed and a synchronization
is performed. The replica keeps track of the last record sent to a particular member
of the replica set, and only sends records with record numbers that are greater than
the last record sent, and of course, all records with an s_Generation value of zero.
Open the REPLMAST.MDB file and its BookSales table. Notice that the s_Generation
field was updated on the record that was changed.
The Synchronize
Method
In the preceding example, we used the Synchronize method to perform a
bidirectional synchronization of data. This two-way synchronization is the default
implementation of this method. The Synchronize method can also be used to
import information from another database, export changes to another database, and
even synchronize with databases over the Internet.
The structure of the Synchronize statement is
Database.Synchronize pathname, exchange
where pathname is a string value naming the destination of the replication,
and exchange is one of dbRepExportChanges, dbRepImportChanges, dbRepImpExpChanges,
or dbRepSyncInternet.
Use the dbRepExportChanges to send changes to another database without receiving
updates from that database. Use dbRepImportChanges to bring in changes from another
replica set member without sending out any changes. If you enter no exchange value,
or use dbRepImpExpChanges, data flows both ways during a synchronization. Finally,
use dbRepSyncInternet to perform a synchronization over the Internet.
NOTE: You need the Microsoft Office 97 Developer
Edition if you want to perform data synchronization over the Internet.
NOTE: Be aware that the .MDB format
is used by the Microsoft Jet database engine. The Microsoft Jet engine is used by
both Visual Basic and Microsoft Access. It is common practice by Access developers
to store data, forms, reports, and queries in the same .MDB file. When you
synchronize, changes to forms or reports contained within the database are also synchronized.
Resolving Synchronization
Conflicts
Data conflicts are quite common among members of a replica set. They can occur
when the same record gets changed in different replicas in between synchronizations.
This means that two different users might see two different values for the record.
How does the Microsoft Jet engine know which value should be used? Better yet, how
does it know which value to use and distribute throughout the entire replica set?
The logic that the Microsoft Jet engine uses to resolve synchronization conflicts
is simple and consistent. The replica set member that changes the record the greatest
number of times wins the conflict. If this number is equal for all the replica members
being synchronized, the Microsoft Jet engine selects the record from the table with
the lowest ReplicaID.
As you remember, the s_Lineage field stores the number of changes to a record.
This is the field that the Microsoft Jet engine examines to determine which replica
set member wins the conflict.
Load COPY.MDB into Visdata and open the Authors table. Change the first
record by changing the name of the Author in the first record from "Smith, John"
to "Smith, Copy." Now open the REPLMAST.MDB database in Visdata,
load the Authors table, and change the Name field of the first record to "Smith,
Curtis." Now save the record and close the table. Reopen the table and change
the DOB (Date of Birth) field to 9/2/64. Save the record and close the table.
You have now changed the first record of the Authors table of COPY.MDB
database once, and the same record in the REPLMAST.MDB database twice. In
a synchronization, which change do you think prevails?
To find out, close Visdata and load the REPLDEMO.VBP project. Run the
project and click the Synchronize button. When you are informed that the synchronization
is complete, close the project by pressing Exit.
Return to Visdata and load COPY.MDB. Open the Authors table and notice
that the first record is updated based upon the values that were entered into the
REPLMAST.MDB database. That is to say, the Microsoft Jet engine knows that
this record changed more times in the REPLMAST.MDB files than in COPY.MDB,
and therefore chooses that record as the one to use in the synchronization.
But what happened to the change made in the COPY.MDB file? To find out,
close the Authors table, and you notice that a new table was added to this database
during the synchronization process, the Authors_Conflict table. Open this table and
you find a record with the single change. Your screen should look similar to Figure
20.16.
Open the REPLMAST.MDB database in Visdata. Notice in the Database window
that the Authors_Conflict table does not exist. The error table created by a synchronization
conflict is stored only in the table that lost the conflict. Open the Authors table,
and you should see that both changes made to the first record were preserved.
Figure
20.16. The Authors_Conflict table.
Errors That May Occur During Replication
Along with record conflicts, more serious errors can occur during synchronization.
There are several actions that may cause an error during synchronization. For example,
you can implement table-level validation rules after replicas have been created.
This is not bad in itself, but an error occurs during synchronization if you try
to replicate the rule and if a member of the replica set has entered and saved data
that violates the rule.
This same type of error may occur if you change the primary key of a table. You
could try replicating this change only to find that you receive an error when a replica
has two equal values in two separate records in the field you tried to create as
the primary key.
In both cases, you are performing serious design changes in mid-stream. You should
therefore be careful and limit the design changes you make to members of a replica
set.
An error may also occur when one replica set member deletes an entry from a validation
table that has been used by another member in updating a master record. You receive
an error when you try to import the master record into the replica set that deleted
the validation table entry. Each member by itself doesn't violate referential integrity
rules, but when combined, they do so in grand style. To avoid this situation, make
validation tables read-only to all but the Design Master whenever possible.
NOTE: Try to limit users to read-only access
to validation tables in a replicated environment.
NOTE: Try to avoid using cascading updates and
cascading deletes in your application when replication is used. These features make
it easy for you to cause a large number of synchronization errors.
You might also receive a synchronization error when you try to update a record
that is locked by another user in a multiuser environment. An entry is written to
the MSysErrors system table when you encounter such an error. To avoid this problem,
it is best to have all users locked out of a database during synchronization.
You might also receive an error if you add a new table to your database and use
the same name that another replica used for a different table. To avoid this, all
members of the replica set need to communicate all database changes.
In summary, synchronization errors can occur as a result of design changes, as
a result of violation of referential integrity rules on a consolidated basis, or
as a result of record locking by users of a replica set member. You can avoid most
of these errors by completing development before replication begins, by securing
validation tables whenever possible, and by locking the replica members involved
in a synchronization.
NOTE: Errors encountered during synchronization
are stored in the MSysErrors table. This table is replicated during the synchronization
process. Therefore, try to correct all encountered errors before they are passed
to other members of the replica set.
Replication Topologies
When you implement database replication in your application, you most likely will
make more than one replica of the original Design Master. When you do, you will be
faced with the logistical question of how and when to update replica set members.
You need to implement a schema for the order in which data updates get dispersed
throughout the replica set. The design of the order in which replica set members
get updated is referred to as the replication topology. We cover the various topologies
in this section. It is important, however, to note that there is no universal best
topology. You need to investigate the needs of your application's users thoroughly
before you can decide on which topology to implement.
The most commonly used topology implemented in database replication is the star
topology. In the star topology, there is one central database, usually the Design
Master, with which all members of the replica set perform a synchronization. No replication
occurs directly between members of the replica set. As an example, let's assume you
created a replica set with one Design Master (DM) and four replicas (A, B, C, D).
To begin, A first synchronizes with DM. Next, B synchronizes with DM, then C with
DM, and D with DM. A, B, C, and D don't talk to one another directly, but pass all
data changes through DM.
The star topology is the simplest topology to implement. It doesn't require a
strict synchronization order be maintained. Replica A could synchronize after B,
and C could synchronize before B. This is therefore a good topology to use when you
are working with a large number of replicas, such as in a sales force automation
application. Users can synchronize in this topology without having to worry about
when other members of the replica set synchronize.
There are two drawbacks to the star topology, however, of which you should be
aware. First of all, the central database with which all replicas synchronize serves
as a single point of failure. If this database is down, no one can talk to anyone
else. You should therefore be prepared to move one of the replica set members into
the central role if necessary. Remember, though, that use of a backup is not recommended
as a means of safeguarding a database in a replicated environment.
The other problem with this topology is that it permits some replicas to synchronize
infrequently, or not at all. This is actually a very common problem in contact management
databases, because some users don't see the need for sharing their entries with other
members of the replica set, or just don't get around to performing the synchronization.
NOTE: It is not realistic to believe that humans
can stick to a strict replication schedule. Or that they will voluntarily perform
a synchronization if it is difficult. If implemented in an end-user application,
synchronization must be made extremely easy to use, or it will not be used.
A linear topology can also be used for synchronization. In this topology, replica
A synchronizes with B, then B synchronizes with C, and then C synchronizes with D.
To restart the process, D would synchronize with C, and then C with B, and finally
B with A.
A ring topology is similar to a linear topology, except, the reverse track is
not performed. In this scenario, replica A synchronizes with B, B synchronizes with
C, and then C synchronizes with D. Replica D then restarts the process by synchronizing
with A, and then A synchronizes with B, and so on.
The linear and ring topologies are good in that they do not have a single point
of failure. They are bad in that the synchronization can be stopped, or delayed if
one member goes down. Also, the transfer to other members of the replica set is slower.
In a linear topology, a change to C would have to go first to D, then back to C,
and then to B before it is sent to A. This is a total of four synchronizations.
The fourth topology that can be used in a replicated database structure is referred
to as the fully connected topology. In this scenario, replica A synchronizes directly
with B, C, and D; replica B synchronizes directly with A, C, and D; replica C synchronizes
directly with A, B, and D; and D synchronizes directly with A, B, and C. This topology
requires the greatest amount of work, and should be used in applications that require
constant availability of data.
NOTE: You might want to reconsider the use of
database replication in your application if you are using the fully connected topology
to guarantee data availability. Web-enabled applications with centralized data may
be a better solution.
The topology you ultimately choose for your application depends on the timeline
requirement of data. If this is unknown, start with the star topology and make changes
as necessary.
Keeping Database
Elements from Replicating
There might be some data tables that you do not want to replicate to other members
of a replica set. This might be the case with data that is highly sensitive in nature,
or data that is of little value to other replicas. For example, you might want to
replicate general employee information to remote offices of your organization, but
you might not want to distribute payroll information outside of the main office.
Or, you might not want to replicate a table of office supply vendors used by your
California office to your office in Vermont.
In the following example we create the KeepLocal property for the Authors table
of a new database named KEEPLOC.MDB. This file can be found in the \\TYSDBVB5\SOURCE\DATA
directory on the CD that shipped with this text. We then convert this database into
a Design Master and make a replica named COPYKL.MDB. This replica does not
have the Authors table as part of its object collection.
Start this exercise by loading the REPLDEMO.VBP project into Visual Basic
5. Add a command button to the form REPLDEMO.FRM. Set the Name property
of this button to cmdKeepLocal, and its Caption to &Keep Local. Now add the code
in Listing 20.5 to the cmdKeepLocal_Click event.
Listing 20.5. The
Visual Basic 5 code to keep a table object from replicating.
Private Sub cmdKeepLocal_Click()
Dim dbMaster As Database
Dim LocalProperty As Property
Dim KeepTab As Object
Dim repProperty As Property
`Open the database in exclusive mode
Set dbMaster = OpenDatabase("c:\tysdbvb5\source\data\keeploc.mdb", True)
Set KeepTab = dbMaster.TableDefs("Authors")
Set LocalProperty = dbMaster.CreateProperty("KeepLocal", dbText, "T")
KeepTab.Properties.Append LocalProperty
KeepTab.Properties("Keeplocal") = "T"
MsgBox "The Authors table is set to not replicate"
`Create and set the replicable property
Set repProperty = dbMaster.CreateProperty("Replicable", dbText, "T")
dbMaster.Properties.Append repProperty
dbMaster.Properties("Replicable") = "T"
`Display a message box
MsgBox "You have created a Design Master out of KEEPLOC.MDB!"
dbMaster.MakeReplica "c:\tysdbvb5\source\data\copykl.mdb", "Replica of " & Â"dbMaster"
dbMaster.Close
MsgBox "You have created a copy of KEEPLOC.MDB"
End Sub
This code first opens our database, KEEPLOC.MDB, and sets the KeepLocal
property of the Authors table to T. Note that the KeepLocal property must
be set before the Design Master is created. The program then turns KEEPLOC.MDB
into a Design Master, and creates a replica named COPYKL.MDB.
Now run the application and select the Keep Local Command button. You are prompted
with a Message Box when the KeepLocal property is set to T for the Authors
table, when the KEEPLOC.MDB database is converted into a Design Master,
and when the COPYKL.MDB file is created. Finally, select Exit to unload
the project.
After the program is completed, open the KEEPLOC.MDB database in Visdata.
Expand the Authors table object in the Database Window and then expand the Properties
of the Authors table. Notice that the KeepLocal property is set to T. This
is illustrated in Figure 20.17. Open the BookSales table. Notice that there isn't
a KeepLocal property.
Figure
20.17. To prevent this table from replicating,
the KeepLocal property was created
and set to T.
Now open the COPYKL.MDB file in Visdata. Notice that this database does
not have an Authors table. You have successfully made a replica of the Design Master
and excluded a table!
NOTE: Objects created after a replica is created
do not flow to other members of the replica set. You must first set their Replicable
property to T to replicate them.
Summary
In database replication terminology, the main, or central, database is referred
to as the Design Master. A copy of the Design Master is referred to as the replica.
The combination of the Design Master and all replicas is referred to as the replica
set. Database replication is the process of synchronizing data so that it is the
same across all members of the replica set.
Database replication is a good tool to use in the development of systems deployed
across a WAN or to remote users. Replication can also be used to make copies of databases
that cannot be shut down. Replication is also good for creating reporting databases.
Do not use database replication when a centralized data storage facility can be
used, such as a Web-enabled application. Also, don't use replication in heavily transaction-oriented
applications, or in applications where up-to-the-minute accuracy is of paramount
importance.
Tables, fields, and properties are added to a database when it is made a Design
Master. The addition of these items is necessary to track changes to data and to
facilitate the synchronization between members of the replica set. These additions,
however, consume additional hard drive space.
Creating and changing the Replicable property of a database to T creates
a Design Master. Once the Design Master is created, you can use the MakeReplica
method to make copies of it. Finally, you use the Synchronize method to
replicate data changes to members of the replica set. Data synchronization is the
act of copying data changes from one member of a replica set to another.
The Synchronize method can be used to import data changes, export data
changes, perform "two-way" data changes, and even perform data exchanges
over the Internet.
Synchronization errors occur when two members of a replica set try to synchronize
records that each has changed. Errors can also occur during the synchronization process
when design changes are made to a database but violated by replicas prior to synchronization
of the changes. Violation of referential integrity can be encountered by replicas
that add records to a database that uses validation records deleted in another replica.
Record locking in a multiuser environment can also cause synchronization errors.
There are four topologies for the synchronization of replicas. These are the star,
linear, ring, and fully connected topologies. The star topology is the most common,
but like all the other topologies, has certain strengths and weaknesses.
There may be times when you do not want to replicate objects contained in one
database to other members of the replica set. If such is the case, use the KeepLocal
method before you create the Design Master. This method keeps the object from being
copied to other replica set members.
Quiz
- 1. Define database replication.
2. Cite examples of applications that can make good use of database replication.
3. Cite examples of systems in which database replication should not be used.
4. What fields are added to all data tables when a database is turned into
a Design Master?
5. What properties are added to the database during the creation of the Design
Master to indicate that it can be replicated?
6. How much hard drive space is consumed by a database when it is turned into
a Design Master?
7. What happens to an AutoNumber field when a database is turned into a Design
Master?
8. What method do you use to create a copy of a Design Master?
9. What is the logic that the Microsoft Jet engine uses to resolve synchronization
conflicts?
10. What topologies can be used for database synchronization? Which topology
is the most commonly used?
11. What method do you use to keep database objects from replicating to other
members of a replica set?
Exercises
Design an implementation strategy for the rollout of a database application that
you built to track and deliver your company's employee information. This application
needs to be installed at your corporate office in Cincinnati, and then delivered
to offices in Chicago, Los Angeles, and New York. Use the following information as
you design your strategy:
- The database is named EMPLOYEE.MDB and has four tables: EmployeeMaster,
EmergencyInfo, Education, and SalaryInfo.
- All payroll is performed in the Cincinnati office.
- Updates need to be made at each site, and shared with all other sites. The order
in which updates are made to the database is not important.
- Cincinnati is the largest office. Chicago is the second largest, and is three
times the size of the Los Angeles or the New York offices.
Include the following items as part of your implementation plan:
- Names of the tables to be distributed to each site
- Backup methodology
- Synchronization topology
- Code to keep the payroll information (SalaryInfo) from replicating
- Code to create the Design Master
- Code to create the Chicago replica
- Code to synchronize your Chicago and your Cincinnati databases
  
|