This is a replicated copy of Enhance Business Intelligence with Data Replication Solution. It is provided for completeness and avoid "broken" link.

IBM Shop IBM Support Download
HomeNewsProductsServicesSolutionsAbout IBM
  AS/400 Home
  My AS/400 Site
Table of Contents


Enhance Business Intelligence with AS/400 Data Replication Solution

Richard Sinn

Document Index

Competitive advantages can be obtained by consolidating and analyzing your data for better business decisions. Learning to uncover and leverage those advantages is what business intelligence is all about. Enabling your business intelligence is made possible when you have the right applications and tools to analyze data, and most importantly, when the data is prepared in a format suitable for analysis. IBM data replication solution delivers just that–enterprise data where you want it, the way you want it. This article describes the inner working of IBM DataPropagator Relational Capture and Apply for AS/400 and IBM DataJoiner. With the AS/400 as the core e-business server, a complete replication solution can be provided with integrated DataPropagator and DataJoiner running on an NT IPCS. This empowers the AS/400 as the single point of control to your data.

In today's business enterprises, the demand for current data is a key concern. To make informed decisions, you need to deliver data into the right hands, in the right format, and at the right time. IBM's data replication solution supports business plans including movement to client/server environments, data warehousing, and business reengineering. Data replication is the key technology for harnessing the full power and potential of distributed database environments. IBM DataPropagator satisfies these replication requirements and more. It delivers data where you need it!

Data Propagator Relational 5 for AS/400 and other IBM DataPropagator products running on NT, OS/2, MVS, Solaris, HP/UNIX, SCO Unix, Linux and AIX are the core components of IBM's data replication solution. With DataPropagator, you can unite your distributed relational databases into a cohesive and integrated database solution. It automatically captures your data changes against an IBM source database and propagates those changes to any specified IBM target database, keeping the two consistent.

Compared with high availability products that run synchronously linking two systems together to provide hot backup, replication runs asynchronously to allow flexible scheduling when replicating data files. For example, you could configure DataPropagator to replicate data every 6 hours and each replication cycle is bounded to 30 minutes. In other words, DataPropagator will only wake up every 6 hours and work for 30 minutes at a time, so you have full control of how much processing power you wanted to delegate at a specific system on a specific timing window. Besides, replication can link multiple systems together for update-anywhere. For example, you could link ten systems with update-anywhere replication and whenever there are database file changes in any one of the systems, the changes will automatically flow to the other nine systems.

Another main component in the Replication solution is IBM's DataJoiner. DataJoiner provides transparent access for customers with multiple homogeneous or heterogeneous data sources. It will also work with DataPropagator to replicate data among different heterogeneous data sources (DB2 UDB, Oracle, Informix, Microsoft SQL Server and Microsoft Jet).

With AS/400 as the core e-business server in a business intelligence environment, Data Propagator/400 can run natively on the AS/400, and DataJoiner can run on the AS/400 NT IPCS. This perfect combination provides unmatched functionality and enables the AS/400 as the focal point for replication solution. The AS/400 can provide robust, flexible, and efficient data replication that can maximize the accessibility of reliable data across any enterprise for transactional applications, business intelligence systems, and distributed client/server applications. This article describes the AS/400 data replication solution support and some of the unique functions provided by the AS/400 such as remote journal replication, relative record number support, and integration with DataJoiner on NT IPCS.

DataPropagator Relational 5 for AS/400 and Replication Solution Overview
DataPropagator Relational Capture and Apply for AS/400 is an IBM solution for moving data into the data warehouse. DataPropagator moves data from DB2 UDB for AS/400 operational systems as well as from any other DB2 family database. DataPropagator allows you full SQL capability as you move the data-an important feature for creating summarized data, derived fields, and multiple summarized data marts or data warehouses.

Being able to capture the data from the operational systems and to propagate it on a timed basis is another benefit of DataPropagator. Your business will dictate how often the data warehouse needs to be refreshed: once a day, once a week, or every five minutes. The automated apply process handles this so you need only configure DataPropagator once, and the data warehouse is refreshed automatically.

Other IBM products that can extract data from non-DB2 data sources include DataJoiner, DataPropagator Non-Relational, and Data Refresher. These products can be used in conjunction with DataPropagator to pull data from IMS, VSAM, and non-IBM relational databases.

DataPropagator offers full and differential refresh of data between DB2 UDB for AS/400 and other IBM DB2 systems. Differential refresh setup enables DataPropagator to replicate only the database changes (insert, update, delete) from one system to another. For example, if you update 5 rows in a 1 million rows table, only 5 rows worth of changes will be replicated across systems. In a full refresh setting, the whole database table will be replicated whenever there is a database change. DataPropagator provides robust, flexible and efficient data replication that can maximize the accessibility of data across any enterprise whether for transactional applications, business intelligence systems, and distributed client/server applications.

DataPropagator and other IBM Data Replication products provide replication solution designed to help you meet the growing demands for data across your enterprise by delivering the right data in the right format into the right hands at the right time. You can use DataPropagator to do the following:

  • Reengineer business processes: Move important data and applications to the place of the business transaction to improve business cycle times and customer responsiveness. A lot of times, these applications must share data with legacy applications on the host. DataPropagator replication solution can automatically deliver the shared data to the target platform, accommodating data restructuring and high-currency requirements and minimizing network load and availability requirements. Then, you are able to place the application locally to the business process (for example, point of sale, inventory, and shipping). Furthermore, you can replicate the data to and from central servers and other sites where business processes there need to cooperate.
  • Improve decision-making: Empowering your employee's decision-making authority increases the company competitiveness directly. To give employees the information they need to make effective, accurate decisions, businesses are building decision-support databases and data warehouses. These new decision-support databases assist in day-to-day decision-making activities, from determining what items to stock in various stores, to identifying customer sets for new products. DataPropagator replication technology can support decision-support applications by improving data availability, data access performance, and data usability.
  • Increase online throughput: To accommodate business growth, you may need to off-load query processing to make room for increasing transaction processing requirements. Off-loading query processing reduces contention that impacts online transaction processing (OLTP) response time. DataPropagator is the perfect tool to off-load your data from the production system to different targets systems.
  • Improve system availability: Businesses are being driven closer and closer to 24-hour-day availability to maintain competitiveness and to satisfy customer demands. This puts significant pressure on the batch window. The batch window often accommodates planned outages such as database reorganization. DataPropagator can create backup systems where access can be redirected during such planned outages. DataPropagator can help eliminate extract processing from your batch workloads by replicating data in the background throughout the day.
  • Support audit requirements: You may be faced with increasingly stringent audit requirements. Years of information must be kept in detail with who changed what and when. DataPropagator can help provide necessary easily understood audit information, including user ID, transaction details and other archive histories.
  • Broad legacy data support: Broad legacy data support lets businesses leverage existing assets for client/server environments: DataPropagator can work hand in hand with DataRefresher and DataPropagator Nonrelational (IMS) to bring data from IMS databases, DB2 databases, VSAM datasets, and flat files to the DB2 family of databases (across MVS, VM, VSE, AS/400®, AIX®, HP-UX, Windows NT®, the Solaris operating environment, and OS/2® platforms).

Let's take a look at the different components of DataPropagator.

DataPropagator Components Overview
DataPropagator provides asynchronous replication: the ability to copy changes from one table to another. The key technology is that DataPropagator can replicate "changes". There are three autonomous components: AS/400 Capture, AS/400 Apply, and the Administration component. Both AS/400 Capture and AS/400 Apply are components of the DataPropagator Relational Capture and Apply for AS/400 license program. The installation process of DataPropagator will install Capture and Apply programs, create all the internal control tables and internal database triggers without any manual configuration from the operator. Different types of administration components are available to fit different enterprise computing environment, we will discuss them in detail later in the article

Figure 1. DataPropagator - Capture

Figure 2. DataPropagator - Apply

Figure 1. DataPropagator - Capture and Figure 2. DataPropagator - Apply show the high-level working of AS/400 Capture and AS/400 Apply. The AS/400 Capture program captures changes from DB2 UDB for AS/400 using database journal entry. The AS/400 Apply program copies these changes to the target tables. The administration component sets up the replication environment. There are three logical servers in DataPropagator architecture. Server in this context can be defined as logical system holding a specific set of information. The source server contains the source table, the target server contains the target table, and the control server contains the replication control information. Source table is the database table being watch by AS/400 Capture. All the changes of the source table will be replicated to the target table. Please note that all the servers discussed are logical because they could all physically be on the same system or they could be on separate systems. The AS/400 Capture component normally resides at the source server and the AS/400 Apply component can reside anywhere in the network.

AS/400 Capture
The AS/400 Capture component captures changes made to the base tables and puts them into internal staging tables. It captures the changes by reading the database journal. Source tables can be copied based on the changes to the tables (differential refresh) rather than always copying the entire tables (full refresh). The AS/400 Capture, see Figure 1, timestamps all the changes using DB2 UDB for AS/400 journal entries and maintains transaction consistency at the same time. AS/400 Capture then receives database file changes through journal entries and automatically updates DataPropagator internal staging tables.

AS/400 Apply
AS/400 Apply is the run-time copy manager that allows a database to be replicated. The Apply program runs locally on the copy server: it connects to the data servers and copies source tables to the local database. The AS/400 Apply program performs, see Figure 2, full or differential refresh of a target table based on source-to target rules, which are derived from subscription information and information generated by AS/400 Capture.

Figure 3. Control Center (CC)

Figure 4. WebCC

Figure 5. DataJoiner Replication Administrator (DJRA)

Administration components
Administration components allow an administrator to setup replication. There are three different sets of graphical interfaces available for customers. They can pick the most suitable administration tool to use based on computing environment. The first one is called the Control Center(CC), see Figure 3, which is integrated with IBM Universal Database Version 5. It is available on NT and OS/2 operating systems. Another one is called WebCC, which is a Java version of the Control Center, see Figure 4, it provides the look and feel of the Control Center but takes advantage of the Java capability to run on multiple platforms. A single user license of DB2 Connect Personal Edition is shipped with DataPropagator Relational 5 for AS/400, it will contain both WebCC and Control Center licenses. The other one is called DataJoiner Replication Administrator (DJRA), see Figure 5. It is integrated with DataJoiner and is also available on the Web free of charge to licensed DB2 and DataJoiner users. DJRA also ships with DB2 UDB free of charge. DJRA runs on the NT and Windows 95 operating systems. The replication setup is the same in both the interfaces, the big difference is that you need DJRA to set up heterogeneous replication. DJRA also provides some functions that are not available in Control Center yet.

AS/400 Data Replication Scenario
This section demonstrates using DJRA as the administration component for AS/400 data replication. Two major steps should be performed before starting AS/400 Capture and AS/400 Apply:

  1. Define tables that are replication sources. These replication sources are called "source table."
  2. Define "subscriptions" to those replication sources.

Define replication sources
Use DJRA to select the AS/400 system where the source tables are located. Select the before and after image of the column when defining a replication source. The process of defining a replication source is called "registration" in DataPropagator terminology. Users only register a source table once regardless of how it is going to be used for distribution. The DJRA will internally create a few control tables for the user and one of them is called the change data (CD) table. Such a table is also called the staging table. The CD table stages all the changes from the source table and waits for AS/400 Apply to apply the changes. In order to fit different business intelligence applications, DataPropagator provides vertical fragmentation of data. Thus, you can select only the required columns to replicate instead of the whole table. In addition, you might want to select the before image with the after image of the column for auditing purposes. For update-anywhere replication, both the before image and after image are required. Two other options are available if you are using DB2 UDB for AS/400. They are "partitioning key support" where an update is broken into a delete/insert pair, and level of "conflict detection" is also available to select at the source site for update-anywhere replication.

Define subscription
After defining registration, the list of replication sources are shown in the DJRA. We can then use DJRA to define a subscription based on a replication source. Subscriptions can be viewed as the relationship between source tables and target tables. Many subscriptions may refer to one source table. During this process, you can further subselect the columns to refine your replication criteria. Advanced functions such as SQL Before/After and Timing can be set with your subscription. After the subscription process, a list of replication sources will be available in the replication subscription session of DJRA.

AS/400 Capture
There is one Capture program per AS/400 system. Compared with other log-base platforms such as AIX, MVS and NT, AS/400 uses journals as the vehicle for tracking database changes. In most cases, a named journal is created in an AS/400 library and works with one journal receiver at a time. A journal can be viewed as a controller of a journal receiver. When operations (such as SQL insert, update, delete and other AS/400 operations) are applied to database objects, the changes (known as journal entries in AS/400) on the database objects are sent to a journal receiver. In a DataPropagator environment, AS/400 Capture will read from the database journal and thus capture the changes, and insert them as rows in the Change Control (CD) table. There is at most one CD table per registered source table. Since commitment control is assumed by DataPropagator, an internal table called ASN/IBMSNAP_UOW (Unit-Of-Work) table is used to post committed records. All the control tables are created at installation time or by the administration component. You could use the Start DPR Capture command to start AS/400 Capture. If new source tables get registered while the Capture program is running, you can run the Initialize DPR Capture command for Capture to recognize newly registered tables.

Internally, the Capture program uses log sequence numbers to communicate with the Apply process through the UOW and CD tables. The Apply program can determine up to what point the journal changes have been captured. CD tables tend to get very large; therefore, the Capture program has a built-in pruning mechanism for CD and UOW tables. All pruning by AS/400 Capture is automatic without user intervention. From the Capture command, a retention limit can be specified (for example, 7 days) and the Capture program then prunes the records from the CD and UOW tables even if they have not been copied by the Apply program. This normally should not happen in normal replication scenarios unless a subscription has been deactivated. Another word for sequence numbers is "synchpoints" in the DataPropagator environment. As mentioned earlier, the Capture program communicates with the Apply program via these transaction boundary synchpoints. In case of failure due to communication error or other system error, AS/400 Capture can be warm-started. A warm-start means that the program remembers where it was in the Journal processing when it went down. During warm-start, The Capture program computes all the warm start information from an internal table called IBMSNAP_WARM_START. Again, no extra configuration or administration from users is needed.

AS/400 Apply
The AS/400 Apply program is an SQL application. Its main job is to copy data from source to target tables. It can also run SQL statements and call stored procedures, useful in modifying data and for signaling with other programs, subscriptions, and Apply processes. Before the changed data can be copied to target tables, the target tables have to be synchronized with the source table; therefore the Apply program does a full refresh of the target table. In other words, all the data in the source table will be copied to the target table at replication start time. After a full refresh is done, the Apply program copies the changes from the join of the CD and UOW tables so that only the committed data gets propagated. The Apply program can be run from any server in the network. It can be used to push or pull data. Pull indicates that Apply is run from where your target tables are with the logical target server. Push means running Apply at a server other than the target server and push data into the target tables. In a Push scenario, Apply can run either on the source server or control server. The choice of push or pull depends on computing environment setting and can have some impact on the performance. Multiple Apply jobs can be run on the same AS/400. Each Apply job is identified by the Apply_qualifier. The Apply_qualifier is defined in the subscription process using the DJRA.

Each Apply job reads subscription definitions from the control server. Apply jobs can share a global control server or have individual control servers. The ability of having multiple control servers increases the flexibility in a distributed computing environment. On the other hand, central management administration favors having just one control server. Better performance can be achieved if the Apply job is run from the control server. Multiple Apply jobs can be used to serve the same target server in order to increase throughput.

Replication features to support distributed computing
The following advanced features are available in Data Propagator/400 to service different companies in enterprise computing.

  • Subscription sets
    When you define the subscriptions, you can group the target tables into a set. A subscription set is tied to an Apply qualifier, so a subscription set is processed by one Apply process. However, an Apply job can process many subscription sets. The important thing to remember is that all the members in a subscription set share the same fate. If one member needs a full refresh, maybe because of gap detection, each member in the set will have a full refresh or if one member gets rolled back, the whole subscription set will be rolled back. Thus, all related target tables in a set are committed in a single unit of work. All the target tables in a subscription set also get serviced in one connection to optimize performance.

  • Time and event-based replication
    You can select the type of replication you want: relative timer, event based, or both. If you select a relative time, you can also specify the frequency. The Apply program services these subscriptions in a round-robin method. That means it will not always start the subscription at the same time due to network delays. Use event-based replication if you want your subscription set processed when a particular event occurs. You can specify the EVENT_TIME (a timestamp indicating the time when the Apply processing begins), the time to move the data and the END_OF_PERIOD, which means that you want committed transactions copied at or before this time.

  • Build-in programming capability
    You can specify SQL statements or a call stored procedure to be run at the source or target servers before or after the Apply program processes the subscription set. You can define run-time processing statements using SQL statements and stored procedures before and after the Apply program processes the replication subscription. The run-time processing statements can be run at the source server before the answer set is retrieved, and at the target server before and after the answer set is applied. The stored procedures use the SQL CALL statement. Acceptable SQLSTATEs can be defined for each processing statement as well.

  • Further replication flexibility with ASNDONE and ASNLOAD
    You can use the ASNDONE user exit to tie in an application depending on the status returned from the Apply program. For example, based on the status returned, you might want to enable or disable another subscription set, or you might want to post an event in the event table.

  • DB2 source view replication
    Not only regular physical file can be used as registration but also DB2 UDB for AS/400 view as well. This powerful capability easily extends the advantages of your normalized operational databases to sophisticated replication cases.

  • Multiple target table types
    There are many types of target tables to fit your data Warehouse or information processing applications.

    The target type can be as follows:

    • Read-only copies
    • User copy tables, which represent source data at a particular point in time
    • Point-in-time tables, which represent source data at a particular point in time plus some overhead columns
    • Aggregate copies, with SQL column functions either over the base or the change data table.
    • Consistent change data (CCD) tables for further replication extension:
      CCDs are consistent change data tables maintained by Apply. A CCD is a database join of the CD table and UOW tables. When Apply has to propagate changes, it does a database join between a CD table and a UOW table, and the join operation is sometimes costly. As an alternative, a CCD subscription can be defined and further subscribed to multiple read-only copies. Another way of using CCD is to define one as complete and non-condensed. In other words, it maintains the complete histories of data changes. The CCD table architecture is published by IBM and, could be used as a loading dock for non-relational data. For example, IBM DataPropagator Non-relational product can deliver the IMS change data into the CCD table, and this CCD table can be registered as a replication source and subscribe to multiple read-only copies. CCD is also used as the target type when replicating between DB2 UDB for AS/400 and NotesPump from Lotus. NotesPump provides a DataPropagator activity, which can be defined by the NotesPump administrator to replicate data between DataPropagator-enabled data to Notes databases.
    • Replica:
      Data Propagator supports an update-anywhere replication capability, which provides rigorous conflict detection and automatic compensation for offending transactions, allowing you to maintain the integrity of your primary database and its many distributed replicas. In update-anywhere replication, you run one Apply process to pull changes to the non-master table called replicas and push changes from the replicas at the same time. The Apply component running at the replica site detects update conflicts after they occur during the subscription cycle. When transactions are rejected, the Apply program compensates the transactions at the replica site. During this process, the Apply inserts rejection codes for every rejected transaction in the Unit-Of-Work (UOW) table. Conflict detection is provided at three levels: no detection, standard detection, and enhanced detection. When a conflict is detected and compensated, the replication subscription is considered failed. All related transactions are checked for conflicts and are also rejected. All the detection is handled by the DataPropagator system without user intervention.
    • Blocking factor for computing resource management

Depending on computing environment, it is not uncommon to have a lot of data needed to replicate by the Apply program. Using the blocking factor, the Apply program splits the answer set by timeframe instead of size. The timeframe is called max_synch_minutes and is defaulted to 30 minutes. As a result, Apply copies data 30 minutes at a time. If the answer set is too large, the Apply program automatically divides it into smaller blocks.

Remote journal and relative record number replication

Figure 6. Remote Journaling

Remote journaling provides high-performance (microcode level) journaling of database record changes. Supported on a system level on the AS/400, remote journaling involves storing journal receivers on a remote system, providing better plumbing for data replication and high availability solutions. A remote journal can transport journal entries over SNA- or TCP/IP-based networks, in addition to clustered AS/400 configurations using OptiConnect. DataPropagator takes advantage of the AS/400 remote journal capabilities. Its support for remote journaling allows it to more quickly replicate data changes to your AS/400 data warehouse, as well as reduce the load on your transaction system. With remote journal, Apply and Capture can run on the same system and free up any resource used by DataPropagator on the operational AS/400 system. The Apply program will also have local access to a staging area that eliminates network and file processing.

Relative record number replication support is also available to address the problem where a unique key could not be found for a copy table. It allows a user to have a unique key for CCD and point-in-time target tables when no combination of target columns is unique.

Using IBM DataPropagator
To recap the replication process, AS/400 Capture reads the register table to find out which source tables it needs to start capturing changes for. It then reads the database journal and inserts necessary rows in the CD table and UOW table. There is one CD table per source table, and one UOW table per source server. On the other hand, the Apply program reads the subscription table to identify which subscription sets belong to an Apply qualifier, and it reads the register table to identify the source tables from which it needs to copy data from. AS/400 Apply then communicates full refresh information with AS/400 Capture using one of the staging tables' synchpoint values. After full refresh, the Apply program starts copying from the join of the CD table and the UOW table, and updates the synchpoint to indicate how far it has copied. These synchpoints are also taken into account when the Capture program prunes the CD and UOW tables.

Multiplatform Replication
So far we have only discussed replication between two AS/400 systems. It is very common for customers to have heterogeneous data residing on different platforms. Replication over heterogeneous data sources is often needed to deliver data where you want it. If you deploy an NT Integrated PC Server (IPCS) with your AS/400, heterogeneous data access can be achieved by using DataJoiner.

IBM's DataJoiner provides DB2 UDB for AS/400 with access to a wide variety of databases; this includes the DB2 Family as well as non-DB2 databases such as Oracle and SQL Server. DB2 UDB for AS/400 utilizes Data Joiner's capabilities with simple DRDA SQL requests. Once server mappings and nicknames have been defined in DataJoiner, the AS/400 SQL requests will look identical to accessing a remote DB2 UDB for AS/400 server. IBM DataJoiner can be run with the AS/400 on an NT IPCS: the combination of DataJoiner on NT IPCS and AS/400 gives a new meaning to database middleware. It not only provides heterogeneous data access but also lets you do joins and to update data from different sources. Most importantly, running DataJoiner on NT IPCS makes the AS/400 the single point of control for your data. You can access heterogeneous datasets while enjoying the integration maintenance features of IPCS at the same time.

With DataJoiner running on the NT IPCS, DataPropagator can use its capabilities to transparently exchange data with non-DB2 data sources. This integrated solution makes it easier to populate your AS/400 data warehouse with any business information in your organization, regardless of what database or server is storing your information.

Key features of DataJoiner includes:

  • Access to non-DB2 data sources. With DataJoiner there is only one SQL interface to deal with. DataJoiner databases look and act like DB2 to users or application programs. Programmers do not have to learn different data sources and different commands.
  • Ability to join IBM or non-IBM data.
  • Better performance with optimization in global query rewrite. The Datajoiner optimizer considers the relative Central Processing Unit (CPU) speed of each data source, the relative I/O speed, and the relative network bandwidth. It then combines these factors to decide which code path to execute.
  • Ability to update data on multiple systems using a single transaction.
  • If used with DataPropagator, DataJoiner has the ability to replicate from DB2 or non-DB2 sources to DB2 or non-DB2 sources. In other words, deliver any data, anywhere you want.
  • DB2 Spatial Extender allows you to query the spatial information. DataJoiner allows the mixing of spatial data and traditional data in the same table, which is stored locally on DataJoiner. It recognizes all DB2 extender features.
  • In addition to simplifying the data manipulation language (DML) against heterogenous data sources, DataJoiner also provides data definition language (DDL) transparency. You can issue a DDL against Oracle, using DB2 syntax, and DataJoiner will translate it for you.
  • More powerful applications can be built since DataJoiner supports the joining of multiple databases.

Figure 7. Data Replication Solution

Putting DataPropagator with DataJoiner provides a complete Data Replication solution for small and big companies. DataPropagator with DataJoiner allows you to replicate data from or to non-DB2 sources. The non-DB2 Capture program is trigger-based instead of journal-based. Instead of using CD and UOW tables to store the changes, it uses condensed CCD tables. The Apply program is identical to the one used in homogeneous replication. It is packaged with DataJoiner for Windows NT or for AIX. There is no code change to the Apply program because it works in the nickname namespace, and all the data and tables look like DB2 to the Apply program. Instead of using the Control Center, you use the DataJoiner Replication Administrator (DJRA), see Figure 5, for setup and administration tasks. DJRA is packaged with DB2 UDB and with DataJoiner for NT or AIX. You perform the same steps as you did for homogeneous replication: create control tables at the logical source server, control server, and target server. When DJRA recognizes that DataJoiner is defined to go to a non-DB2 source (such as Oracle or Informix), it creates a register and other staging tables at the source server using the set pass through function of DataJoiner. It then creates the nicknames for replication control tables in the DataJoiner database. During the registration process, the triggers and the CCD table get created. Then, the nickname for the CCD is created in the DataJoiner database. During the subscription timeframe, if the target server is non-DB2, a target table is created using the set pass through function, and a nickname is also automatically created for it in the DataJoiner database.

Since logs or journals of non-IBM DBMS cannot be read by DataPropagator, changes are captured using triggers. Triggers are SQL code that are invoked when insert, update, or delete statement are processed. All source tables that are registered using DJRA have triggers created for them that wait for insert, update, and delete events. Whenever one of these events is committed, the trigger code becomes active and inserts the rows in the CCD table. You can select certain columns to be replicated, and you can select the before image of each column in addition to the after image.

In heterogeneous replication, there is no Capture program. As a result, an update trigger on a staging table is used to prune the CCD tables. After the subscription set is processed, Apply updates the synchtime of the ASN/IBMSNAP_PRUNCNTL table. This trigger logic will check the CCD and prune the data that already has been copied by Apply. In the homogeneous environment, the key point of the handshake between Capture and Apply is via the synchpoint. In heterogeneous replication, Apply posts hexadecimal zeros in the synchpoint column of the IBMSNAP_PRUNCNTL table before it does a full refresh, the update trigger on the IBMSNAP_PRUNCNTL table looks for the hexadecimal zeros and updates to the sequence number up to the point that the triggers have captured the committed data.

The Apply program needs to know the upper limit of the data captured at the source server. The upper limit is to be in the synchpoint of the IBMSNAP_REGISTER table. This is done via the update trigger on another DPROPR global table called, IBMSNAP_REG_SYNCH. This table is only for the non-DB2 sources. The update trigger updates the synchpoints of the register table with the next sequence number. The event that triggers this trigger in turn uses the SQL programming capability of the DataPropagator for processing the data changes. During the subscription process, when DJRA is aware that source is Oracle, Informix, Sybase, MS SQL Server, it will store an update statement against the IBMSNAP_REG_SYNCH table with a column name called trigger_me. Apply executes this statement before it opens its cursors to read the IBMSNAP_REGISTER information. Therefore, Apply gathers the information from the register table and gets the upper limit of the CCDs.

Figure 8. Heterogeneous Replication Process

Figure 8 shows the complete heterogeneous replication process for non-DB2 sources. There are update, insert, and delete triggers on every registered source table. These triggers insert rows in the CCD table. There is one CCD table per source table. Three replication control tables are created in the data source. There is an update trigger on the IBMSNAP_PRUNCNTL table, which is a recursive trigger that does the initial handshake and then prunes the CCD table. The update trigger on the IBMSNAP_REG_SYNCH updates the synchpoint of the register table. In a DataJoiner database there is a nickname for each source table. The Apply program uses this nickname to do a full refresh. A nickname is created for the CCD table as well. The Apply program does the differential refresh from that nickname. It also issues the update statement against the nickname ASN.IBMSNAP_REG_SYNCH table.

Putting it together on the AS/400
Finally, you can realize the true potentials of your data! DataJoiner provides the gateway to heterogeneous data environments. IBM DataPropagator and other IBM replication products in the same architecture provide DB2 replication solutions. Putting IBM DataPropagator and DataJoiner together enable an advanced comprehensive replication solution to move your data from anywhere to anywhere. With the AS/400 as the core e-business server, DataPropagator can be integrated and DataJoiner can run on an NT IPCS. This empowers the AS/400 as the single point of control to your data. You not only can access all kinds of data from the AS/400, but also enjoy a lower cost of ownership with the integration maintenance features of NT IPCS.

IBM DB2 Replication Guide and Reference at

IBM DB2 DataPropagator Home Page at