A REVIEW ON DATABASE MIGRATION STRATEGIES, TECHNIQUES AND TOOLS

A Relational Database Migration (RDBM) has constantly complex, tedious, and amplified procedure because of heterogeneous structures and information sorts of RDB. This paper shows the sorted writing audit of existing relocation methods, overviews the interpretation strategies, furthermore talks about specialized issues for making the movement process compelling. This review assesses the impact of existing migration strategies and shows how it has been designed according to recent trends. In software industries plenty of Database Migration Tools (DBMTs) are available, but finding effective one is still risky task. Hence, we have reviewed some of Database Migration Tools (DBMTs) and suggested several basic criteria for evaluating migration toolkits. The discussion of these issues will help the researchers and also database practitioner in planning migration projects.


INTRODUCTION
Migration of Relational Database (RDB) means that data from source RDB to target RDB which includes schema translation and data transformation. Since 1990 so many approaches starts with legacy migration after one decade due to growth of information technology many software industries are developed their own migration process tool for their essential applications. Earlier researchers Barron C et al., (1974) summarized reasons where the purposes of database migrations are suggested. The core reason of need of migration is upgrading the existing system into developed system according to industry requirements. The impending problem is redefined of existing database and storage system in terms of complex language. In the industry rule, during the migration when the source and target databases are structurally different or data is inconsistent across multiple data sources. Due to this problem only the researches and development of migration tools are emerged in later years. Joseph R. Hudicka (1998) provides complete solution of data migration methodology for migration project. As his view the entire database migration processes are divided into several phases which are completed one by one. In his methodology, the database migration deals with row counts, columns counts and related statistics to the source database. The disadvantage of his method is does not migrate null and numeric values and error has been occurred for key constrain data fields. Relational database migration has not been subject to broad academic research for past one decades but due to tremendous advent of Open Source Database Management System (OSDBMS), subsequently the database migration workflow in proposed and which are all target system for Relational Database Migration (RDBM). Based on detailed analysis of the existing literature review on migration strategies and techniques, the successful Relational Database Migration (RDBM) among heterogeneous databases including Open Source Databases (OSDBs) are still to delve. However, this paper examines most of the database migration primitive and recent strategies, approaches, models and toolkits. The rest of this paper is organized as follows: Section 2 summarizes the impact of primitive strategies such as legacy migration and reverse engineering. Section 3 different migration strategies and techniques related to Relational Database Migration (RDBM). Section 4 presents the different types of Database Migration Toolkits (DBMTs) and evaluation criteria. Section 5 is about conclusion and future work.

Legacy Migration
Legacy relocation procedure is an essential component when completing an Information System. A significant part of the writing on database relocation is to some degree suitably fervent to legacy movement. Relocating of these frameworks can be a period serious and to a great degree extravagant errand so it makes sense that associations seek after to streamline the movement process and to make it as practical as could be allowed. These systems differ substantially from modern enterprise architectures since the presentation, business logic, and data access tiers are generally all part of the same tier. Legacy system migration often encompasses a great number of research areas including reverse engineering, business reengineering, schema mapping, application development, and translation. A technical report from CMU-SEI (Weiderman et al., 1997) has been developed an enterprise framework for the legacy migration systems. This acts as a guide for organizations planning software evolution efforts, such as migrating legacy systems to more distributed open environments, as shown in Figure 1. A legacy migration life cycle was also created, which includes the following procedures:  , 1999). Major disadvantage is to find and separate business logic from presentation and data logic and also hard to manipulate and retrieve data because of the redundancy. Legacy systems and migrations are still alive because of their distinct characteristics and good pedigree. However, it is possible to either eliminate or integrate the legacy systems by following effective migration strategy and appropriate migration tools.

Database Reverse Engineering (DBRE)
Reverse engineering is the primitive procedure generally utilized as a part of programming designing procedure. It is the first stride on relocation way through examination of the source social database framework. The importance of figuring out is the procedure of database code, documentation, and conduct to recognize its segments and their conditions to relocate and make the objectives framework ideas and configuration data (SEI, 2004). The disadvantage of this process is the subject system cannot to be modified. In this technique the source system considered as to be Data Dictionary (Source and Target Databases) and DDL (Data Definition Language). Database Reverse Engineering is deals with the migration of source database design specification (Schema information) and tasks of understanding databases.

Figure.2.2. A framework for Database Reverse Engineering (DBRE)
Database Reverse Engineering (DBRE) often used identical with reverse engineering the data structure. In software industries the reverse engineering process is a methodology to analyze all schema information of the source database system's and gains a deeper understanding of their internals by the process of DDL Extractions, and converted into logical to conceptual schema for the target database as shown in figure 2. During this process, one has to identify the volume of data to be converted, extract the current data structure (schema) as well as relation between data and system's procedural components and complete and update documentation fragments. Database reverse engineering provides a valuable analysis toolkit and points useful resources inside and outside the system that have to be examined and questioned. However, reverse Physical Schema

Conceptualization Process
Logical Schema

Conceptual Schema
Source System Target System engineering data structures still is a complex task. This is the major issue of the DBRE. Due to this complexity of data structures, successful migration is still to examine and for this issue, database analyst are forced to make arbitrary choices, to develop new technique for successful migration.

DATABASE MIGRATION STRATEGIES AND TECHNIQUES
This chapter introduces different migration strategies and techniques related to database migration. Section 3.1 discusses various strategies to database migration whereas section 3.2 discusses contemporary conversion techniques.

Database Migration Strategies:
To apply progressive database technology, software industries are faced with the challenge of migrating data and in most cases, applications as well. A several database migration strategies exist in software industries which can present information about database conversion and applications. The following table is showing a literature survey of database migration model strategies.
A ll different approaches are categorized into three basic strategies related to database migrations. First one is for handling data stored in database through OO/XML interfaces. Second one is connecting existing source relational database to a conceptually different target database system. Third is migrating completely both schema and data in source database to equivalent target database system. The first and second strategies are deal with schema translation only. The advantage of first approach is that the relational data is still accessible as relational database; the disadvantage is the inefficiency of having translated data manipulation language (DML) commands between the two layers. The second strategy is to implement more of a migration rather than simply to overlay an interface. In this case, relational technology is migrated to objects. The most significant step in this process is to derive an object-oriented scheme from a relational scheme from the existing source system. Due to substantial investments in many traditional relational databases part of their data may need to be formatted and implemented in a new different platform. Hence, constructing a gateway interface between the two databases might be preferred. Migrating to a new DBMS might be a good decision to make if the existing system is too expensive to maintain. The impact of the above three basic strategies are summarized the following subsequent sections.

Strategy 1: Migration through Object Oriented Interfaces
Data may be required to be processed in object/XML form and stored in relational form based on the concept of object for programs and database for persistence. This process requires objectto/from-database mapping techniques; such mapping is bi-directional on demand of updating a relational database using Object Oriented Interfaces. This is the reverse direction from where object-based schemas are translated into a database schema. While objects are associated via references, data in source database tables are linked through the values of primary keys and foreign keys. A single object might be represented by several tuples in several tables, and therefore, joining these tables is required for queries. This constant conversion leads to a semantic gap between the two different paradigms. To avoid this, developers have to write large amounts of code to map objects in programs into tuples in a database, which can be very time-consuming to write and execute or to use mapping query systems/middleware, which is a software layer that links OO Programming Languages (OOPLs) concepts to data stored in RDBs through ODBC or JDBC drivers. However, mapping using middleware requires time for schema mapping, on each association that stored data are accessed.

Strategy 2: Database Integration
A connection can be established between source and other target databases which allow the applications built on top of a new DBMS to access both relational and object/XML DBMSs, giving the impression that all data are stored in one database. This represents a simple level of database integration (Christine Parent and Stefano Spaccapietra, 2000) between systems. This is achieved using a special type of software called gateways, which support connectivity between DBMSs and do not involve the user in SQL and database schema. Hence, queries and operations are converted into SQL and the results are translated into target objects. Many applications use two or more underlying databases. On retrieving data from both systems, the unification of their two schemas is necessary by providing twoway mapping. During integration, systems cooperate autonomously by creating a unified and consistent data view for several databases, hiding heterogeneities and query languages. Most commercial DBMSs such as Oracle, MySQL and SQLServer provide flexibility of mapping and gateways construction among heterogeneous databases.

Strategy 3: Database Migration
Migration of a relational database into its equivalents is usually accomplished between two databases according the literature (Alhajj and Polat, 2001). The first database is a relational database, called the source, and the second, called the target, which represents the result of the migration process. In addition, the process is performed with or without the help of an intermediate conceptual representation, e.g., an ER model as a stage of enrichment. The input source schema is enriched semantically and translated into a target schema. Data stored in the source database are converted into the target database based on the target schema. Generally, relations and attributes are translated into equivalent target objects. Foreign keys may be replaced by another domain or relationship attributes. Weak entity relations may be mapped into component classes, multi-valued or composite attributes inside their parent class/entity. Other relationships, such as associations and inheritance, can also be extracted by analyzing data dependencies or database instances. In data conversion (Abdelsalm Amaraga Maatuk, 2009), attributes that are not foreign keys become literal attribute values of objects, elements or sets of elements. Foreign keys realize relationships among tuples, which are converted into value-based or object references in a target database. The challenge in this process is that the data of one relation may be converted into a collection of literal/references rather than into one corresponding type. This is because of the heterogeneity of concepts and structures in the source and target data models.

Translation Techniques:
Existing techniques can be classified into two types: (i) Source-to-Target (S2T), including flat, clustering and nesting translation techniques, and (ii) Source-to-conceptual-totarget (SCT) translation. In some of these techniques, data might be converted based on the resulting target schema.

Source-to-target (S2T) Technique:
This type of technique translates a physical schema source code directly into an equivalent target. However, as the target schema is generated using one-step mapping with no ICR for enrichment, this technique usually results in an ill-designed database as some of the data semantics are ignored. This approach could take the following forms:  Flat Technique: This technique converts each relation into object class/XML element in the target database (Wang C et al.,, 2006). FKs are mapped into references to connect objects. However, the flattened form of RDBs is preserved in the generated database, with which object-based model features and the hierarchical form of XML model are not exploited. This means that the target database is semantically weaker and of a poorer quality than the source. Moreover, creating too many references cause degraded performance during data retrieval.  Clustering Technique: This technique is performed recursively by grouping entities and relationships together starting from atomic to construct more complex entities until the desired level of abstraction is achieved (Sousa et al., 2002). A strong entity is wrapped with all of its direct weak entities, forming a complex cluster labeled with the strong entity name. This technique works well when the aim is to produce hierarchical forms with one root. This technique may reduce search time by avoiding join operations, and thus speeding up query processing, however, it may lead to complex structures and is prone to errors in translation. In addition, materializing component entities within their parent/whole entities may cause data redundancy, the loss of semantics and the breaking of relationships among objects.  Nesting Technique: This technique is uses the iterated mechanism of a nest operator to generate a nested target structure from tuples of an input relation. The target is extracted from the best possible nesting outcome. However, the technique has some limitations, e.g., mapping each table separately and ignoring integrity constraints. Besides, the process is quite expensive, as it needs all tuples of a table to be scanned repeatedly to get the best possible nesting.

SCT Technique:
This type of technique enriches a source schema by semantics that might not have been clearly expressed in it and their inter-relationships (Alajj et al., 2003). Then, the schema is translated from logical into conceptual through recovering the domain semantics (e.g., primary keys, foreign keys, cardinalities, etc.) and making them explicit. Finally, the results are represented as a conceptual schema using database reverse engineering (DBRE), which can be translated into the target effectively. In this way the technique results in a good welldesigned target database.

DATABASE MIGRATION TOOLS (DBMTS)
A number of prototypes and tools have been developed to facilitate the migration of relational databases into target databases. DBMTs presented a system, called the knowledge extraction system (KES), for generating an EER model from RDBs. KES has been developed to extract domain semantics by analyzing the RDB schema and data instances. However, various semantic constraints, schemamapping constructs and data migration techniques were not addressed adequately in this work. In later years plenty of tools are arrived for database migrations. In software industries, one of the major problems is ensuring quality database administration the tasks connected to a migration workflow is diverse and complicated. Doing all of them manually requires plenty of time and a migration team highly experienced in the source as well as the target system. As both factors are not available in most situations, migration tools may come handy and should be considered to ease the migration workload.  Types of database the DMT supports, If does not support the database from or to which users need to perform the migration.

2.
How the database transfer is configured through the DMT's interface. This configuration will determine whether the data transfer can be executed repeatedly.

3.
DMT should checks database integrity before execute a data transfer. Sometimes database migration operation fails because of database integrity violation among the data.

4.
How well the DMT incorporates customized data transfer requirement.

5.
Ensuring the correctness of a completed database migration operation. While the criteria outlined above are adequate for the complex project that developing for Database Migration Tools. The complexity of a general extract, transform, and load (ETL) system may go beyond what these criteria can evaluate. Most of the Database Migration Tools (DMTs) are not dealt with complex data like image, audio, and video files. Still the investigations are needed on dealing of complex files. In addition, some other criteria, such as data migration performance and cost, can be important to project managers and DMT developers.

CONCLUSION
In this paper, we have presented a deep study on existing database migration strategies since from 1987. And also investigation of primitive migration like legacy migration and database reverse engineering. We pointed out limitations of each strategies while the migration process. Each migration strategies have some rules to enable the process, which might be a point of disadvantages of migration process. The schema translation using S2T technique, we have note that most works for migrating to XML are following SCT techniques, focusing on generating a DTD schema and data. Due to focusing on schema rather than data, strategies either data loading or enabling on the structure of target databases and data remain stored in relational databases. Moreover, there are still shortcomings in implementations of schema translation and data conversion mechanisms. By using of middleware technologies, it may lead slow performance making the process to expensive at run-time because of dynamic mapping between source and target databases. Most of the database migration techniques generate a database that to either flat relational or has a deep level of clustering or nesting. The existing technique does not provide a solution for more than one target database or for either schema or data conversion. Besides, none of the existing techniques can be considered as a method of time and space complexity of migration between source and target databases with experimental results. In this paper, we presented some examples of migration toolkits. Database Migration Tool (DBMT) kits could prove very useful in supporting minimizing the risk, stay on budget, keep downtime to minimum, and, in case of failure of migration schema and data be able to remain in source database. Unfortunately, there is not much choice with respect to migration tools for open source databases. This might lead to the migration team having to perform most of the steps "manually". However, database migration is a common task that most database administrators need to tackle. Knowing how to evaluate and choose the right Database Migration Tool (DMT) can be vital to the fate of a software project that might directly contribute to the success of a business operation. Yet there are few guidelines in how to evaluate the usefulness and effectiveness of a general DBMT. We have concluded with illustrates five basic criteria that can serve as standards for current and future Database Migration Tools. And this review paper expose that developers will serve users well in evaluating DBMT products and for future research.