Imagine a scenario where you buy a new apartment, you’re all packed to move in, and have carefully planned the process through. You are all set to move to your new home, then suddenly you realize–there is a problem. The furniture and artifacts do not fit well with your new apartment.
How would you feel if you had to dump all your furniture and start from ground up? Taking this analogy further, consider the new apartment as your new database and the furniture as the data. We are sure that data would be much more important than furniture for your business, and hence you’d want to retain every single bit of it while you plan to migrate.
As a follow continuation of our technology migration series, in this blog, we shall try to uncover the basics of database migration(DBM), what one should understand, consider and take care of while performing the database migration.
Why and when is Database Migration needed?
As a follow up from our previous application migration blog, today’s piece centers around database migration (also known as schema migration) because it’s one of the most important migrations that deals mainly with selecting, extracting, transferring/updating the data from one database framework to another.
Essentially, the need for database migration can be specific to a business requirement, or even to satisfy the latest multiple regulatory policies instituted by various regulators. While there are no defined situations when DBM is needed, we have listed some situations where this is unavoidable:
- A common reason is to move your outdated system to a system that fulfills your business requirements and modern data needs. New and modern storage techniques have become a necessity in the times of Big Data.
- Certain regulators have made it mandatory for the data to stay in a particular geography only. Hence, necessitating that distributed data is migrated to a single location.
- Some Companies prefer to move an on-premise database to a cloud database. This typically saves on infrastructure and resources of expertise that are needed to support the data, and also make systems quicker.
- Migrating to a new platform ensures comprehensive data integrity. It reduces storage and media costs, which results in significant improvement in ROI.
- Many modern-day companies desire to keep all their data in one place. This way the data is accessible to all the departments of the company.
Migrating the database system to a new platform reduces the disruption occurring to daily business operations. This can be done with minimal manual efforts if the selection of a new database platform is done wisely.
Types of Databases
Before proceeding with database migration, let’s first have a look at the different types of databases:
- Relational Database – Relational databases are known as the workhorses of the database industry. These databases are classified by a set of tables. Tables consist of rows and columns where row contains data instance & column contains data entry for a specific category. SQL – Structured Query Language is a standard programming interface for Relational Database.
- Distributed Database – As the name suggests, the data is distributed at various sites of any organization. Communication links are used to connect sites with each other. This helps to access the distributed database easily.
- Object-oriented Database – This type of database merges attributes of relational database & object-oriented programming. Various items created in C++ & Java can be stored in relational databases, however, an object-oriented database is more suitable for them.
- NoSQL Database – NoSQL database can efficiently analyze large unstructured data stored on multiple virtual servers. In contrast, a relational database can not handle some big data performances efficiently. NoSQL databases can easily manage such cases and are used for a large set of distributed data.
- Cloud Database – Cloud Database is a virtual environment that provides the flexibility to pay per use basis. The user only needs to pay for bandwidth and storage capacity that suits their requirements.
- Graph Database – In simple terms, a graph is a collection of nodes and edges. Graph Database contains nodes that represent entities & edge describes relationships between those entities. It is a type of NoSQL database and uses graph theory to map, store & query relationships.
- Centralized Database – With this type of database, the data is stored at a single centralized location. The database essentially contains application procedures that allow users to access the DB from remote locations as well.
Approaches of Database Migration
Migrating the data from one database platform to another can be a crucial task. If the migration is planned in a LIVE Environment, the migration has to be done with utmost caution. One must choose a convenient migration time before moving forward with data migration. The live systems often experience downtime when the data is being transferred to a new database.
There are mainly two approaches to database migration:
Big Bang Data Migration:
This approach is where one chooses to migrate the complete database at once in a limited time frame. Although big bang data migration seems less complex, it requires sufficient downtime for the live website. Furthermore, with this approach, a complete rollback of the migration process might not be easy to achieve in case the migration fails at any moment.
Trickle Data Migration
With this approach, one has to split the migration process into smaller chunks or phases. Almost like an agile approach to migration, if a single-phase fails, then only that phase needs to be rolled back, and the process repeated. However, Trickle data migration is highly time-consuming and thus might increase the project cost.
Different types of Migration
Relational DB to Relational DB
This approach is the most straight forward migration. There are tons of tools available that perform this type of migration pretty efficiently, almost 100% effective.
Relational DB to Non-Relational DB & Vice-versa
This migration is more difficult in comparison to the aforementioned one. Since Relational DB & Non-Relational DB are fundamentally different, their migration may not be 100% efficient. Essentially, migrating to Non-Relational DB means sacrificing the ACID properties (atomic, consistent, isolated, and durable) which guarantee the scalability of a database.
However, there are multiple free tools available that support database migration from Relational to Non-Relational DB. Though using them is not widely recommended as these tools do not support the system’s schema structure and most seem too rigid to adapt the system requirements.
Though there some basic conversion tips that we can render for this type of migration that we can consider (for ease, let’s consider MySQL as our Relational Database and MongoDB as our Non-relational Database)
- Convert MySQL String data type to String in MongoDB. This may include char, varchar, blob, text, etc.
- Convert MySQL Numeric data type to Number in MongoDB. This may include int, float, decimal, double, etc.
- Convert MySQL Date data type to Date in MongoDB. This may include the date, year, timestamp, etc.
- Convert MySQL Bool & Boolean data type to Boolean in MongoDB.
- You can evaluate other cases in the same manner.
Migrating with a Hybrid Model
The hybrid model design integrates the two popular database models in a single framework whilst mitigating the drawbacks of each system. For example, we can always go with a hybrid approach where we can exploit a Relational DB for less demanding operations, in combination with a non-relational DB for data-intensive initiatives.
Planning migration strategy before execution can ensure a smooth migration process. That being said, we need to always have a Plan B. Assuming a worst-case scenario where there is some data loss, or the data gets corrupted while executing the migration; you must be prepared to restore the data to its original state before trying again. This is why Data Backup is a highly imperative step during the DBM (Database Migration). So, what options exist to ensure secure data backup, let’s delve in.
One of the most efficient and secure methods to protect your migration initiative is to execute a back-up to cloud storage. Essentially, when you back-up your data to cloud storage, your files are stored off-site. This eliminates local hardware vulnerabilities that may cause problems. So, why a cloud back-up?
- Cloud back-ups are affordable because you need to pay per use only.
- A Cloud back-up is secure as it provides end-to-end encryption.
- Allows for easy disaster recovery and data restoration
- Robust cloud back-up options include the entire back-up of the system image. Thus, you need not re-install the OS. Computers & servers are restored to the last functioning version.
File sharing software
Software suites like Dropbox are growing to meet the needs of their users. Dropbox maintains versions of files that can be restored when required. Similar to cloud back-up, this option is affordable and files are stored offsite. What are its advantages?
- Files can be restored to any system at any location.
- It is free and supports collaboration.
- Highly secure (cloud storage encrypts your files in transit)
- Offline Working Capabilities
A downside, however, is that restoration is not automated. You need to copy & paste the data to the file-sharing directory structure in order to save the data. The files must reside in a defined structure, or else they will not be backed up. If the files are in a shared folder or directory, you will probably require more bandwidth to back-up those files.
Generally speaking, there are other available mediums for backing-up like a back-up to a flash drive or external hard drive. But these options are not recommended as they are not entirely secure compared to cloud back-up or file-sharing software. For instance, any physical damage to a hard drive can cause data loss. Furthermore, they are more vulnerable and susceptible to ransomware attacks and crypto-viruses.
How to ensure that data is secure?
When carrying out data migration, you need to make sure that sensitive data is not breached or tampered with. If migration goes wrong, it may lead to greater consequences and result in data leaks or data loss such as the examples cited here in 2020.
Unfortunately, data leaks may cause damage to the client’s reputation, lead to loss of business and customers; or in some cases, may provoke legal actions. To avoid all such consequences, you need to draw up a security plan beforehand keeping in mind the migration strategy.
- To begin with, reliable & secure server access and data access should be on your priority list.
- Increase the number of permissions that are required for data transfer. In larger organizations, security departments limit access to the servers and define the data migration between the servers involved.
- Data is at high risk when more parties are involved. So, avoid transferring between parties via portable storage devices or emails. In such instances, the data can be easily compromised.
- To ensure secure access, storage, and retrieval of data, one must constantly practice encryption and decryption. You can use hybrid encryption algorithms to ensure maximum data security. But this is not recommended to everyone. If the migration fails, then the data would be cluttered up and might lead to data corruption or data loss.
To ensure secure migration, avoid using primitive tools. Using primitive tools might weaken your system and leave loopholes for hackers to access. You must employ robust tools for data migration that are functionally specific.
Data Migration Process
Data migration is principally a multiphase process, and following steps should be followed to avoid data loss and ensure safe database migration.
- Collect the business requirements analysis and define the key goal that one needs to achieve with DBM.
- Define the scope
- Conduct extensive Data profiling:
- Review the source data, data format, review the schema structure, content and the relationships between the data instances
- Understand the destination system
- Identify the stakeholders
- Budget the whole activity
- Data backup
- Make sure the data that you are migrating is backed up securely. Using a cloud backup is recommended.
- Ensure the destination is clean and protected from any data hacks.
- Resource availability:
- The time availability for the migration and down time for the destination system.
- Make sure the human resource hired does have the correct skillset.
- Identify the right tool and scripts.
- Data Migration Execution:
- The migration process may include scripting, ETL tools, or other comparable tools to move the data.
- At the time of migration, you will transform data, normalize the data types, and at last, check for possible errors.
- Testing & Tuning:
- The team and client team needs to be critically sure that all of the data is correctly migrated
- So, check whether the data is moved correctly, whether the data is complete and ensure that there are no missing values.
- Also, be sure that the data is valid and does not contain any null values.
- In case there is any data mismatch, there should be a data rollback and the whole process should be restarted.
Once the new database is live, a system can be set up to audit the data. This will ensure the accuracy of the database migration and draw attention to incomplete and inaccurate data.
Potential Risks with Database Migration
Database Migration is a highly intricate procedure and it comes with its risk and uncertainty. You can always overcome these by proper planning and execution. The risks that can be encountered are:
- Out of date source systems: Here, the data source can be out of date, Redundant, Obsolete or Trivial
- Disparate database architecture: In this scenario, the source databases could be located at multiple locations, and they might have totally different architectures from one another, yet the destination database needs everything to be in sync.
- Extended Downtime: There are cases where the planned migration takes longer than expected, and due to this there is an extended downtime for the system. This might cause business loss for the end client and might not be acceptable.
- Potential Data loss: Not all data losses can be identified during the testing phase. Some data loss instances might be identified eventually when the system gets enough traffic.
Migration from a real-time database: For any high transaction website, migrating the database is always difficult as there are constant data updates taking place; and migrating live & real-time data is not possible. There has to be a downtime for any migration to happen.
You can always seek guidance from third party experts on Database Migration. We are Creole Studios do specialize in Database migrations, and we would be happy to help or consult in any such endeavor. Stay strapped in for the fourth and final blog of the technology migration series.