{"id":5055,"date":"2023-08-03T11:39:36","date_gmt":"2023-08-03T11:39:36","guid":{"rendered":"https:\/\/decentro.tech\/blog\/?p=5055"},"modified":"2025-11-13T11:01:16","modified_gmt":"2025-11-13T11:01:16","slug":"data-archival-aws-dms-guide","status":"publish","type":"post","link":"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/","title":{"rendered":"Data Archival For Fintechs: A Step-By-Step Guide with AWS DMS"},"content":{"rendered":"<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_17 counter-hierarchy\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\">Table of Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><\/span><\/div>\n<nav><ul class=\"ez-toc-list ez-toc-list-level-1\"><li class=\"ez-toc-page-1 ez-toc-heading-level-2\"><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/#Why_Data_Archival_Matters_in_the_Fintech_Space\" title=\"Why Data Archival Matters in the Fintech Space\">Why Data Archival Matters in the Fintech Space<\/a><\/li><li class=\"ez-toc-page-1 ez-toc-heading-level-2\"><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/#What_is_AWS_DMS_why_choose_it_to_archive_data\" title=\"What is AWS DMS &amp; why choose it to archive data\">What is AWS DMS &amp; why choose it to archive data<\/a><\/li><li class=\"ez-toc-page-1 ez-toc-heading-level-2\"><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/#Steps_to_archive_data_using_AWS_DMS\" title=\"Steps to archive data using AWS DMS\">Steps to archive data using AWS DMS<\/a><\/li><li class=\"ez-toc-page-1 ez-toc-heading-level-2\"><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/#Conclusion\" title=\"Conclusion\">Conclusion<\/a><\/li><\/ul><\/nav><\/div>\n\n<figure class=\"wp-block-image size-large featured-post-img\"><img loading=\"lazy\" 
width=\"855\" height=\"855\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/Data-Archival-For-Fintechs-A-Step_By_Step-Guide-with-AWS-DMS.png\" alt=\"Data Archival For Fintechs A Step_By_Step Guide with AWS DMS\" class=\"wp-image-6156\"\/><\/figure>\n\n\n\n<p>What would you do with the large amounts of data lying in your production database, using up storage &amp; slowing down your queries?<\/p>\n\n\n\n<p>Data archival will come to the rescue.<\/p>\n\n\n\n<p>In today&#8217;s data-driven world, organisations are generating and accumulating massive amounts of data at an unprecedented rate. With the country&#8217;s digital infrastructure being extremely robust owing to the India stack, the next wave of innovation must build on the existing foundation. If fintech&#8217;s present is driven by access, its future will be rooted in prediction, whether it is data-driven decision-making for product development or creating more tailored and relevant offerings via behaviour analysis. Fintech companies must continue harnessing data&#8217;s power to predict customer needs, anticipate market trends, and make informed decisions. This data-driven evolution will undoubtedly shape the <strong><em>future of fintech in India within a $200 billion market<\/em><\/strong>. As data volumes continue to grow, it becomes increasingly essential for businesses to implement effective data management strategies, including data archival.&nbsp;<\/p>\n\n\n\n<p>Data archival involves systematically storing and preserving data that is no longer actively used but may still hold value for compliance, historical analysis, or future reference. 
In this blog post, we will explore the importance of data archival and highlight best practices for implementing a successful data archival strategy.<\/p>\n\n\n\n<h2><span class=\"ez-toc-section\" id=\"Why_Data_Archival_Matters_in_the_Fintech_Space\"><\/span><strong><a href=\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/\">Why Data Archival Matters in the Fintech Space<\/a><\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1761\" height=\"576\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/Data_archival_bullets.png\" alt=\"Why Data Archival Matters in the Fintech Space\" class=\"wp-image-5082\"\/><\/figure>\n\n\n\n<p>Data archival is indispensable for fintech companies to comply with regulations, support business operations, enable data-driven decision-making, and provide superior customer experiences. It forms the foundation for regulatory compliance, risk management, and effective governance while also serving as a valuable asset for analytics, dispute resolution, and strategic planning. Emphasizing the importance of data archival ensures that fintech companies maintain a competitive edge, build trust with customers, and thrive in the rapidly evolving digital financial landscape through:<\/p>\n\n\n\n<ul><li><strong>Compliance and Regulatory Requirements: <\/strong>Industries such as banking and finance have strict compliance and regulatory requirements for data retention. Archiving data helps organisations meet legal obligations by preserving data for a specified period as mandated by industry-specific regulations.<\/li><\/ul>\n\n\n\n<ul><li><strong>Cost Optimization: <\/strong>Storing large volumes of data on primary storage systems can be expensive. 
By implementing a data archival strategy, organisations can move less frequently accessed or non-critical data to lower-cost storage tiers, freeing up valuable resources on primary storage and reducing operational costs.<\/li><\/ul>\n\n\n\n<ul><li><strong>Improved Performance: <\/strong><a href=\"https:\/\/www.archondatastore.com\/blog\/data-archiving\/\" target=\"_blank\" rel=\"noreferrer noopener\">Archiving infrequently accessed data<\/a> from primary storage can significantly improve system performance. With historical data removed from primary storage, query performance improves as well.<\/li><\/ul>\n\n\n\n<ul><li><strong>Data Preservation and Historical Analysis: <\/strong>Archiving data allows organisations to preserve historical records, enabling them to perform trend analysis, conduct research, support audits, and gain valuable insights from past data. It can be valuable for long-term business planning &amp; decision-making.<\/li><\/ul>\n\n\n\n<p>By now, you would have understood the importance of data archival. But how should it be done effectively &amp; efficiently?<\/p>\n\n\n\n<p>Enter <a href=\"https:\/\/docs.aws.amazon.com\/dms\/index.html\" target=\"_blank\" rel=\"noreferrer noopener\">AWS Database Migration Service<\/a>!<\/p>\n\n\n\n<h2><span class=\"ez-toc-section\" id=\"What_is_AWS_DMS_why_choose_it_to_archive_data\"><\/span><strong>What is AWS DMS &amp; why choose it to archive data<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>AWS Database Migration Service is a fully managed service provided by Amazon Web Services. 
It enables businesses to migrate their data seamlessly and securely between various sources, including databases, data warehouses, and other AWS services.&nbsp;<\/p>\n\n\n\n<p>In the case of Decentro, the source is our production MySQL database &amp; the target is Amazon S3.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1563\" height=\"923\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/Data_archival_bullets_2.png\" alt=\"What is AWS DMS &amp; why choose it to archive data\" class=\"wp-image-5100\"\/><\/figure>\n\n\n\n<p>AWS DMS provides the following benefits:<\/p>\n\n\n\n<ul><li><strong>Seamless Integration:<\/strong> AWS DMS seamlessly integrates with various data sources, including on-premises databases, Amazon RDS, Amazon Redshift, and more. This ensures compatibility with your existing infrastructure, allowing you to archive data to AWS storage services smoothly.<\/li><\/ul>\n\n\n\n<ul><li><strong>Fully Managed Service:<\/strong> AWS DMS is a fully managed service that handles the complexities of data migration and archival. It eliminates the need for manual scripting or data transfer tasks, freeing up valuable time and resources for your organisation.<\/li><\/ul>\n\n\n\n<ul><li><strong>Homogeneous and Heterogeneous Data Migration:<\/strong> AWS DMS supports both homogeneous and heterogeneous data migration, allowing you to archive data between the same or different database engines. This flexibility enables organisations to migrate data from legacy systems or other cloud providers to AWS storage services effortlessly.&nbsp;<\/li><\/ul>\n\n\n\n<ul><li><strong>Continuous Data Replication:<\/strong> With AWS DMS, you can achieve continuous data replication during archival. 
It captures changes made to the source database and replicates them to the target storage in near real-time, ensuring data integrity and minimising potential data loss.&nbsp;<\/li><\/ul>\n\n\n\n<ul><li><strong>Data Transformation and Filtering:<\/strong> AWS DMS offers data transformation capabilities, allowing you to modify data structures or formats during the archival process. You can also apply filters to select specific datasets for archiving, further optimising storage usage and reducing costs.<\/li><\/ul>\n\n\n\n<ul><li><strong>Scalability and Resilience:<\/strong> AWS DMS is designed to handle large-scale data archival projects, accommodating datasets of any size. It automatically scales resources to match the workload, ensuring optimal performance and reliability during the archival process.<\/li><\/ul>\n\n\n\n<ul><li><strong>Automation: <\/strong>AWS DMS tasks can be automated using an AWS Lambda function that executes the task, which can in turn be scheduled using Amazon EventBridge.<\/li><\/ul>\n\n\n\n<p>AWS DMS sounds perfect for handling data archival activity.&nbsp;<\/p>\n\n\n\n<p>Let\u2019s take a look at how this process works.<\/p>\n\n\n\n<h2><span class=\"ez-toc-section\" id=\"Steps_to_archive_data_using_AWS_DMS\"><\/span><strong>Steps to archive data using AWS DMS<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>We will use the AWS console to set up the data migration task.<\/p>\n\n\n\n<ol><li><strong>Creating a replication instance<\/strong><\/li><\/ol>\n\n\n\n<p>Log in to the AWS console, go to the Database Migration Service section, and click on Replication instances in the Migrate data section on the left pane.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1895\" height=\"676\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image1.jpg\" alt=\"creating replication instance\" class=\"wp-image-5056\"\/><\/figure>\n\n\n\n<p>Click on the <a 
href=\"https:\/\/docs.aws.amazon.com\/dms\/latest\/userguide\/CHAP_ReplicationInstance.html\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Create replication instance<\/strong><\/a> button &amp; fill in the required fields.&nbsp;<\/p>\n\n\n\n<p>Select the <strong>Instance Configuration<\/strong> &amp; <strong>AWS DMS version<\/strong> based on your production workload. Also, select the failover setting as <strong>Multi-AZ<\/strong> or <strong>Single-AZ<\/strong> in the <strong>High Availability<\/strong> option.&nbsp;<\/p>\n\n\n\n<p>Select how much <strong>Storage<\/strong> you want to allocate to your replication instance.<\/p>\n\n\n\n<p>Choose the <strong>VPC<\/strong> &amp; <strong>Subnet<\/strong> on which you want your replication instance to run.<\/p>\n\n\n\n<p>Choose the weekly maintenance time in the <strong>Maintenance window<\/strong>, at which AWS will automatically perform OS updates, security patch updates &amp; AWS DMS version updates.&nbsp;<\/p>\n\n\n\n<p>You can also add <strong>Tags<\/strong> to your replication instance based on your requirements.<\/p>\n\n\n\n<p>Verify the selected configuration, and click on the <strong>Create replication instance<\/strong> button. 
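These console clicks map one-to-one onto the DMS API. Below is a minimal boto3-style sketch of the same configuration; the identifier, instance class, storage size, and maintenance window are illustrative placeholder values, not recommendations.

```python
# Sketch only: build the parameters for creating a DMS replication instance.
# All values below (identifier, class, storage, window) are illustrative.
import json


def replication_instance_params(identifier: str) -> dict:
    """Mirror the console form fields as create_replication_instance kwargs."""
    return {
        "ReplicationInstanceIdentifier": identifier,
        "ReplicationInstanceClass": "dms.t3.medium",  # size for your workload
        "AllocatedStorage": 50,                # storage in GB
        "MultiAZ": False,                      # High Availability: Single-AZ
        "PubliclyAccessible": False,
        "PreferredMaintenanceWindow": "sun:03:00-sun:04:00",  # weekly window
    }


# With boto3 (the AWS SDK for Python), the actual call would be:
#   import boto3
#   dms = boto3.client("dms")
#   dms.create_replication_instance(**replication_instance_params("archival-ri"))
print(json.dumps(replication_instance_params("archival-ri"), indent=2))
```

Multi-AZ provides failover at extra cost; for a periodic archival job, Single-AZ is often sufficient.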
AWS will take some time to configure the instance.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"691\" height=\"919\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image2.jpg\" alt=\"Connectivity and Security \" class=\"wp-image-5057\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"663\" height=\"894\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image3.png\" alt=\"\" class=\"wp-image-5058\"\/><\/figure>\n\n\n\n<ol start=\"2\"><li><strong>Creating a source endpoint&nbsp;<\/strong><\/li><\/ol>\n\n\n\n<p>In this step, we will create a source endpoint from which the data will be migrated.<\/p>\n\n\n\n<p>We will be using <strong>MySQL on Amazon RDS<\/strong> as our <a href=\"https:\/\/docs.aws.amazon.com\/dms\/latest\/userguide\/CHAP_Source.MySQL.html\" target=\"_blank\" rel=\"noreferrer noopener\">source endpoint<\/a>.<\/p>\n\n\n\n<p>To create endpoints, click on <strong>Endpoints<\/strong> in the <strong>Migrate data<\/strong> section on the left pane. 
Then click on the <strong>Create Endpoint<\/strong> button.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1889\" height=\"592\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image4.jpg\" alt=\"creating a source end point\" class=\"wp-image-5059\"\/><\/figure>\n\n\n\n<p>To create a source endpoint, select the <strong>Source endpoint<\/strong> radio button, tick the <strong>RDS DB instance<\/strong> checkbox &amp; select the RDS instance from the dropdown.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"748\" height=\"453\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image5.jpg\" alt=\"create end point\" class=\"wp-image-5060\"\/><\/figure>\n\n\n\n<p>In the <strong>Endpoint configuration<\/strong> tab, add an <strong>identifier<\/strong> for the endpoint &amp; select the <strong>source engine<\/strong>. In this example, we have chosen <strong>MySQL<\/strong>.<\/p>\n\n\n\n<p>To access the database, you can use <strong>AWS Secrets Manager<\/strong> or <strong>manually<\/strong> provide the required credentials. In this example, we have manually provided the credentials.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"752\" height=\"892\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image6.jpg\" alt=\"end point configuration\" class=\"wp-image-5061\"\/><\/figure>\n\n\n\n<p>After adding the credentials, it is always advisable to test the connection, so select the <strong>VPC<\/strong> &amp; the <strong>replication instance<\/strong> you created in the previous step &amp; run the test. 
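The endpoint can also be created and tested through the API. A hedged boto3 sketch follows; the server name and credentials are placeholders, and note that via the API the endpoint is created first and then tested against the replication instance.

```python
# Sketch only: parameters for a MySQL source endpoint. The server name and
# credentials below are placeholders, not real values.
import json


def mysql_source_endpoint_params(identifier: str) -> dict:
    """Mirror the console's Endpoint configuration tab for a MySQL source."""
    return {
        "EndpointIdentifier": identifier,
        "EndpointType": "source",
        "EngineName": "mysql",
        "ServerName": "your-rds-instance.abc123.ap-south-1.rds.amazonaws.com",
        "Port": 3306,
        "Username": "dms_user",
        "Password": "********",  # or fetch via AWS Secrets Manager instead
    }


# With boto3, the endpoint is created first and then tested against the
# replication instance (the reverse of the console flow described above):
#   dms = boto3.client("dms")
#   ep = dms.create_endpoint(**mysql_source_endpoint_params("mysql-source"))
#   dms.test_connection(
#       ReplicationInstanceArn="<replication-instance-arn>",
#       EndpointArn=ep["Endpoint"]["EndpointArn"],
#   )
print(json.dumps(mysql_source_endpoint_params("mysql-source"), indent=2))
```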
After receiving the successful status, click on the <strong>Create Endpoint<\/strong> button to create a source endpoint.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"750\" height=\"676\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image7.jpg\" alt=\"VPC and replication instance\" class=\"wp-image-5062\"\/><\/figure>\n\n\n\n<ol start=\"3\"><li><strong>Creating a target endpoint<\/strong><\/li><\/ol>\n\n\n\n<p>Similar to the source endpoint, we will create a target endpoint, in which the data will be stored after migration. In the <strong>Endpoints<\/strong> dashboard, click on the <strong>Create Endpoint<\/strong> button.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1882\" height=\"573\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image8.jpg\" alt=\"creating target end point\" class=\"wp-image-5063\"\/><\/figure>\n\n\n\n<p>To create a target endpoint, select the <strong>Target endpoint<\/strong> radio button.<\/p>\n\n\n\n<p>In the <strong>Endpoint configuration<\/strong> tab, add an <strong>identifier<\/strong> for the endpoint &amp; select the <strong>target engine<\/strong>. In this example, we have chosen <strong>Amazon S3<\/strong>. If we select S3, the migrated data will be stored in CSV file format by default. It can also be changed to Parquet file format based on your requirement.<\/p>\n\n\n\n<p>Add the <strong>ARN of the service role<\/strong> to access the S3 bucket. 
Also add the <strong>Bucket name<\/strong> &amp; the <strong>Bucket folder<\/strong> into which the data should be stored.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"676\" height=\"907\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image9_1-1.png\" alt=\"\" class=\"wp-image-5101\"\/><\/figure>\n\n\n\n<p>In the <strong>Endpoint settings<\/strong> section, you can configure <a href=\"https:\/\/docs.aws.amazon.com\/dms\/latest\/userguide\/CHAP_Target.S3.html#CHAP_Target.S3.Configuring\">additional settings<\/a> that you would like to have in the migrated data. Here, we are using the Wizard to add settings in a key-value format.<\/p>\n\n\n\n<p>Based on AWS\u2019s recommendations, you can also use <strong>Extra Connection Attributes<\/strong>, which should be added as semicolon-separated values.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"759\" height=\"853\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image10.png\" alt=\"endpoint settings\" class=\"wp-image-5065\"\/><\/figure>\n\n\n\n<p>After all the settings are added, you can click on the <strong>Save<\/strong> button to create a target endpoint.<\/p>\n\n\n\n<ol start=\"4\"><li><strong>Creating a database migration task<\/strong><\/li><\/ol>\n\n\n\n<p>The last step is to create a database migration task, which will actually migrate the data from the source to the target endpoint.<\/p>\n\n\n\n<p>To create a DMS task, click on <strong>Database migration tasks<\/strong> in the <strong>Migrate data<\/strong> section on the left pane. 
Then click on the <strong>Create task<\/strong> button.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1894\" height=\"575\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image11.jpg\" alt=\"Creating database migration task\" class=\"wp-image-5066\"\/><\/figure>\n\n\n\n<p>In the <strong>Task configuration<\/strong> section, add the <strong>Task Identifier<\/strong> &amp; select the <strong>Replication instance<\/strong>, <strong>source<\/strong> &amp; <strong>target<\/strong> endpoints we created before.<\/p>\n\n\n\n<p>In the <strong>Migration type<\/strong>, you can select from the following three options based on your requirement:<\/p>\n\n\n\n<ul><li><strong>Migrate existing data (Full load only)<\/strong> \u2013 Perform a one-time migration from the source endpoint to the target endpoint.<\/li><\/ul>\n\n\n\n<ul><li><strong>Migrate existing data and replicate ongoing changes (Full load and CDC)<\/strong> \u2013 Perform a one-time migration from the source to the target, and then continue replicating data changes from the source to the target.<\/li><\/ul>\n\n\n\n<ul><li><strong>Replicate data changes only (CDC only)<\/strong> \u2013 Don&#8217;t perform a one-time migration, but continue replicating data changes from the source to the target.<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"799\" height=\"623\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image12.jpg\" alt=\"Creating database migration task 2\" class=\"wp-image-5067\"\/><\/figure>\n\n\n\n<p>In the <strong>Task Settings<\/strong> section, first select the <strong>Target table preparation mode<\/strong> from the following based on your requirement:<\/p>\n\n\n\n<ul><li><strong>Do nothing<\/strong> \u2013 If the tables already exist at the target, they remain unaffected. 
Otherwise, AWS DMS creates new tables.<\/li><\/ul>\n\n\n\n<ul><li><strong>Drop tables on target<\/strong> \u2013 AWS DMS drops the tables and creates new tables in their place.<\/li><\/ul>\n\n\n\n<ul><li><strong>Truncate<\/strong> \u2013 AWS DMS leaves the tables and their metadata in place but removes the data from them.<\/li><\/ul>\n\n\n\n<p><strong>Note:<\/strong> the <strong>Do nothing<\/strong> target preparation mode still creates a table if the target table doesn&#8217;t exist.<\/p>\n\n\n\n<p>In the <strong>LOB column settings<\/strong>, there are the following options:<\/p>\n\n\n\n<ul><li><strong>Don&#8217;t include LOB columns:<\/strong> AWS DMS ignores columns or fields that contain large objects (LOBs).<\/li><\/ul>\n\n\n\n<ul><li><strong>Limited LOB mode:<\/strong> AWS DMS truncates each LOB to the size defined by \u201cMax LOB size\u201d. (Limited LOB mode is faster than full LOB mode.)<\/li><\/ul>\n\n\n\n<ul><li><strong>Full LOB mode:<\/strong> AWS DMS fetches the LOB data in chunks as specified in &#8220;LOB chunk size&#8221;.<\/li><\/ul>\n\n\n\n<p>If you select <strong>Limited LOB mode<\/strong> for the above, define a <strong>Maximum LOB size<\/strong> in <strong>KB<\/strong>.<\/p>\n\n\n\n<p>You can <strong>Turn on CloudWatch logs<\/strong> to log all the activities performed by AWS DMS, which will help debug issues if there are any.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"799\" height=\"725\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image13.png\" alt=\"Task settings\" class=\"wp-image-5068\"\/><\/figure>\n\n\n\n<p>In the <strong>Table mappings<\/strong> section, you can define <strong>Selection rules<\/strong> &amp; <strong>Transformation rules<\/strong>.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"742\" height=\"327\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image14.png\" alt=\"Table mappings\" 
class=\"wp-image-5069\"\/><\/figure>\n\n\n\n<p>In the <a href=\"https:\/\/docs.aws.amazon.com\/dms\/latest\/userguide\/CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Selections.html\"><strong>selection rules<\/strong><\/a> tab, you can add the rules that will be applied during the migration process. You must enter the database schema name in the <strong>Source name<\/strong> field &amp; the table name in the <strong>Source table name<\/strong> field. In the <strong>Action<\/strong> field, you can select either <strong>Include<\/strong> or <strong>Exclude<\/strong>, which, as the names suggest, will include or exclude the table from the migration, respectively.<\/p>\n\n\n\n<p>We have added <strong><em>General_Database<\/em><\/strong> as our source schema &amp; <strong><em>General_Api_Log_Table<\/em><\/strong> as our source table. We have set the Action to <strong>Include<\/strong>, meaning DMS will only include the <strong><em>General_Api_Log_Table<\/em><\/strong> table during the migration.<\/p>\n\n\n\n<p>Furthermore, you can add <strong>source filters<\/strong> on columns of a table.&nbsp;<\/p>\n\n\n\n<p>Here, we have added a filter on the <strong><em>request_timestamp<\/em><\/strong> column &amp; the condition is <strong>Less than or equal to<\/strong>, which means only the rows with <strong><em>request_timestamp &lt;= 2022-09-22<\/em><\/strong> will be migrated.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"697\" height=\"715\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image15.jpg\" alt=\"Selection rules\" class=\"wp-image-5070\"\/><\/figure>\n\n\n\n<p>After creating <strong>at least one selection rule<\/strong>, you can add <a href=\"https:\/\/docs.aws.amazon.com\/dms\/latest\/userguide\/CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.html\" target=\"_blank\" rel=\"noreferrer noopener\">transformation rules<\/a> to the task so the data selected 
based on the selection filter will be transformed according to the rules provided.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"775\" height=\"590\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image16.jpg\" alt=\"Transformation rules\" class=\"wp-image-5071\"\/><\/figure>\n\n\n\n<p>You can also enable <a href=\"https:\/\/docs.aws.amazon.com\/dms\/latest\/userguide\/CHAP_Tasks.AssessmentReport.html\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>premigration assessments<\/strong><\/a>, which will run before DMS starts the migration. They are helpful for finding any schema or data-related issues.<\/p>\n\n\n\n<p>You can also add <strong>Tags<\/strong> if you want to associate any metadata with the DMS task.<\/p>\n\n\n\n<p>As we have selected the premigration assessment, we will manually run the DMS task after the assessments have run successfully.<\/p>\n\n\n\n<p>After all the above configurations are done, click the <strong>Create Task<\/strong> button. 
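Under the hood, the selection rule and source filter configured above are expressed in DMS's JSON table-mapping format, and the whole task can also be created via the API. Below is a sketch with placeholder ARNs; the rule mirrors the General_Api_Log_Table example, and the "less than or equal to" condition is assumed to use DMS's "ste" filter operator as described in its source-filter documentation.

```python
# Sketch only: the console's selection rule + source filter expressed in DMS's
# JSON table-mapping format; the endpoint/instance ARNs are placeholders.
import json

table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "archive-api-log-table",
            "object-locator": {
                "schema-name": "General_Database",
                "table-name": "General_Api_Log_Table",
            },
            "rule-action": "include",
            "filters": [
                {
                    "filter-type": "source",
                    "column-name": "request_timestamp",
                    # "ste" = "smaller than or equal to" in DMS filter syntax
                    "filter-conditions": [
                        {"filter-operator": "ste", "value": "2022-09-22"}
                    ],
                }
            ],
        }
    ]
}

task_params = {
    "ReplicationTaskIdentifier": "archive-api-log-table",  # illustrative name
    "SourceEndpointArn": "<source-endpoint-arn>",
    "TargetEndpointArn": "<target-endpoint-arn>",
    "ReplicationInstanceArn": "<replication-instance-arn>",
    "MigrationType": "full-load",  # "Migrate existing data (Full load only)"
    "TableMappings": json.dumps(table_mappings),
}
# With boto3: boto3.client("dms").create_replication_task(**task_params)
print(task_params["TableMappings"])
```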
It will take some time to create the task &amp; you will receive a notification on the AWS console after it is created successfully.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"595\" height=\"905\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image17.jpg\" alt=\"Pre-migration assessment\" class=\"wp-image-5072\"\/><\/figure>\n\n\n\n<p>After the task is created successfully, it will be visible on the dashboard.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1851\" height=\"711\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image18.jpg\" alt=\"dms archival\" class=\"wp-image-5073\"\/><\/figure>\n\n\n\n<ol start=\"5\"><li><strong>Executing a database migration task<\/strong><\/li><\/ol>\n\n\n\n<p>Firstly, after the task is created, you should run the premigration assessments &amp; the following result will be displayed upon successful completion:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1850\" height=\"566\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image19.jpg\" alt=\"executing database migration\" class=\"wp-image-5074\"\/><\/figure>\n\n\n\n<p>This step is required because we selected the [<strong>Manually later<\/strong>] option while creating the DMS task. If the [<strong>Automatically on create<\/strong>] option had been selected, the task would have started automatically after creation.&nbsp;<\/p>\n\n\n\n<p>So to execute the DMS task, click the <strong>Actions<\/strong> dropdown and select the <strong>Restart\/Resume<\/strong> option.<\/p>\n\n\n\n<p>Upon clicking it, you will receive a warning message: if you have any data in the S3 bucket, you will have to move it; otherwise, that data will be overwritten. 
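Starting or resuming the task is also a single API call, which is what the Lambda-based automation mentioned earlier reduces to. A minimal sketch, with a placeholder task ARN:

```python
# Sketch only: parameters for starting/resuming a DMS task, e.g. from a
# scheduled Lambda function; the task ARN is a placeholder.
def start_task_params(task_arn: str, first_run: bool) -> dict:
    return {
        "ReplicationTaskArn": task_arn,
        # "start-replication" is for the first run; subsequent runs use
        # "resume-processing" or "reload-target" (the console's Restart/Resume)
        "StartReplicationTaskType": (
            "start-replication" if first_run else "resume-processing"
        ),
    }


# Inside a Lambda handler, with boto3:
#   def handler(event, context):
#       boto3.client("dms").start_replication_task(
#           **start_task_params("<replication-task-arn>", first_run=False)
#       )
print(start_task_params("<replication-task-arn>", first_run=True))
```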
After careful verification, you can click on the <strong>Restart\/Resume<\/strong> button.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"602\" height=\"203\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image20.png\" alt=\"warning message \" class=\"wp-image-5075\"\/><\/figure>\n\n\n\n<p>You will be able to see the status as <strong>Starting<\/strong>. After some time, the status will be updated to <strong>Running<\/strong>.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1565\" height=\"183\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image21.jpg\" alt=\"dms archival 2\" class=\"wp-image-5076\"\/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1423\" height=\"186\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image22.jpg\" alt=\"result of dms archival\" class=\"wp-image-5077\"\/><\/figure>\n\n\n\n<p>After some time, the DMS archival task will complete, the status will be updated to <strong>Load complete<\/strong>, &amp; you will be able to see the number of rows it has archived. 
Here, it shows <strong><em>441,477<\/em><\/strong> rows.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1582\" height=\"581\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image23.jpg\" alt=\"Load \" class=\"wp-image-5078\"\/><\/figure>\n\n\n\n<p>As we chose an Amazon S3 bucket as the target endpoint, we will be able to see the archived data in CSV format, which ends the data archival process.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1794\" height=\"439\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image24.jpg\" alt=\"database archives \" class=\"wp-image-5079\"\/><\/figure>\n\n\n\n<p><strong>Next steps&nbsp;<\/strong><\/p>\n\n\n\n<p>This archival process can be automated using <a href=\"https:\/\/aws.amazon.com\/lambda\/\">AWS Lambda<\/a> functions, which can update the mapping rules for the tables, &amp; <a href=\"https:\/\/aws.amazon.com\/eventbridge\/\">Amazon EventBridge<\/a>, which can run the Lambda functions at regular intervals based on the rules that you provide.&nbsp;<\/p>\n\n\n\n<p>If you want to query the archived data, you can use <a href=\"https:\/\/aws.amazon.com\/glue\/\">AWS Glue<\/a>, which will automatically create a database &amp; the corresponding table schemas based on the CSV files. 
It can be further integrated with <a href=\"https:\/\/aws.amazon.com\/athena\/\">Amazon Athena<\/a>, giving you a platform to write SQL queries on top of it.<\/p>\n\n\n\n<p>The diagram below shows the workflow we use at Decentro for data archival.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"837\" height=\"604\" src=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/image25.png\" alt=\"Decentro's workflow for DMS data archival\" class=\"wp-image-5080\"\/><\/figure>\n\n\n\n<ul><li>Using this workflow, we archive the data from our tables automatically; the process runs every week.&nbsp;<\/li><li>It has helped clean out historical data from our production database that had been lying there for a long time, consuming space unnecessarily.&nbsp;<\/li><li>It has helped us save on storage costs, &amp; there has been a significant reduction in query times for our dashboards &amp; analytics.<\/li><li>On top of that, we have integrated Decentro\u2019s internal BI and data visualisation service with Amazon Athena so that our team can query the historical data if needed.<\/li><\/ul>\n\n\n\n<h2><span class=\"ez-toc-section\" id=\"Conclusion\"><\/span><strong>Conclusion<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p>Efficient data archival is essential for organisations to optimise storage costs, meet regulatory compliance requirements, and preserve valuable historical information.&nbsp;<\/p>\n\n\n\n<p>AWS Database Migration Service (DMS) simplifies the process of archiving data to AWS storage services, offering seamless integration, continuous data replication, data transformation capabilities, and scalability. By leveraging AWS DMS for data archival, organisations can achieve cost-effective, secure, and scalable storage solutions while ensuring the accessibility and integrity of their archived data. 
A use case we were able to witness first-hand at Decentro.&nbsp;<\/p>\n\n\n\n<p>We have also collated an exhaustive set of <a href=\"https:\/\/decentro.tech\/blog\/engineering-and-apis\/\" target=\"_blank\" rel=\"noreferrer noopener\">blogs<\/a> on the tools we have used in the past for an effective development lifecycle. Feel free to check it out. We have previously covered topics like <a href=\"https:\/\/decentro.tech\/blog\/jspdf\/\" target=\"_blank\" rel=\"noreferrer noopener\">JsPDF<\/a>, <a href=\"https:\/\/decentro.tech\/blog\/locust-load-testing\/\" target=\"_blank\" rel=\"noreferrer noopener\">Locust<\/a>, and much more.&nbsp;<\/p>\n\n\n\n<p><a class=\"decentro-homepage-signup\" href=\"https:\/\/decentro.tech\/signup?\" target=\"_blank\" rel=\"noreferrer noopener\">Let&#8217;s Connect<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>What would you do with the large amounts of data lying in your production database, using up storage &#038; slowing down your queries? Data archival will come to the rescue.<\/p>\n","protected":false},"author":22,"featured_media":5083,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[23],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v15.7 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Data Archival For Fintechs: A Step-By-Step Guide with AWS DMS - Decentro<\/title>\n<meta name=\"description\" content=\"Optimize storage &amp; boost performance of your Fintech product with AWS DMS. Learn step-by-step data archival, meet compliance, &amp; cut costs. 
Read more on our in-depth blog.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Data Archival For Fintechs: A Step-By-Step Guide with AWS DMS - Decentro\" \/>\n<meta property=\"og:description\" content=\"Optimize storage &amp; boost performance of your Fintech product with AWS DMS. Learn step-by-step data archival, meet compliance, &amp; cut costs. Read more on our in-depth blog.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/\" \/>\n<meta property=\"og:site_name\" content=\"Decentro\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/decentrotech\/\" \/>\n<meta property=\"article:published_time\" content=\"2023-08-03T11:39:36+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-11-13T11:01:16+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/Data_archival_title.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1401\" \/>\n\t<meta property=\"og:image:height\" content=\"702\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DecentroTech\" \/>\n<meta name=\"twitter:site\" content=\"@DecentroTech\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\">\n\t<meta name=\"twitter:data1\" content=\"17 minutes\">\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/decentro.tech\/blog\/#website\",\"url\":\"https:\/\/decentro.tech\/blog\/\",\"name\":\"Decentro\",\"description\":\"API platform for banking integrations\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":\"https:\/\/decentro.tech\/blog\/?s={search_term_string}\",\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/decentro.tech\/blog\/wp-content\/uploads\/Data_archival_title.png\",\"width\":1401,\"height\":702},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/#webpage\",\"url\":\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/\",\"name\":\"Data Archival For Fintechs: A Step-By-Step Guide with AWS DMS - Decentro\",\"isPartOf\":{\"@id\":\"https:\/\/decentro.tech\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/#primaryimage\"},\"datePublished\":\"2023-08-03T11:39:36+00:00\",\"dateModified\":\"2025-11-13T11:01:16+00:00\",\"author\":{\"@id\":\"https:\/\/decentro.tech\/blog\/#\/schema\/person\/7fc11cb1c0ca61c5aade838eab293838\"},\"description\":\"Optimize storage & boost performance of your Fintech product with AWS DMS. Learn step-by-step data archival, meet compliance, & cut costs. 
Read more on our in-depth blog.\",\"breadcrumb\":{\"@id\":\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"item\":{\"@type\":\"WebPage\",\"@id\":\"https:\/\/decentro.tech\/blog\/\",\"url\":\"https:\/\/decentro.tech\/blog\/\",\"name\":\"Blog\"}},{\"@type\":\"ListItem\",\"position\":2,\"item\":{\"@type\":\"WebPage\",\"@id\":\"https:\/\/decentro.tech\/blog\/engineering-and-apis\/\",\"url\":\"https:\/\/decentro.tech\/blog\/engineering-and-apis\/\",\"name\":\"Engineering &amp; APIs\"}},{\"@type\":\"ListItem\",\"position\":3,\"item\":{\"@type\":\"WebPage\",\"@id\":\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/\",\"url\":\"https:\/\/decentro.tech\/blog\/data-archival-aws-dms-guide\/\",\"name\":\"Data Archival For Fintechs: A Step-By-Step Guide with AWS DMS\"}}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/decentro.tech\/blog\/#\/schema\/person\/7fc11cb1c0ca61c5aade838eab293838\",\"name\":\"Vivek Sheth\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/decentro.tech\/blog\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/66d2aded4bdc8710633625505551e0d4?s=96&d=mm&r=g\",\"caption\":\"Vivek Sheth\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","_links":{"self":[{"href":"https:\/\/decentro.tech\/blog\/wp-json\/wp\/v2\/posts\/5055"}],"collection":[{"href":"https:\/\/decentro.tech\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/decentro.tech\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/decentro.tech\/blog\/wp-json\/wp\/v2\/users\/22"}],"replies":[{"embeddable":true,"href":"https:\/\/decentro.tech\/blog\/wp-json\/wp\/v2\/comments?post=5055"}],"version-history":[{"count":8,"href":"https:\/\/decentro.tech\/blog\/wp-json\/wp\/v2\/posts\/5055\/revisions"}],"predecessor-version":[{"id":9160,"href":"https:\/\/decentro.tech\/blog\/wp-json\/wp\/v2\/posts\/5055\/revisions\/9160"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/decentro.tech\/blog\/wp-json\/wp\/v2\/media\/5083"}],"wp:attachment":[{"href":"https:\/\/decentro.tech\/blog\/wp-json\/wp\/v2\/media?parent=5055"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/decentro.tech\/blog\/wp-json\/wp\/v2\/categories?post=5055"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/decentro.tech\/blog\/wp-json\/wp\/v2\/tags?post=5055"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}