This immersion day required each student to have their own account. If you are sharing single account with multiple students by creating a multiple IAM users, Account can hit following default service limit:
The Database Migration Services (DMS) hands-on lab provide a scenario, where participant learns to hydrate Amazon S3 data lake with a relation database. To achieve that, participants need a source endpoint and this guide helps instructors set up a PostgreSQL database with public endpoint as the source database.
In this lab, you will complete the following tasks using AWS CloudFormation template:
Relevant information about this lab:
Instructor will provide source database details to participants during main lab to configure source endpoint.
In this section, you are going to create a PostgreSQL RDS instance as data source for AWS Data Migration Service to consume by lab attendees for data migration to Amazon S3 data lake.
Make sure you select the us-east-1 (N. Virginia) region
Sign in to the Console where you will host the source database environment.
Navigate to the AWS CloudFormation page.
Click the Deploy to AWS icons below to stand up the DMS pre-lab 1 workshop infrastructure.
or perform step below 3 manually
Launch a new stack with the AWS CloudFormation template DMSLab_instructor_CFN.json provided with your lab package. Make sure to select us-east-1 (Virginia) region.
e. In Specify stack details, provide a name for Stack Name as “dmslab-instructor”.
f. Click on Next. g. In review page, review all the details, scroll down and check the box to acknowledge the policy and then click on Create Stack.
h. Launch the stack. It may take 15 minutes for the stack to launch. This stack creates a new VPC, Subnets, Security groups, EC2 instance, Route table, Routes, and an RDS Postgres instance.
Please make sure the Postgres database is fully populated before proceed with the DMS lab. It takes 15 to 20 minutes to finish, after the stack is launched.
You can see all resources listed below:
i. Go to the Outputs tabs of AWS CloudFormation stack and note down the instance Endpoint information for your RDS endpoint, which will be similar to information shown in below screenshot
Currently your RDS source end point is not open to connect to outside world for security reason. You need to open RDS security group to accept traffic from intended range of IP address. As it is difficult to determine range of IP address of workshop environment, so to have smooth experience of running lab you can temporally allow inbound traffic from all IP address (0.0.0.0/0 CIDR range).
Warning: It is not best practice to allow ALL CIDR range in your database security group. You should never apply open to all IP CIDR range while working on actual workload.
Follow below steps to open security group for students to connect with source RDS data base for DMS full data and CDC data dump:
Click VPC security groups under Connectivity & security tab as shown below:
In Security group screen, Go to Inbound tab and click on Edit as shown below
Update Inbound rule to “Anywhere” from hard coded value “188.8.131.52/32” , as shown in below screen. You can also update to your own IP address if want running both lab in same
Note: Make sure to remove “Anywhere” inbound rule from security group as soon as you are done with DMS main lab. Optionally, You can read though the documentation to better understand the source database environment. The GitHub repository for aws-database-migration-samples is located here: https://github.com/aws-samples/aws-database-migration-samples/tree/master/PostgreSQL/sampledb/v1
You can follow below instruction to setup SQL Workbench to access your Postgres Database from SQL client: https://aws.amazon.com/getting-started/tutorials/create-connect-postgresql-db/
In SQL Workbench:
Run following query to find out all Schema and table created.
SELECT * FROM pg_catalog.pg_tables;
Use following query to analyze a table
select * from schemaname.tablename;
select * from dms_sample.player;
Following sections are optional, you only need to execute if you want to show change data capture replication with DMS.
When you want to generate transactions to demonstrate DMS CDC (Change Data Capture) functionality, navigate to Lambda console and you will see a pre-built Lambda function named “GenerateCDCData”.
Click on the function and scroll down. You will see the code for this function. Copy the below query and paste it in the placeholder (value) of this code line:
“ var query_cmd= ”
Run this query first: select dms_sample.generateticketactivity(10);
This query will generate 10 ticket sales in batches of 1-6 tickets to randomly selected people for a random price (within a range.) A record of each transaction is recorded in the ticket_purchase_hist table. 3. Click on Save and then click on Test to run the function. You can create an empty event as shown here:
You will see no error in lambda log
Run this query next in the lambda function: select dms_sample.generatetransferactivity(10);
Click on Save and then click on Test to run the function.
When enabling CDC functionality in DMS, only one DMS instance/task should activate “Ongoing replication” to avoid conflicts. When replicating to multiple targets, the processing to fan out the updates should begin with the Amazon S3 bucket, that is the target of the DMS task responsible for Ongoing replication. The process should not begin with the source database, as only one CDC process should be tracking and setting the last committed transaction that was replicated.