Master AWS Lambda Functions for Data Engineers using Python
Build Lambda functions using Python, set up Lambda triggers, deploy using layers and Docker, and validate using Glue and Athena.
Course Description
Do you want to learn AWS Lambda functions by building an end-to-end data pipeline using Python, the Boto3 SDK, and key AWS services such as S3, DynamoDB, ECR, CloudWatch, Glue Catalog, and Athena? In this course you will learn AWS Lambda functions by implementing an end-to-end pipeline that uses all of the services mentioned.
As part of this course, you will learn how to develop and deploy Lambda functions using zip files, custom Docker images, and layers. You will also understand how to trigger Lambda functions from EventBridge as well as Step Functions.
- Set up the required tools on Windows to develop the code for ETL data pipelines using Python and AWS services. You will set up Ubuntu using WSL, Docker Desktop, and Visual Studio Code along with the Remote Development extension pack so that you can develop Python-based applications using AWS services.
- Set up a project or development environment to develop applications using Python and AWS services on Windows and Mac.
- Get started with AWS by creating an AWS account, configuring the AWS CLI, and reviewing the data sets used for the project.
- Develop the core logic to ingest data from the source into AWS S3 using Python and Boto3. The application uses Boto3 to interact with AWS services, Pandas for date arithmetic, and requests to get the files from the source via a REST API.
- Get started with AWS Lambda functions using the Python 3.9 runtime.
- Refactor the application and build a zip file to deploy it as an AWS Lambda function. The application logic includes capturing bookmarks as well as job run details in DynamoDB. You will also get an overview of DynamoDB and how to interact with it to manage bookmark and job run details.
- Create an AWS Lambda function from the zip file, deploy it using the AWS Console, and validate it.
- Troubleshoot issues related to AWS Lambda functions using AWS CloudWatch.
- Build a custom Docker image for the application and push it to AWS ECR.
- Create an AWS Lambda function using the custom Docker image in AWS ECR and then validate it.
- Understand AWS S3 Event Notifications, that is, S3-based triggers for Lambda functions.
- Develop another Python application to transform the data and write it to S3 in Parquet format. The application is built using Pandas and converts 10,000 records at a time to Parquet.
- Build an orchestrated pipeline between the two Lambda functions using AWS S3 Event Notifications.
- Schedule the first Lambda function using AWS EventBridge and then validate it.
- Finally, create an AWS Glue Catalog table on the S3 location that holds the Parquet files, and validate it by running SQL queries using AWS Athena.
- After going through the complete life cycle of deploying and scheduling Lambda functions and validating the data using Glue Catalog and AWS Athena, you will also understand how to use layers for Lambda functions.
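The transformation step in the list above converts 10,000 records at a time. As a rough illustration only, and not the course's actual code, that batching could be sketched with the standard library as follows. The function names, the `prefix` argument, and the `part-00000.parquet` naming pattern are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def batched(records: Iterable[dict], batch_size: int = 10_000) -> Iterator[List[dict]]:
    """Yield successive lists holding at most batch_size records each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        # islice pulls the next batch_size items without loading everything.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def plan_parquet_parts(
    records: Iterable[dict], prefix: str, batch_size: int = 10_000
) -> Iterator[Tuple[str, List[dict]]]:
    """Pair each batch with a hypothetical S3 object key for its Parquet part."""
    for index, batch in enumerate(batched(records, batch_size)):
        # The key pattern is an assumption made for illustration.
        yield f"{prefix}/part-{index:05d}.parquet", batch
```

Each batch could then be turned into a Pandas DataFrame and written with `DataFrame.to_parquet`, which needs a Parquet engine such as pyarrow to be installed. Keeping batches small helps the function stay within the memory configured for the Lambda function.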
Here are the key takeaways from this training:
- Develop Python applications and deploy them as Lambda functions using a zip-based bundle as well as a custom Docker image.
- Monitor and troubleshoot issues by going through CloudWatch logs.
- Get the entire application code used for the demo, along with the notebook used to come up with the core logic.
- Gain the ability to build solutions using Boto3 and multiple AWS services such as S3, DynamoDB, ECR, CloudWatch, Glue Catalog, and Athena.
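To make the first two takeaways concrete, here is a minimal sketch of a Python Lambda handler that reacts to an S3 event notification and logs what it received. It is an illustration under assumptions, not the course's code. The handler name `lambda_handler` and the helper `extract_s3_objects` are arbitrary choices, and the returned dictionary is a made-up summary.

```python
import logging
from urllib.parse import unquote_plus

# Output from the logging module in a Lambda function ends up in CloudWatch Logs.
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def extract_s3_objects(event: dict) -> list:
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3")
        if s3_info is None:
            continue  # Skip records that did not come from S3.
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded, with spaces sent as '+'.
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((bucket, key))
    return objects


def lambda_handler(event, context):
    """Entry point: log each object and return a small, hypothetical summary."""
    objects = extract_s3_objects(event)
    for bucket, key in objects:
        logger.info("Received s3://%s/%s", bucket, key)
    return {"processed": len(objects)}
```

The same handler can be packaged as a zip file or baked into a custom Docker image. Only the packaging differs, not the function signature.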