This is a no-nonsense guide to help you prepare for the AWS Certified DevOps Engineer Professional exam. I've kept it short and to the point so you don't waste time and can focus on the services that are actually covered in the exam.
As you might know, taking exams requires patience and focus.
Even when you have enough experience and knowledge, you can still easily fail the exam if you don't concentrate and read the questions thoroughly.
There are a couple of tactics that you can apply for the exam before you even dive into the content.
So let’s get started! Here are the detailed steps to help you pass the AWS Certified DevOps Engineer Professional (DOP-C02) exam.
Who should take the AWS Certified DevOps Engineer – Professional (DOP-C02) exam?
According to AWS, they recommend you have the following prerequisites prior to taking the exam:
- Two or more years of experience provisioning, operating, and managing AWS environments
- Good working knowledge of AWS core services
- Experience working with a programming or scripting language e.g. Bash & Python
- Familiarity with Linux or Windows operating systems
- Familiarity with the AWS CLI
A step-by-step guide on how to prepare for the AWS DevOps Engineer – Professional exam
Exam overview
This is what you can expect when you schedule the AWS Certified DevOps Engineer Professional exam:
- Consists of 75 questions, which are either multiple choice or multiple response.
- The exam needs to be completed within 180 min (Note: follow this advice to permanently receive 30 minutes extra time for your AWS exams)
- Costs $300 USD
- The minimum passing score is 750 points
- The exam is available in English, Japanese, Korean, and Simplified Chinese.
Content outline
The content outline of the AWS Certified DevOps Engineer Professional exam consists of 6 separate domains, each with its own weighting.
The table below lists the domains with their weightings:
Domain | % of exam |
---|---|
Domain 1: SDLC Automation | 22% |
Domain 2: Configuration Management and IaC | 17% |
Domain 3: Resilient Cloud Solutions | 15% |
Domain 4: Monitoring and Logging | 15% |
Domain 5: Incident and Event Response | 14% |
Domain 6: Security and Compliance | 17% |
Total | 100% |
(DOP-C02) content outline table
In the official exam guide, you'll find the nitty-gritty details of what you're expected to know within each domain.
Technical Preparation notes
In this section, I’ve bundled up my notes which you can use when you’re preparing for the AWS Certified DevOps Engineer Professional exam.
Prior to this blog post, I released a guide to help you prepare for the AWS Certified Developer – Associate exam.
That exam, along with the AWS Certified SysOps Administrator – Associate, is a recommended stepping stone towards this one, so I'd highly suggest you give that guide a thorough read.
Moving on to the preparation: I've written technical notes that highlight the important details worth remembering for the exam, organized into the same domains as the content outline.
This makes studying easier, because AWS practice exams report how you scored on each domain, so you can see which domains need more attention while preparing.
After reading this exam guide I would definitely recommend watching the Exam Readiness: AWS Certified DevOps Engineer – Professional video from the AWS Training portal. Most of the tips and guidelines I wrote down came from that training.
In addition to that, I’ve added more context and also written down some key tips that are worth knowing for the exam.
Domain 1: SDLC Automation
AWS CodeBuild
- A fully managed build service: Build your application from sources like AWS CodeCommit, S3, Bitbucket, and GitHub
- Build and test code: Debugging locally with an AWS CodeBuild agent is possible
- To configure build steps, you create a buildspec.yml file in the source code of your repository.
This is what a typical AWS CodeBuild buildspec.yml looks like:
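The sketch below assumes a simple Node.js project; the runtime version, commands, and artifact paths are illustrative:

```yaml
# Hypothetical buildspec.yml for a Node.js project
version: 0.2

phases:
  install:
    runtime-versions:
      nodejs: 18
  pre_build:
    commands:
      - npm ci                # install dependencies
  build:
    commands:
      - npm run build         # compile/bundle the application
      - npm test              # run the unit tests

artifacts:
  files:
    - '**/*'
  base-directory: dist        # hypothetical build output directory
```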
AWS CodeDeploy
- Minimizes downtime because of a controlled deployment strategy
- Centralized control
- Iteratively release new features
- Two deployment types: in-place and blue/green (rolling behavior is controlled through the deployment configuration)
- Three predefined deployment configurations: OneAtATime, HalfAtATime, AllAtOnce
- Ability to install the CodeDeploy agent on EC2 instances and on-premises servers to run deployments
- To specify what commands you want to run for each phase of the deployment you use an AppSpec configuration file.
This is what a typical AWS CodeDeploy appspec.yml looks like:
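The sketch below assumes an EC2/on-premises deployment; the destination path and hook scripts are hypothetical:

```yaml
# Hypothetical appspec.yml for an EC2/on-premises deployment
version: 0.0
os: linux
files:
  - source: /
    destination: /var/www/my-app            # hypothetical install path
hooks:
  BeforeInstall:
    - location: scripts/stop_server.sh      # hypothetical scripts shipped with the revision
      timeout: 300
      runas: root
  AfterInstall:
    - location: scripts/install_dependencies.sh
      timeout: 300
  ApplicationStart:
    - location: scripts/start_server.sh
      timeout: 300
      runas: root
  ValidateService:
    - location: scripts/health_check.sh
      timeout: 300
```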
AWS CodePipeline
- A standardized solution that adds consistency, taking source code through build/test and deployment in one flow
- Gives you the ability to add a manual approval step
- Pipeline actions look like this:
- Source: CodeCommit, S3, GitHub
- Build & Test: CodeBuild, Jenkins, TeamCity
- Deploy: AWS CodeDeploy / AWS CloudFormation / AWS Elastic Beanstalk / AWS OpsWorks
- Invoke: Specify a custom function to invoke e.g. AWS Lambda
- Approval: Publish to an SNS topic for manual approval
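To make that structure concrete, here's a hedged CloudFormation sketch of a small pipeline with a source, a manual approval, and a deploy stage; the role, bucket, repository, and application names are all hypothetical:

```yaml
Resources:
  AppPipeline:
    Type: AWS::CodePipeline::Pipeline
    Properties:
      RoleArn: arn:aws:iam::123456789012:role/my-pipeline-role   # hypothetical service role
      ArtifactStore:
        Type: S3
        Location: my-pipeline-artifact-bucket                    # hypothetical bucket
      Stages:
        - Name: Source
          Actions:
            - Name: FetchSource
              ActionTypeId: { Category: Source, Owner: AWS, Provider: CodeCommit, Version: "1" }
              Configuration:
                RepositoryName: my-repo                          # hypothetical repository
                BranchName: main
              OutputArtifacts:
                - Name: SourceOutput
        - Name: Approve
          Actions:
            - Name: ManualApproval
              ActionTypeId: { Category: Approval, Owner: AWS, Provider: Manual, Version: "1" }
        - Name: Deploy
          Actions:
            - Name: DeployToEc2
              ActionTypeId: { Category: Deploy, Owner: AWS, Provider: CodeDeploy, Version: "1" }
              Configuration:
                ApplicationName: my-app                          # hypothetical CodeDeploy application
                DeploymentGroupName: my-deployment-group
              InputArtifacts:
                - Name: SourceOutput
```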
Deployment strategies
For services like AWS CodeDeploy, AWS CloudFormation, AWS Elastic Beanstalk, and AWS OpsWorks you can apply several deployment strategies. Each has its pros and cons.
The cheat sheet below shows the types of deployments and how they rank on impact, deployment time, zero downtime, rollback process, and deploy target.
Domain 2: Configuration Management and IaC
For the second domain, it’s important to know the following for the AWS Certified DevOps Engineer exam:
- Know the functions of AWS CloudFormation in depth
- Know when and how to use AWS CloudFormation, AWS Elastic Beanstalk, and AWS OpsWorks
- Understand how to deliver Docker container images into Amazon ECS using CI/CD pipelines
- When a question is about routing "portions of users" to the application, think of Route 53 (weighted routing)
- If there is a question related to compliance or configuration management of AWS resources, the answer is most likely AWS Config
AWS CloudFormation
- Infrastructure as code, with templates written in YAML or JSON format
- Version control/replicate/update templates like code
- Integrated with CI/CD tools
- Run automated testing for CI/CD by creating and disposing of test environments
AWS CloudFormation template anatomy:
AWSTemplateFormatVersion: "version date"
Description:
String
Metadata:
template metadata
Parameters:
set of parameters
Rules:
set of rules
Mappings:
set of mappings
Conditions:
set of conditions
Transform:
set of transforms
Resources:
set of resources
Outputs:
set of outputs
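To illustrate a few of these sections working together, here's a minimal sketch (the parameter, condition, and bucket are made up for the example):

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Minimal example template

Parameters:
  Environment:
    Type: String
    AllowedValues: [dev, prod]
    Default: dev

Conditions:
  IsProd: !Equals [!Ref Environment, prod]

Resources:
  LogBucket:
    Type: AWS::S3::Bucket
    Properties:
      # Enable versioning only in production, to illustrate a condition
      VersioningConfiguration:
        Status: !If [IsProd, Enabled, Suspended]

Outputs:
  LogBucketName:
    Description: Name of the created bucket
    Value: !Ref LogBucket
```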
Here’s an overview of the different types of CloudFormation stack updates:
Type of CloudFormation stack update | Description |
---|---|
Update with no interruption | No disruption in operation and without changing the physical name. |
Update with some interruption | Some disruption without changing the physical name |
Replacement | The resource is recreated and a new physical ID is generated |
Here’s an overview of the available AWS Cloudformation helper scripts:
Helper scripts | Description |
---|---|
cfn-init | Reads and applies resource metadata (AWS::CloudFormation::Init), typically called once from user data |
cfn-hup | Monitors resource metadata and applies changes when they are detected |
cfn-signal | Provides completion signal of a CreationPolicy or WaitCondition |
cfn-get-metadata | View the metadata that is stored in a CloudFormation stack |
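In practice, cfn-init and cfn-signal are combined with a CreationPolicy in an instance's user data. A minimal sketch (the AMI ID and package are illustrative):

```yaml
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    CreationPolicy:
      ResourceSignal:
        Timeout: PT15M                  # wait up to 15 minutes for a success signal
    Metadata:
      AWS::CloudFormation::Init:
        config:
          packages:
            yum:
              httpd: []                 # illustrative package
          services:
            sysvinit:
              httpd:
                enabled: true
                ensureRunning: true
    Properties:
      ImageId: ami-0123456789abcdef0    # hypothetical AMI ID
      InstanceType: t3.micro
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash -xe
          # Apply the AWS::CloudFormation::Init metadata defined above
          /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource WebServer --region ${AWS::Region}
          # Report success or failure back to the CreationPolicy
          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource WebServer --region ${AWS::Region}
```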
Here’s an overview of the AWS CloudFormation Template resource attributes:
CloudFormation resource attribute | Description |
---|---|
CreationPolicy attribute | Define a period of time during which AWS CloudFormation waits for a signal before marking the resource as CREATE_COMPLETE. Useful when you want the resource to finish configuring before the next resource is deployed, e.g. software installation on an EC2 instance. |
DeletionPolicy attribute | Preserve a backup of a resource when its stack is deleted; you can specify the options Retain or Snapshot. The default behavior (no deletion policy) is to delete the resource. |
DependsOn attribute | Create an explicit dependency that requires a specified resource to be created before another can begin. |
Metadata attribute | Associate structured data with a resource. |
UpdatePolicy attribute | Define how CloudFormation updates the AWS::AutoScaling::AutoScalingGroup resource. |
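A short sketch of how a few of these attributes look on resources; the referenced VPC, gateway, AMI, and password parameter are hypothetical and the properties are trimmed for brevity:

```yaml
Resources:
  GatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref AppVpc                        # hypothetical VPC defined elsewhere
      InternetGatewayId: !Ref InternetGateway   # hypothetical internet gateway
  PublicInstance:
    Type: AWS::EC2::Instance
    DependsOn: GatewayAttachment                # explicit dependency: attach the gateway before the instance
    Properties:
      ImageId: ami-0123456789abcdef0            # hypothetical AMI ID
      InstanceType: t3.micro
  Database:
    Type: AWS::RDS::DBInstance
    DeletionPolicy: Snapshot                    # keep a final snapshot when the stack is deleted
    Properties:
      Engine: mysql
      DBInstanceClass: db.t3.micro
      AllocatedStorage: "20"
      MasterUsername: admin
      MasterUserPassword: !Ref DatabasePassword # hypothetical NoEcho parameter
```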
Domain 3: Resilient Cloud Solutions
You should know the following for the exam:
- Know when to use Multi-AZ vs Multi-region architectures
- Know how to implement HA, scalability, and fault tolerance
- Know the right services based on business requirements e.g. RTO, RPO, and costs
- Know how to design and automate disaster recovery strategies
- Evaluate a deployment for points of failure
Amazon RDS – High availability
- Cross-region snapshot copies are good for high availability and failover scenarios
- Read replicas are important for quicker cross-region failover scenarios
- Replication to read replicas is asynchronous
- Direct queries to the read replicas
- Use ElastiCache in front of RDS
- Cache common requests in ElastiCache to offload your RDS instance
DynamoDB – High availability
- Global tables are important to store data across multiple regions
- Reduce response times of eventually consistent read workloads
- For read-heavy or bursty workloads
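For reference, a global table can be declared directly in CloudFormation. A hedged sketch of a two-Region table (the table name and Regions are illustrative, and it assumes the current global tables version, which replicates via streams):

```yaml
Resources:
  OrdersGlobalTable:
    Type: AWS::DynamoDB::GlobalTable
    Properties:
      TableName: orders                        # hypothetical table name
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: orderId
          AttributeType: S
      KeySchema:
        - AttributeName: orderId
          KeyType: HASH
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES     # streams carry the changes between replicas
      Replicas:
        - Region: us-east-1                    # one replica must be in the stack's own Region
        - Region: eu-west-1
```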
Disaster recovery
Understand the concepts of Recovery Point Objective (RPO) and Recovery Time Objective (RTO):
- RPO: How much data can you afford to lose, e.g. the business can recover from losing the last 8 hours of data
- RTO: How quickly must you recover from downtime, e.g. the application can be unavailable for at most 4 hours
There are 4 types of disaster recovery:
- Backup and restore: This is the cheapest method but takes a long time to restore from disaster recovery.
- Pilot light: You replicate part of your infrastructure e.g. VPC and autoscaling groups. Once a disaster happens you scale up the computing resources.
- Warm standby: A scaled-down version of your infrastructure is replicated in the disaster recovery region. You only need to scale up the resources and update the domain to point to the disaster recovery region. This is good if you need RPO and RTO within minutes.
- Hot standby (multi-site active/active): The same as warm standby, except that the disaster recovery infrastructure is a one-to-one replica of the original environment. This is the most expensive disaster recovery method, but it reduces your RPO and RTO to seconds instead of minutes.
Domain 4: Monitoring and Logging
For the fourth domain, it's important to know the following for the AWS Certified DevOps Engineer Professional exam:
- Determine how to set up the aggregation, storage, and analysis of logs and metrics
- Apply concepts required to automate monitoring and event management of an environment
- Apply concepts required to audit, log, and monitor operating systems, infrastructures, and applications
- Determine how to implement tagging to categorize resources and get better insights into the costs
- Know the different logging options and see which is most cost-effective based on requirements
Amazon CloudWatch
- Collect metrics and logs
- Monitor: alarms and dashboards
- Act: auto-scaling and events
- Analyze: trends and metrics
- Compliance and security
Important CloudWatch metrics:
- Metrics are kept for 15 months; the older the data, the less granular it becomes (one-minute data points are kept for 15 days, five-minute data points for 63 days, and one-hour data points for 15 months)
- Know these ELB metrics:
- SurgeQueueLength: Backend systems aren’t able to keep up with the ELB requests
- SpillOverCount: When the above happens, the requests get dropped, hence the SpillOverCount.
- Know these EC2 metrics:
- StatusCheckFailed: Reports whether the instance passed both the instance status check and the system status check in the last minute.
- CPUCreditUsage: The number of CPU credits spent by the instance for CPU utilization. One CPU credit equals one vCPU running at 100% utilization for one minute or an equivalent combination of vCPUs, utilization, and time (for example, one vCPU running at 50% utilization for two minutes or two vCPUs running at 25% utilization for two minutes).
- CPUCreditBalance: The number of earned CPU credits that an instance has accrued since it was launched or started. For T2 Standard, the CPUCreditBalance also includes the number of launch credits that have been accrued.
- Know these HTTP status code metrics:
- HTTPCode_Backend_5xx: Instances or databases might be at capacity, check their metrics to verify
- HTTPCode_ELB_4xx: Client errors generated by the load balancer, e.g. malformed or timed-out requests; check the ELB access logs
- Check latency metrics:
- If latency increases during load testing, your application might not be scaling horizontally, e.g. no auto scaling, a bottlenecked database, or slow calls to external services.
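As an example of acting on one of these metrics, here's a hedged sketch of a CloudWatch alarm that automatically recovers an EC2 instance when its system status check fails (the instance reference is hypothetical):

```yaml
Resources:
  RecoveryAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: Recover the instance when the system status check fails
      Namespace: AWS/EC2
      MetricName: StatusCheckFailed_System
      Dimensions:
        - Name: InstanceId
          Value: !Ref WebServer                # hypothetical EC2 instance in the same template
      Statistic: Maximum
      Period: 60
      EvaluationPeriods: 2
      Threshold: 1
      ComparisonOperator: GreaterThanOrEqualToThreshold
      AlarmActions:
        - !Sub arn:aws:automate:${AWS::Region}:ec2:recover   # built-in EC2 recover action
```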
AWS CloudWatch Logs
- The CloudWatch agent is installed on an instance or container
- Each instance writes log events to a log stream
- Log streams are bundled into a log group
Monitoring with Amazon Kinesis:
- Collect, process, and analyze real-time streaming data (useful for quick incident response)
- Logs are ingested via Kinesis data streams or Firehose
- The analysis is done with Kinesis Data Analytics
- Use Kinesis Data Firehose if you need a fully managed service to transfer data to S3, Redshift, Amazon OpenSearch Service (Elasticsearch), or Splunk.
- Use Kinesis Data Streams if real-time processing of logs is needed. However, it requires more effort to set up and manage.
- A good use case for Kinesis Data Firehose is centralizing CloudWatch log events and moving the data to S3 for longer retention, as sketched below.
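A hedged sketch of that pattern: a subscription filter streams an application log group to a pre-existing Kinesis Data Firehose delivery stream that writes to S3 (the log group, delivery stream, and role are hypothetical):

```yaml
Resources:
  CentralizeLogsFilter:
    Type: AWS::Logs::SubscriptionFilter
    Properties:
      LogGroupName: /my-app/application       # hypothetical log group
      FilterPattern: ""                       # empty pattern forwards every log event
      DestinationArn: arn:aws:firehose:eu-west-1:123456789012:deliverystream/central-logs  # hypothetical delivery stream
      RoleArn: arn:aws:iam::123456789012:role/cwlogs-to-firehose                           # hypothetical role CloudWatch Logs assumes
```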
AWS CloudTrail
- Track user activity and API usage
- Log, continuously monitor, and retain account activity related to actions
AWS CloudTrail best practices:
- Enable CloudTrail in all regions
- Enable log file validation
- Encrypt logs
- Integrate with CloudWatch logs
- Centralize logs from all accounts
- Create additional trails as needed
- Understand how to enable log integrity.
- When you see a question about auditing user actions in AWS or reporting on an API call, the answer most likely contains AWS CloudTrail.
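A hedged CloudFormation sketch that applies several of these best practices in a single trail; the bucket, KMS key, log group, and role are hypothetical and need the appropriate policies:

```yaml
Resources:
  CentralTrail:
    Type: AWS::CloudTrail::Trail
    Properties:
      IsLogging: true
      IsMultiRegionTrail: true                 # capture events in all regions
      IncludeGlobalServiceEvents: true
      EnableLogFileValidation: true            # log file integrity validation
      S3BucketName: my-central-cloudtrail-bucket                                            # hypothetical bucket with a CloudTrail bucket policy
      KMSKeyId: alias/cloudtrail-logs                                                       # hypothetical KMS key
      CloudWatchLogsLogGroupArn: arn:aws:logs:eu-west-1:123456789012:log-group:cloudtrail:* # hypothetical log group
      CloudWatchLogsRoleArn: arn:aws:iam::123456789012:role/cloudtrail-to-cwlogs            # hypothetical role
```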
Domain 5: Incident and Event Response
For this domain, you should know how to troubleshoot issues and know how to restore operations.
It’s also important to know how to automate healing and be able to set up event-driven automated actions including alerting.
Logging strategy
- Use the CloudWatch Logs agent on EC2/ECS to push logs to CloudWatch Logs
- Centralize logging in a separate account, and use Kinesis Data Firehose to move multiple log streams to S3, for example
- Log as much as you can even if you don’t immediately use it
- Keep logs as long as you can and use them for long-term analysis
AWS Elastic Beanstalk – Configuration management
- CloudFormation supports Elastic Beanstalk
- Good for developers who want to run applications without provisioning the underlying infrastructure themselves
- Supports:
- Tomcat for Java
- Apache for PHP or Python apps
- Nginx/Apache for Node.js apps
- Passenger for Ruby apps
- Controlled with AWS Elastic Beanstalk:
- access CloudWatch
- adjust application server settings e.g. JVM and pass environment variables
- Multi-AZ support, but no multi-region support
- Restrict IPs on security groups and ACLs
- By default publicly available
- AWS Elastic Beanstalk supports IAM, VPC, and code is stored in S3
- Multiple environments are allowed
- When deploying from a Git repository, only committed changes are deployed
Auto-scaling – Configuration management
- Make sure you know how lifecycle hooks work when scaling in and out.
- Instances can be put into a wait state by a lifecycle hook. The maximum time an instance can be kept in the wait state is 48 hours (the default heartbeat timeout is 1 hour); a minimal hook definition is sketched at the end of this section.
- There are 7 termination policies:
- Default
- AllocationStrategy
- OldestLaunchTemplate
- OldestLaunchConfiguration
- ClosestToNextInstanceHour
- NewestInstance
- OldestInstance
- Add a CreationPolicy attribute to EC2 instances or Auto Scaling groups that need time to configure or bootstrap themselves, and use the cfn-signal helper script to signal when the instance creation process has completed successfully.
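Tying back to the lifecycle hook bullets above, here's a hedged sketch of a termination hook that pauses instances (for example, to collect logs) before they are shut down; the Auto Scaling group, SNS topic, and role are hypothetical:

```yaml
Resources:
  DrainInstancesHook:
    Type: AWS::AutoScaling::LifecycleHook
    Properties:
      AutoScalingGroupName: !Ref AppAutoScalingGroup       # hypothetical ASG in the same template
      LifecycleTransition: autoscaling:EC2_INSTANCE_TERMINATING
      HeartbeatTimeout: 300                                # keep the instance in Terminating:Wait for 5 minutes
      DefaultResult: CONTINUE                              # proceed with termination if no response arrives
      NotificationTargetARN: arn:aws:sns:eu-west-1:123456789012:drain-notifications  # hypothetical SNS topic
      RoleARN: arn:aws:iam::123456789012:role/asg-lifecycle-notifications            # hypothetical role
```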
Domain 6: Security and Compliance
AWS IAM
- Use IAM roles whenever possible (avoid the usage of IAM users and groups if possible)
- Requirements to set up an IAM role:
- Trust policy: Who can assume this role
- Access (permissions) policy: Which actions and resources the entity assuming the role is allowed to use
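A hedged sketch of both pieces in CloudFormation, for a role that EC2 instances can assume to read objects from a config bucket (the bucket name is hypothetical):

```yaml
Resources:
  AppInstanceRole:
    Type: AWS::IAM::Role
    Properties:
      # Trust policy: who can assume this role
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: ec2.amazonaws.com
            Action: sts:AssumeRole
      # Access permission policy: what the role allows
      Policies:
        - PolicyName: read-app-config
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - s3:GetObject
                Resource: arn:aws:s3:::my-app-config-bucket/*   # hypothetical bucket
  AppInstanceProfile:
    Type: AWS::IAM::InstanceProfile
    Properties:
      Roles:
        - !Ref AppInstanceRole
```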
Data protection at rest
- Security is important: even if a question doesn't focus on security, lean towards the answer with the most security (more encryption is better)
- Here are some highlights of AWS services that support encryption:
- S3 server-side encryption
- EBS encryption happens server-side, on the hosts that run your EC2 instances, using KMS keys
- S3 Glacier encrypts data at rest by default
- EFS encryption at rest uses KMS keys
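As an example of defaulting to the more secure option, here's a hedged sketch of an S3 bucket with default encryption using a customer managed KMS key (the key alias is hypothetical):

```yaml
Resources:
  SecureBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: aws:kms
              KMSMasterKeyID: alias/app-data    # hypothetical customer managed key
```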
GuardDuty
- Protect AWS accounts and workloads
- Monitors your AWS environment for suspicious activity and generates findings
- Allows you to add your own threat list and trusted IP lists
- Analyzes multiple data sources: CloudTrail, VPC flow logs, and DNS logs
AWS Config
- Track resource configuration changes
- Sends notifications or automatically remediates when changes occur
- Enables compliance monitoring and security analysis
- When you see a question about auditing or checking the state of resources, there is a good chance the answer involves AWS Config.
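For example, compliance checks are expressed as Config rules. A hedged sketch of an AWS managed rule that flags unencrypted EBS volumes (it assumes a configuration recorder is already running in the account):

```yaml
Resources:
  EncryptedVolumesRule:
    Type: AWS::Config::ConfigRule
    Properties:
      ConfigRuleName: ebs-volumes-encrypted
      Description: Checks whether attached EBS volumes are encrypted
      Source:
        Owner: AWS                        # use an AWS managed rule
        SourceIdentifier: ENCRYPTED_VOLUMES
```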
Amazon Inspector
- Agent-based solution
- Detects vulnerabilities
- Verifies security best practices
- Generates findings report
- It’s a good practice to add EC2 security assessments as part of your CI/CD pipeline
- Important: automated assessments can be scheduled with a CloudWatch Events (EventBridge) rule that triggers a Lambda function
AWS Systems Manager
- Manages systems in the Cloud and on-premises, good use case for patch management.
- When you tag patch groups don’t forget that the tags are case-sensitive and you can separate patch groups based on tags.
- Automate admin tasks (State Manager):
- Collect software inventory
- Apply OS patches with Patch Manager
- Create system images
- Configure Windows and Linux systems
- Session Manager for shell access without opening inbound ports
- Set maintenance windows
Credential storage options
- SSM Parameter Store:
- Create unique parameter names (strings, string lists, secure strings)
- Encrypt with AWS KMS
- Use API to pull parameters in your machines
- Secrets Manager:
- Encrypted via AWS KMS (costs are higher)
- Supports automated credentials rotation
- License Manager:
- Strictly for managing software licenses (not for storing credentials)
- Also tracks where licenses are in use
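To illustrate how these values are consumed from infrastructure code, CloudFormation supports dynamic references to both Parameter Store and Secrets Manager; the parameter and secret names below are hypothetical:

```yaml
Resources:
  AppDatabase:
    Type: AWS::RDS::DBInstance
    Properties:
      Engine: mysql
      DBInstanceClass: db.t3.micro
      AllocatedStorage: "20"
      # Plaintext parameter from SSM Parameter Store (version 1)
      MasterUsername: '{{resolve:ssm:/app/db/username:1}}'
      # Secret from AWS Secrets Manager, resolved at deploy time and never stored in the template
      MasterUserPassword: '{{resolve:secretsmanager:app/db:SecretString:password}}'
```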
AWS Trusted Advisor
- Keep cost optimization in mind when answering the exam questions. Use tagging for cost management.
- Trusted Advisor core checks are available to every account, but you need a Business or Enterprise support plan to access the full set of checks.
You can view the check descriptions and results for the following check categories:
- Cost Optimization: Highlights underutilized resources on your account to save money.
- Performance: Recommendations that can improve the speed and responsiveness of your applications.
- Security: Finds possible security improvements e.g. enabling MFA on root user.
- Fault Tolerance: Highlights resources that lack redundancy (e.g. missing backups or Multi-AZ) to help increase the resiliency of your AWS account.
- Service Limits: Checks if your AWS account approaches or exceeds service limits for your resources.
AWS Personal Health Dashboard
The AWS Personal Health Dashboard organizes issues into three groups: open issues, scheduled changes, and other notifications. Important to know here is that you'll receive EC2 instance retirement/maintenance messages through it.
AWS Service Catalog
- Create and manage catalogs of approved IT services
- Limit access to underlying AWS services
- Helps with consistent governance and compliance requirements
- Enable turn-key self-service solutions for all users
EC2 instance compliance
- Shorten deploy time by creating golden AMIs with pre-defined configurations
- Bootstrap custom user data scripts
- Manage configuration with Puppet, Chef, Ansible, or AWS OpsWorks
AWS Certified DevOps Engineer Professional Study material
On the internet, you'll find a lot of study material for the AWS Certified DevOps Engineer Professional exam. It can be overwhelming to sift through it all in search of good-quality material.
Lucky for you, I’ve spent some time curating the available study material and highlighting some of the stuff worth reading.
AWS Documentation
The notes I've written in the previous chapter contain keywords and summaries; don't depend on them alone! If a concept or keyword is unknown to you, treat that as an incentive to dive deeper into the topic.
Based on my experience with the exam I would recommend reading the official documentation on the following services:
- Auto Scaling – pay attention to launch templates, launch configurations, lifecycle hooks, and termination policies.
- AWS Elastic Beanstalk – pay attention to deployment methods and configurations (.ebextensions, .elasticbeanstalk configs)
- AWS CodeDeploy – pay attention to working with instances, deployment configurations, applications (AppSpec files), and deployment groups.
- AWS CodeBuild – pay attention to code sources and build spec file setup.
- AWS CodePipeline – pay attention to pipeline structure and use cases for CodePipeline
- AWS Systems Manager – pay attention to patch baselines, maintenance windows, and run commands.
- AWS Trusted Advisor – make sure you know that check results can be sent to EventBridge, where you can trigger Lambda (remediation) actions on them.
- AWS CloudFormation – pay attention to best practices, template anatomy, and the CreationPolicy, DeletionPolicy, and DependsOn attributes.
AWS Study guides
If you want to brush up on your foundational knowledge, I would really recommend giving these official study guides a chance:
Next to these official guides, I’ve written my own exam guides for both the Developer Associate and SysOps Administrator Associate exams. This is a condensed version that contains technical notes summarized in bullet points to quickly list the things you need to know for the exam.
AWS Whitepapers
There is one whitepaper in particular that's a must-read to give you a good understanding of automation, compliance, and infrastructure as code:
You should now be fully prepared for the AWS certification exam!
After you have gone through my guide, you should be fairly competent at doing DevOps on AWS and ready to tackle the DevOps Professional certification exam questions with confidence.
If you still want to practice a little bit more then I can recommend you take a couple of practice tests before taking the real exam.