February 24, 2020 · 14 min read

Certified Solutions Architect Professional - Study Notes


I recently passed my AWS Certified Solutions Architect Professional exam and along my journey of studying I scribbled down a bunch of notes I took. I thought it would be a good idea to share these random bits of information so that they can be used as flash cards / bite sized chunks of information to anyone else who is studying for this fairly vast examination.

AWS Certified Solutions Architect Professional Badge

Learning Options

The below information alone won't be enough to get you through. I highly recommend you study from some of the great online courses available from the providers below.


Contents


Active Directory

  • Simple AD
    • A Microsoft Active Directory compatible directory from AWS Directory Service that supports common Active Directory features.
    • Cannot connect to an existing on-prem AD
  • AWS Directory Service for Microsoft Active Directory
    • Managed Microsoft Active Directory that is hosted on AWS cloud.
  • AD Connector
    • Proxy service for connecting your on-premises Microsoft Active Directory to the AWS cloud.

WorkDocs

  • Can be used to share documents via AD directory services
  • Can define time duration or passcodes to access the document

API Gateway

  • Lambda non-proxy integration flow
    • Method Request -> Integration Request -> Integration Response -> Method Response
  • Maximum integration timeout for AWS API gateway is 29 seconds
  • If you want to change the default timeout for an integration request, uncheck Use Default Timeout and set a custom value (between 50 milliseconds and 29 seconds)
  • You can capture a response code and rewrite it to something custom via Gateway Responses

Athena

  • Serverless platform
  • Automatically executes queries in parallel
  • If asked whether to use Athena or QuickSight, look for a mention of whether the team has experience with SQL. If they do, pick Athena

Aurora

  • Can replicate from an external master instance or a MySQL DB instance on AWS RDS.
  • Aurora serverless is best suited to situations where you can’t predict what traffic will be like

Backup

  • The following services can be backed up and restored using AWS Backup
    • EFS, DynamoDB, EBS, RDS, Volume Gateway

Batch

  • Configures resources and schedules when to run data analytics workloads
  • Suitable for running a bash script using a job
  • Batch scheduler evaluates when / where / how to run jobs (no need for integration with CloudWatch Events to schedule)
  • Key components
    • Jobs: unit of work (script, exec, docker container)
    • Job Definitions: specifies how a job is run
    • Job Queues: Jobs submitted are added to queues
    • Compute Environment: compute resources that run jobs
  • If your Batch jobs are stuck in RUNNABLE state check:
    • Role assigned has adequate permissions
    • CPU and RAM given as per compute allocation
    • Check EC2 limits on the account
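
The "Job Definitions" component above can be sketched as RegisterJobDefinition input (image URI, names and sizes here are illustrative placeholders, not from the exam material):

```json
{
  "jobDefinitionName": "nightly-analytics",
  "type": "container",
  "containerProperties": {
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/analytics:latest",
    "vcpus": 2,
    "memory": 2048,
    "command": ["bash", "run.sh"]
  }
}
```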

Beanstalk

  • No concept of programmable infrastructure / Git source. Can’t do infrastructure as code end to end without other tooling.

Billing

  • Billing reports can be delivered to an S3 bucket
  • Consolidated billing is only available in master accounts (where there are children accounts under organisations). These reports include activity for all child accounts

CloudFormation

  • Retain data for S3: set DeletionPolicy on the S3 resource to Retain
  • Create an RDS snapshot on delete: set the RDS resource's DeletionPolicy to Snapshot
    • There are three options for DeletionPolicy: Retain, Snapshot and Delete
  • To coordinate stack creation with configuration that has to be executed on an EC2 instance, use the CreationPolicy attribute together with a wait condition / cfn-signal.
  • If you need to reference AZ info within CloudFormation templates you can make use of the Fn::GetAZs function
  • Launching EC2 instances with CloudFormation requires IAM permissions to be provided to the person creating the stack
  • Intrinsic functions can be used in Properties, Outputs, Metadata attributes and update policy attribute
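
The deletion policies above can be sketched in a template snippet (resource names are illustrative, and remaining required RDS properties are trimmed for brevity):

```yaml
Resources:
  NotesBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain      # bucket (and its data) survives stack deletion
  NotesDb:
    Type: AWS::RDS::DBInstance
    DeletionPolicy: Snapshot    # a final snapshot is taken before deletion
    Properties:
      Engine: mysql
      DBInstanceClass: db.t3.micro
      AllocatedStorage: "20"
```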

CloudFront

  • Managed content delivery network (CDN)
  • S3 Transfer Acceleration can be used to distribute S3 content more efficiently globally
  • Origin Access Identity can be used to grant access to objects in s3 without having to give a bucket public access.
  • Different HTTP methods for CloudFront forwarding and their uses:
    • GET, HEAD: You can use CloudFront only to get objects from your origin or to get object headers.
    • GET, HEAD, OPTIONS: You can use CloudFront only to get objects from your origin, get object headers, or retrieve a list of the options that your origin server supports.
    • GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE: You can use CloudFront to get, add, update, and delete objects, and to get object headers. In addition, you can perform other POST operations such as submitting data from a web form.
  • Support for common content types such as:
    • Static content (S3 website or web assets)
    • Live events (streaming video)
    • Media content (HLS)
  • TTL can be lowered on CloudFront to deliver new content sooner when it changes

CloudWatch

  • Cron event to trigger Lambda
    • CloudWatch Events -> Create Rule
    • Provide a valid cron expression, e.g. cron(0 8 * * ? *) for 08:00 UTC daily
    • Provide a target(s)
  • Can trigger a number of different services like Lambda, SNS, SQS, CodeBuild etc.
  • Cross-region dashboards are supported, so metrics from different regions can be displayed on a single dashboard

Database Migration Service

  • Suitable for migrating databases like MySQL to Aurora or RDS
  • Data migrated is encrypted with KMS
    • By default it uses the AWS managed aws/dms key, or a customer managed key (CMK) can be provided
  • DMS input stream can be throttled to accommodate downstream systems that can’t ingest at full speed.
    • E.g. when ingesting data into Elasticsearch and the indexing queue fills up

DirectConnect

  • Link aggregation groups (LAG) can bond DirectConnects together
  • Direct Connect to a VPC provides access to all AZs in the region
  • Maximum number of Direct Connect connections in a LAG is 4
    • All must use the same bandwidth
  • Troubleshooting Direct Connect
    • Confirm no firewalls are blocking TCP 179 (or ephemeral ports)
    • Confirm the ASNs match on both sides

DynamoDB

  • Supports autoscaling
  • When defining primary keys, prefer a partition key with many distinct values over a few heavily used ones
  • Supported CloudWatch metrics
    • ProvisionedWriteCapacityUnits
    • ProvisionedReadCapacityUnits
    • ConsumedWriteCapacityUnits
    • ConsumedReadCapacityUnits
  • You cannot configure On-Demand for reads and Provisioned for writes separately; the capacity mode applies to the table as a whole
  • A numeric attribute holding an epoch timestamp (e.g. one named expire) can be designated as the TTL attribute to expire items in a DynamoDB table
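
TTL is just a numeric attribute holding a Unix epoch timestamp; a minimal sketch of building such an item (the attribute name expire and the item shape are assumptions; you designate the attribute name when enabling TTL on the table):

```python
import time

def item_with_ttl(pk, payload, ttl_seconds, now=None):
    """Build a DynamoDB item whose TTL attribute holds a Unix epoch timestamp.

    DynamoDB deletes the item (typically within ~48 hours) once the current
    time passes the value stored in the designated TTL attribute.
    """
    now = int(time.time()) if now is None else now
    return {
        "pk": pk,                     # partition key
        "payload": payload,
        "expire": now + ttl_seconds,  # epoch seconds, not milliseconds
    }

item = item_with_ttl("session#42", "cart-data", ttl_seconds=3600, now=1_700_000_000)
print(item["expire"])  # 1700003600
```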

Glue

  • Contains crawlers that connect to
    • S3, JDBC sources and DynamoDB
  • Glue has a central metadata repository (data catalog)
  • Fully serverless ETL

EC2

  • High network performance
    • Cluster placement groups are recommended for applications that need low network latency / high throughput between instances in the same group
    • You can detach secondary network interfaces when the instance is running or stopped.
      • You can’t detach the primary
  • If you need a static MAC address, you have to create an ENI (where a random one will be assigned). Then reattach that same ENI to different instances going forward.
    • There is no way to manually config a MAC address on AWS EC2
  • Reserved Instances can be pooled among accounts in AWS Organisations
    • If 3 t2.mediums are purchased, and 1 is used in 1 account, 2 more could be used in other accounts within the organisation
  • Placement groups can suffer from insufficient-capacity errors when you try to add new instances to the group. It's recommended to stop and relaunch the instances to try for capacity again. Ideally, launch with the full number of instances the placement group will need from the start.
  • To improve high network throughput make use of Single Root I/O Virtualization (SR-IOV)
  • In order to hibernate an EC2 instance you require the following
    • Instance root has to be EBS and not instance store
    • Instance cannot be in an Autoscaling group (or used by ECS)
    • Instance root volume must be large enough so RAM can be stored
  • Hibernation also requires an HVM AMI type.
    • It has to be enabled at launch of the EC2 instance and be supported by your AMI
  • For specifying Drive letters on Windows instances make use of EC2Config
  • Lost your SSH keys? Two options
    • Stop the instance, detach the root volume, attach it as another volume to another EC2, modify the authorized_keys file, move the volume back to the original instance, start it.
    • Systems Manager Automation with AWSSupport-ResetAccess

AMI

  • You cannot create an AMI from an instance store-backed EC2 instance via the console
  • If you launch an AMI, the PEM keys will be removed; however, the authorized_keys entries will still be on the instance.
    • You need to ensure that the AMI is launched with the same PEM key

Autoscaling

  • AZRebalance will attempt to balance the number of instances in different availability zones.
  • When associating an ELB with an ASG the ASG gets awareness about unhealthy instances (and can terminate)

EBS

  • When using an encrypted EBS volume the following data is encrypted:
    • Data at rest in the volume
    • Data moving between the volume and instance
    • Snapshots created from the volume
    • Volumes created from the snapshots
  • Snapshots can be created every 2, 3, 4, 6, 8, 12, 24 hours
    • Lifecycle policies help retain backups required for compliance / audits. Also deletes unnecessary ones to save cost.
  • When using snapshots (if you don’t want downtime) don’t use RAID
  • Copies of snapshots with retention policies do not have policies carried over during copy.
  • In order to mount an EBS volume, it must be in the same AZ as the instance you are mounting to
  • Root volume type can be changed without stopping the instance provided it's to gp2, io1 or standard
    • sc1 or st1 cannot UNLESS they are non-root volumes (must be at least 500 GiB)
  • When an EBS volume has two tags, multiple lifecycle policies can run at the same time
  • Encrypted snapshots cannot be copied to unencrypted ones
  • Unencrypted snapshots can be encrypted when copying them by specifying --kms-key-id (with your CMK)

EFS

  • Data is distributed across multiple availability zones which provides durability and availability.
  • Supports 2 throughput modes
    • Bursting throughput: uses burst credits to determine if the filesystem can burst
    • Provisioned throughput
  • Provides both in-transit and at-rest encryption using AWS KMS
  • Mount an EFS volume with encryption in transit by
    • Getting EFS id, create mount targets for EC2 instance, use the mount helper with the -o tls flag
  • Does not support Windows-based clients
    • Storage Gateway / File Gateway is the recommendation if you need file store (using SMB mount)

Load Balancers

  • When using Network Load Balancers Secure connections should be TCP 443 with targets also using TLS (port 443)
  • When sticky sessions are needed, it's usually recommended to use ElastiCache to store session state
    • You don’t want to bind a user to a particular instance under a load balancer
    • Requires code to retrieve session state from ElastiCache
  • If you need to get the client IP when using a Classic Load Balancer:
    • TCP: configure proxy protocol to pass the IP address in a new TCP header
    • HTTP: send the client IP in the X-Forwarded-For header
  • Cross-zone load balancing can be enabled to spread requests evenly across your AZs
  • If a static IP is needed with a load balancer, provision a NLB with an attached EIP
  • Application Load Balancers support SNI
    • Is able to deal with multiple SSL certificates per listener

ElastiCache

Redis

  • Can only be upgraded, cannot be downgraded

IAM

  • AssumeRole can be secured down with an ExternalId
  • Flow for using a custom identity system
    • Custom identity broker app, this authenticates the user
    • Uses GetFederationToken API and passes a permission policy to get temp credentials from STS
    • Alternatively can call AssumeRole API to get temp access using role-based access instead
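
The ExternalId check from the first bullet sits in the Condition block of the role's trust policy; a sketch (account ID and ExternalId value are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111122223333:root" },
      "Action": "sts:AssumeRole",
      "Condition": { "StringEquals": { "sts:ExternalId": "unique-customer-id" } }
    }
  ]
}
```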

Kinesis

  • Ideal for real-time data ingestion

Kinesis Data Streams

  • Can store records in order and replay them in the same order later (up to 7 days)
    • Makes it ideal for financial transactions
  • Able to have multiple applications consume from the same stream concurrently.

Kinesis Video Streams

  • HLS can be used for live playback
    • Use GetHLSStreamingSessionURL and then use the resulting URL in the video player of your choice
  • Content delivery typically leverages AWS Elemental MediaLive / MediaPackage and CloudFront to distribute content globally
  • Can view either Live or archived video

KMS

  • Two types of keys
    • Master keys: used directly to encrypt and decrypt up to 4 kilobytes of data and can also protect data keys
    • Data keys: used to encrypt and decrypt customer data
  • If you are accessing a very large number of KMS encrypted files at a time there is a chance you will hit the account's KMS request rate limit. You might need to open a support case to have it raised
  • Grants in KMS
    • Dynamically / programmatically revoke access to a key after its use.
    • Better than changing roles / policies

Managed Blockchain

  • Supported frameworks include Hyperledger Fabric and Ethereum.
  • If you have members who would like to deploy their own blockchain networks they can use the CloudFormation templates to support ECS clusters or EC2 instances

Migration Hub

  • AWS Discovery Agent can transmit to Migration hub, then Data exploration can be done in Athena
  • Agentless migrations can only pull information like RAM or Disk I/O from VMware
  • If your OS isn't supported for import, you can provide the details yourself via the import template
  • Migration steps from VMware
    • Schedule migration job
    • Upload your VMDK and then convert it to an EBS snapshot
    • Create an AMI from the snapshot

OpsWorks

  • Can be managed by CloudFormation AWS::OpsWorks::Stack
    • This can be part of a nested stack with a parent containing all the VPC, NAT Gateway etc. resources
  • Lifecycle events:
    • Setup, Configure, Deploy, Undeploy, Shutdown
  • Handles autohealing of instances
  • Blue/green style deployments can be accomplished by creating a new stack with identical configuration
    • This can be used when making updates to AMIs
  • Process for deploying with AWS CodePipeline
    • Create stack, layer and instance in a OpsWorks Stack
    • Upload app code to bucket, then add your app to OpsWorks stack
    • Create a pipeline (run it), verify the app deployment in OpsWorks stack
  • Process for updating OpsWorks stacks to the latest security patches
    • Run the Update Dependencies stack command
    • Create new instances to replace the old ones
  • When you attach a load balancer to a layer
    • Deregisters currently registered instances
    • Re-registers layer instances when they come online (removes offline ones)
    • Starts routing requests to the registered instances

Organizations

  • You may only join one organization (even if you receive more than one invite)
  • Invitations expire after 15 days
  • To resend an invite, you must cancel the pending one, then create a new invitation
  • In order to move an account to a different OU you need the following permissions
    • organizations:MoveAccount
    • organizations:DescribeOrganization
  • Accounts can be dragged into different OUs, however OUs can't be dragged to new locations in the organization's structure.
    • Instead you must create new OUs and reassign any SCPs you had in place, then move the accounts into these new OUs.
  • If you want to block access to unused services, check the IAM Activity for services (never used, last used date) and base your blocks on this information
  • SCPs never grant permissions on their own; they only restrict what can be allowed (deny-style policies are the common pattern)
    • Explicit denies will always overrule explicit allows
  • To apply WAF rules across an organization make use of AWS Firewall Manager
  • You cannot restrict a member account from the ability to change its root password or manage MFA settings
  • Improve consolidated billing by also tagging resources
    • This will group expenses on the detailed billing report
  • To access a member account
    • Use sts:AssumeRole with OrganizationAccountAccessRole
  • The master account isn't impacted by SCPs
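
A deny-style SCP is a plain IAM-like policy document; a sketch blocking one unused service across an OU (the service chosen here is arbitrary):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnusedService",
      "Effect": "Deny",
      "Action": "redshift:*",
      "Resource": "*"
    }
  ]
}
```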

Redshift

  • Does not have read replicas
  • Queries cannot be paused in Redshift
  • Use Redshift workload management (WLM) queues
    • Priorities can be assigned to these workload queues
  • Can create single node cluster via CLI (and in Console as of recently)
  • Using the RedshiftCommands.sql file from the Billing section of your account you can analyse billing reports.
  • Redshift snapshots are normally a very expensive solution, so if cost is important, avoid answers involving snapshots.
    • Snapshots on Redshift can also be pointless if you can repopulate all the data in the cluster from S3 instead

RDS

  • When a primary DB instance fails in a multi-AZ deployment, the CNAME is changed from primary to standby so there’s no need to change a reference to the other DB in code.
  • Multi-AZ replication is done synchronously
  • For redundant architectures Multi-AZ support is used
  • Read-replicas aren’t used for redundancy, they are used to improve performance.
  • If encryption is enabled on the RDS instance:
    • Encrypts the underlying storage
    • Defaults to also encrypting the snapshots as they are created
  • RDS does not support Oracle RAC
  • RDS Oracle can read/write from S3 directly.
    • Option groups should have a role with permissions to access S3
    • Feature S3_INTEGRATION
  • If there is an RDS update available that you aren’t ready to apply, you can Defer the updates indefinitely until you are ready.
  • Read Replicas require access to backups for maintaining their read replica logs. This means if you want to disable automatic backups you must remove all Read Replicas first.

RDS for Oracle

  • Supported backup / restore options
    • Oracle Import/Export
    • Oracle Data Pump Import/Export
    • RDS Snapshot / point in time recover

RDS VMware

  • Manages:
    • Patching
    • Multi AZ configurations
    • Backups based on retention policies
    • Point-in-time restores (from on-prem or cloud backups)

Route53

  • Latency based routing
    • Redirect requests to nearest region
  • If you have issues with route53 not routing to ‘live’ hosts, check to make sure you have “Evaluate Target Health” set to “Yes” on the latency alias. Same goes with HTTP health checks on weighted resources.
  • Resolve two domains to one domain (test1.example.com, test2.example.com -> test3.example.com)
    • CNAME for the records test1.example.com, test2.example.com to test3.example.com
  • Resolve a DNS entry to an ALB
    • Alias record test3.example.com to ALB address
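
The alias record from the last bullet can be expressed as a Route53 change batch (the ALB DNS name and hosted zone ID shown are placeholders; use your ALB's values):

```json
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "test3.example.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "my-alb-1234567890.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": true
        }
      }
    }
  ]
}
```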

S3

  • Using the x-amz-server-side-encryption request header when making an API call will ensure an object is server side encrypted (SSE)
  • If versioning is enabled on S3 after objects are already put in, those objects will have a version ID of null
  • An aws:Referer condition key in a bucket policy can ensure requests for objects come from a domain you operate
  • INTELLIGENT_TIERING storage class is used to optimize storage costs automatically for you
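
A common companion to the x-amz-server-side-encryption header above is a bucket policy that rejects unencrypted uploads (the bucket name is a placeholder):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": { "Null": { "s3:x-amz-server-side-encryption": "true" } }
    }
  ]
}
```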

SQS

  • Message group ID can be used on FIFO delivery to ensure messages that belong to the same message group are always processed one by one.
    • E.g. binding platform with multiple products, FIFO and a message group based on the product being bid on
  • Dead-letter queues need to match the queue they are set up for. So a standard SQS queue needs to use a standard dead-letter queue (not FIFO)
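
The per-group ordering can be illustrated with a small local simulation (this mimics how a FIFO queue partitions by MessageGroupId; it is not the SQS API):

```python
from collections import OrderedDict, deque

def partition_by_group(messages):
    """Split messages into per-group FIFO queues, like an SQS FIFO queue.

    Messages sharing a MessageGroupId stay in strict order relative to each
    other; different groups can be consumed in parallel.
    """
    groups = OrderedDict()
    for msg in messages:
        groups.setdefault(msg["MessageGroupId"], deque()).append(msg["Body"])
    return groups

# Bidding-platform example: one group per product being bid on
msgs = [
    {"MessageGroupId": "product-1", "Body": "bid 10"},
    {"MessageGroupId": "product-2", "Body": "bid 5"},
    {"MessageGroupId": "product-1", "Body": "bid 12"},
]
print(partition_by_group(msgs)["product-1"])  # deque(['bid 10', 'bid 12'])
```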

Systems Manager

  • Troubleshooting why you can't use Run Command on an SSM host
    • Check the latest SSM Agent is installed on the instances
    • Verify the instance has an IAM role that lets it talk to the SSM API
  • Services that can have costs associated to them
    • On-Premises Instance Management: pay-as-you-go pricing
    • Parameter Store: calling API costs
    • Systems Manager Automation
  • Schedule log file copying from hosts
    • State Manager to run a script at a given time
    • Schedule in Maintenance Windows for the log file moves
  • Patch management can be applied to instances using the following methods
    • Tag key/value pairs that identify the resources
    • Patch groups, where a group requires a particular tag
    • Manual selection of the hosts to patch

VPC

  • You cannot create subnets with overlapping CIDR ranges, you’ll get an error on trying to create.
  • VPC subnets will have 5 reserved addresses
    • 10.0.0.0: Network address.
    • 10.0.0.1: Reserved by AWS for the VPC router.
    • 10.0.0.2: Reserved by AWS. The IP address of the DNS server is the base of the VPC network range plus two.
    • 10.0.0.3: Reserved by AWS for future use.
    • 10.0.0.255: Network broadcast address (but no broadcast supported).
  • When wanting to make changes to a DHCP option set, you must create a new one and then associate it with your VPC, replacing the old one.
  • Troubleshooting an EC2 instance in a VPC that is unable to talk to the data center over Direct Connect?
    • Make sure route propagation to the Virtual Private Gateway (VGW) is set up
    • Make sure the IPv4 destination prefixes that route traffic over the VGW are being advertised
  • Sharing a SaaS product out via your VPC to customers can be done with an AWS endpoint service (PrivateLink) into other customers' VPCs
    • Customers need to use an interface VPC endpoint on their end.
  • Options for sharing an application running in a shared VPC within an Organization
    • VPN between two VPCs
    • Use AWS Resource Access Manager to share subnets within the account
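
The five reserved addresses listed above follow directly from the subnet CIDR; a quick sketch with Python's stdlib ipaddress module:

```python
import ipaddress

def reserved_addresses(cidr):
    """Return the five addresses AWS reserves in every VPC subnet."""
    addrs = list(ipaddress.ip_network(cidr))  # every address, incl. network/broadcast
    return {
        "network": str(addrs[0]),     # network address
        "vpc_router": str(addrs[1]),  # reserved for the VPC router
        "dns": str(addrs[2]),         # base + 2: the VPC DNS server
        "future_use": str(addrs[3]),  # reserved by AWS for future use
        "broadcast": str(addrs[-1]),  # broadcast (unsupported, but still reserved)
    }

print(reserved_addresses("10.0.0.0/24")["dns"])  # 10.0.0.2
```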

VPN

  • Creating a VPN connection requires the static IP of the customer gateway device
    • With a dynamic routing type, an Autonomous System Number (ASN) is also required
  • An option if you need Multicast in a VPC is to build a virtual overlay network

Endpoints

  • Provide a secure link to access AWS resources from a VPC

NAT Gateway

  • Lets resources in a private subnet initiate outbound communication with the internet
    • Secure private resources like Databases and Application servers that shouldn’t have public connectivity

X-Ray

  • Segments allow for detailed tracing
  • Annotations can help find specific areas of the application in the tracing records (isolate the issues / impact area)

Miscellaneous


Below is a set of random pieces of information that didn't really need their own section.

  • IPS/IDS systems within VPC
    • Configure to listen / block suspected bad traffic in and out of VPC
    • The system could be Palo Alto networks
    • Monitors, alerts and filters on potential bad traffic sent in / out of VPC.
  • Reducing DDOS surface area
    • Remove non-critical internet entry IPs
    • Configure ELB to auto-scale
  • Rekognition CLI example for detecting faces
    • aws rekognition detect-faces
  • SAML identity provider in IAM
    • SAML metadata document from the identity provider
    • Create a SAML IAM identity provider in AWS
    • Configure the SAML Identity provider with relying party trust
    • In Identity provider configure SAML assertions for auth response
  • AWS has its own ways of protecting customers from DDoS
    • If you are trying to flood a connection, or running a pentest, you will likely find that you'll be blocked by AWS
    • You need to notify AWS and be granted permission before running pentesting jobs
  • Want to access Support Ticket API?
    • You need a Business support plan (or above)
  • Alexa for Business
    • You can have Alexa devices perform tasks for staff (getting info for them, booking meetings)

Summary


Did I miss something you think I should include? Please reach out to me on Twitter @nathangloverAUS and let me know!


DevOpStar by Nathan Glover | 2020