Cloud Engineering
From the NIST definition to AWS, Docker, K8s and IaC — built for the real job.
About Cloud Engineering
Cloud Engineering is the discipline of designing, deploying and operating systems on shared, on-demand infrastructure (AWS, Azure, GCP, private clouds). This course is structured like a real bootcamp: each module starts with concepts (with reading + quizzes to make sure you actually understood the material), then moves to hands-on tasks where you write the real commands and config files engineers use every day — docker, kubectl, aws, terraform, ssh, git, dig, curl — all in a simulated terminal in the right pane.
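If you want to repeat the hands-on tasks on a real machine rather than in the simulated terminal, it can help to check first which of the tools named above are installed. The sketch below is one way to do that and is not part of the course material: the `tools` array and the `check_tools` function are hypothetical names, and the script relies only on the shell built-in `command -v`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Tools named in the course description (assumed to be looked up on PATH).
tools=(docker kubectl aws terraform ssh git dig curl)

# check_tools NAME...
# Prints one line per tool saying whether it was found on PATH.
check_tools() {
  local tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%-10s found\n' "$tool"
    else
      printf '%-10s missing\n' "$tool"
    fi
  done
}

check_tools "${tools[@]}"
```

A tool reported as missing only means it is not on PATH in the current shell; it says nothing about versions or credentials.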
Quick-reference cheat sheet
# Cloud Engineering — quick reference
# NIST essentials               # Service models              # Deployment models
on-demand self-service          IaaS  raw VMs                 public     AWS/Azure/GCP
broad network access            PaaS  managed runtime         private    single-org
resource pooling                SaaS  finished app            hybrid     public + private
rapid elasticity                FaaS  functions on-demand     community  shared concern
measured service

# docker                        # kubectl                     # terraform
docker pull nginx               kubectl get pods              terraform init
docker run -d -p 8080:80 nginx  kubectl create deploy ...     terraform plan
docker ps                       kubectl scale deploy/x ...    terraform apply
docker logs <id>                kubectl logs <pod>            terraform destroy
docker exec -it <id> sh         kubectl apply -f file.yaml
docker compose up               kubectl describe pod ...

# aws                           # networking
aws sts get-caller-identity     ping host
aws s3 mb s3://bucket           curl -I https://host
aws s3 ls                       dig host
aws ec2 describe-instances      ip addr
                                ss -ltn
Tasks
- 01 [intro] What is 'the cloud', really? · Read the definition, then answer the quiz.
- 02 [intro] The 5 essential characteristics · Identify the missing NIST characteristic.
- 03 [intro] IaaS vs PaaS vs SaaS · Match the service model to the example.
- 04 [intro] Spot the SaaS · Pick the SaaS product.
- 05 [intro] Public, private, hybrid, community · Read the four deployment models, then answer.
- 06 [intro] What does 'multi-tenancy' mean? · Pick the best definition.
- 07 [intro] Elasticity vs scalability · Two words people confuse all the time.
- 08 [intro] CapEx vs OpEx · Why finance teams love the cloud.
- 09 [easy] The shared responsibility model · Who patches the OS on an EC2 instance?
- 10 [easy] Regions and Availability Zones · Geography matters.
- 11 [easy] The big three hyperscalers · Name the AWS service for virtual machines.
- 12 [easy] Serverless and FaaS · There ARE servers. You just don't see them.
- 13 [easy] Vendor lock-in · The hidden cost of going all-in.
- 14 [easy] Total Cost of Ownership · The cloud bill is not the only cost.
- 15 [medium] Reading 'the nines' · How much downtime is '99.9%'?
- 16 [medium] RTO vs RPO · Two numbers every DR plan needs.
- 17 [medium] Data residency and sovereignty · Why a Swiss hospital can't just use 'us-east-1'.
- 18 [medium] Putting it all together · One scenario, one best-fit model.
- 19 [intro] Check the AWS CLI version · Run `aws --version` in the terminal.
- 20 [intro] Configure the AWS CLI · Run `aws configure` to set credentials.
- 21 [intro] Who am I in AWS? · Find your AWS account ID.
- 22 [easy] List available regions · Use `aws ec2 describe-regions`.
- 23 [easy] IAM best practice · Pick the BEST identity strategy.
- 24 [easy] Generate an SSH key pair · Run `ssh-keygen` to create a key.
- 25 [easy] Launch an EC2 instance from the CLI · Use `aws ec2 run-instances`.
- 26 [easy] List your EC2 instances · Use `aws ec2 describe-instances`.
- 27 [easy] Create your first S3 bucket · Use `aws s3 mb`.
- 28 [easy] List S3 buckets · Run `aws s3 ls` and find your bucket.
- 29 [easy] Avoiding bill shock · Pick the best 'don't bankrupt me' setup.
- 30 [easy] What is CloudShell? · Pick the right description.
- 31 [easy] Tag everything from day one · Why tags matter.
- 32 [easy] MFA, immediately · Multi-factor auth basics.
- 33 [medium] Tear down what you created · Create a bucket then remove it.
- 34 [easy] OSI layers in 30 seconds · Which layer is HTTP?
- 35 [easy] Ping a host · Use `ping` to check reachability.
- 36 [easy] Inspect HTTP response headers · Use `curl -I`.
- 37 [easy] Resolve a name with dig · Use `dig` to query DNS.
- 38 [easy] Trace the path to a host · Use `traceroute`.
- 39 [easy] Show your network interfaces · Use `ip addr`.
- 40 [easy] Which ports are listening? · Use `ss -ltn`.
- 41 [medium] CIDR notation · Read the slash.
- 42 [medium] VPCs, subnets, AZs · How AWS networks are structured.
- 43 [medium] Security groups vs NACLs · Stateful or stateless?
- 44 [medium] Pick the right AWS load balancer · ALB vs NLB.
- 45 [medium] TLS in one minute · What does HTTPS actually give you?
- 46 [medium] curl over HTTPS · Hit https://google.com and check headers.
- 47 [medium] DNS record types · A vs CNAME vs MX vs TXT.
- 48 [medium] CDNs, in one breath · Why put CloudFront/Cloudflare in front?
- 49 [hard] Debugging 'I can't reach my service' · Build the diagnosis chain.
- 50 [intro] Verify Docker is installed · Run `docker --version`.
- 51 [intro] Hello, Docker · Run the classic hello-world image.
- 52 [intro] Pull the nginx image · Use `docker pull`.
- 53 [intro] List local images · Pull nginx then list images.
- 54 [easy] Run a container in the background · Use `-d` (detach), `--name`, `-p` (port).
- 55 [easy] List running containers · Start one then `docker ps`.
- 56 [easy] Read container logs · Use `docker logs`.
- 57 [easy] Exec into a container · Run a shell command inside.
- 58 [easy] Stop and remove a container · Clean up after yourself.
- 59 [easy] Write your first Dockerfile · Create a Dockerfile in /home/user.
- 60 [easy] Build an image from a Dockerfile · Use `docker build -t`.
- 61 [medium] Tag and push an image · Re-tag for a registry.
- 62 [medium] Create a named volume · Use `docker volume create`.
- 63 [medium] Create a user-defined network · Use `docker network create`.
- 64 [medium] Inspect a container · Use `docker inspect`.
- 65 [medium] Bring up a stack with compose · Use `docker compose up`.
- 66 [medium] How Docker images are stored · Pick the right model.
- 67 [medium] VMs vs containers · Why containers feel so cheap.
- 68 [hard] Multi-stage builds · Why two FROMs in one Dockerfile?
- 69 [hard] Cleanup: stop & remove everything · Combine pipelines.
- 70 [intro] kubectl version · Verify kubectl is installed.
- 71 [intro] Inspect the cluster · Use `kubectl cluster-info`.
- 72 [intro] Which cluster am I on? · `kubectl config current-context`.
- 73 [easy] Create a namespace · Group resources logically.
- 74 [easy] Create a deployment · Use `kubectl create deployment`.
- 75 [easy] List pods · After creating a deployment, list its pods.
- 76 [easy] Scale a deployment · Use `kubectl scale`.
- 77 [medium] Expose a deployment as a service · Use `kubectl expose`.
- 78 [medium] Service types: ClusterIP / NodePort / LoadBalancer · Pick the right one for the job.
- 79 [medium] Rolling restart · Restart pods without downtime.
- 80 [medium] Check a rollout · Use `kubectl rollout status`.
- 81 [medium] describe is your best friend · Use `kubectl describe`.
- 82 [medium] Create a ConfigMap · Externalize config from images.
- 83 [medium] Create a Secret · Same as ConfigMap, but for sensitive values.
- 84 [medium] Declarative: write YAML, apply · Create a deployment YAML and apply it.
- 85 [medium] Install a chart with Helm · Use `helm install`.
- 86 [medium] Why not just create Pods directly? · Pick the right reason.
- 87 [hard] A pod is CrashLoopBackOff · First commands you reach for?
- 88 [easy] Write a small bash script · Create a hello.sh.
- 89 [easy] Loop over numbers · Use `seq` and a `for` loop.
- 90 [easy] Pipes and filtering · Pipe one command into another.
- 91 [easy] Bulk-create with a loop (concept) · Pick the safest approach.
- 92 [easy] Cron schedule syntax · What does '0 * * * *' mean?
- 93 [medium] Always start scripts with `set -euo pipefail` · Defensive bash.
- 94 [medium] Variables and substitution · Use a variable.
- 95 [medium] Redirect stdout to a file · Use `>` and `>>`.
- 96 [medium] Find files by name · Use `find` to locate logs.
- 97 [medium] Grep through logs · Find ERROR lines in a log file.
- 98 [hard] Top sources from a log · `sort | uniq -c | sort -n`.
- 99 [hard] curl + jq pattern · Why this combo?
- 100 [hard] AWS + bash: clean up old buckets · Pattern, not exact syntax.
- 101 [hard] Makefile-style task runners · Why a Makefile / Justfile in an infra repo?
- 102 [hard] Idempotency · What's an idempotent script?
- 103 [easy] Why Infrastructure as Code? · Pick the BEST motivation.
- 104 [easy] Check Terraform version · Run `terraform version`.
- 105 [easy] Initialize a Terraform working dir · Run `terraform init`.
- 106 [easy] See what would change with `plan` · init then plan.
- 107 [medium] Apply the plan · init → plan → apply.
- 108 [medium] What's in my state? · List managed resources.
- 109 [medium] Tear it all down · Use `terraform destroy`.
- 110 [medium] Format and validate config · fmt + validate.
- 111 [medium] Where should state live? · Local vs remote.
- 112 [medium] Modules · Why factor IaC into modules?
- 113 [medium] Workspaces / environments · Per-environment state.
- 114 [hard] Drift · What is configuration drift?
- 115 [hard] Imperative vs declarative · Which is Terraform?
- 116 [hard] Secrets in Terraform · Don't commit secrets.
- 117 [hard] Capstone: end-to-end IaC flow · Full lifecycle in one script.
- 118 [easy] AWS account boundary · Pick the right unit of isolation.
- 119 [easy] Pick an EC2 instance family · C, M, R, T... what?
- 120 [easy] S3 storage classes · Pick the cheapest for archival.
- 121 [medium] An IAM policy in 4 lines · Pick the right policy shape.
- 122 [medium] Roles vs users · Which is for machines?
- 123 [medium] List IAM users · Use `aws iam list-users`.
- 124 [medium] Invoke a Lambda · Use `aws lambda invoke`.
- 125 [medium] Managed RDS vs self-hosted DB · What does RDS take off your plate?
- 126 [hard] Connecting VPCs · Peering vs Transit Gateway.
- 127 [hard] CloudWatch vs CloudTrail · Two services people confuse all the time.
- 128 [hard] Spot instances · When are Spots a good idea?
- 129 [hard] Well-Architected pillars · Name a pillar that's NOT real.
- 130 [hard] Event-driven on AWS · S3 → Lambda is a pattern.
- 131 [hard] Pick a region for an EU app · Latency + residency.
- 132 [hard] AWS cost levers (final) · Pick the BIGGEST levers.
- 133 [easy] What does CI actually mean? · Continuous Integration vs Continuous Delivery vs Deployment.
- 134 [intro] Initialize a git repo · Run `git init` then check the status.
- 135 [easy] Stage and commit a file · Create README.md, add it, commit it.
- 136 [easy] Trunk-based vs Git Flow · Pick the model that suits modern CI/CD best.
- 137 [medium] Write a GitHub Actions workflow · Create .github/workflows/ci.yml with a build job.
- 138 [easy] Order the pipeline stages · Which order is correct?
- 139 [medium] Build a CI image · Create a Dockerfile, then `docker build` it.
- 140 [medium] Tag the image with a version · Use `docker tag` to create a versioned tag.
- 141 [medium] Push to a registry · `docker push` your image.
- 142 [easy] Where do CI secrets belong? · Pick the safe option.
- 143 [easy] Why cache dependencies? · Pick the main reason.
- 144 [medium] Deployment strategies · Match the description to the strategy.
- 145 [medium] Trigger a Kubernetes rollout · Update an image, watch the rollout.
- 146 [medium] Rollback a bad deploy · `kubectl rollout undo`.
- 147 [hard] What is GitOps? · Pick the best definition.
- 148 [easy] The three pillars of observability · Logs, metrics, traces.
- 149 [easy] Monitoring vs Observability · What's the real difference?
- 150 [easy] Scrape a /metrics endpoint · Use `curl` to hit a Prometheus endpoint.
- 151 [medium] Prometheus metric types · Counter, gauge, histogram, summary.
- 152 [medium] Read a PromQL query · What does this query mean?
- 153 [easy] Tail pod logs · `kubectl logs` is your friend.
- 154 [easy] Describe a pod for events · `kubectl describe` shows events too.
- 155 [medium] SLI vs SLO vs SLA · Pick the right definition pair.
- 156 [medium] Error budgets · What is an error budget?
- 157 [easy] The four golden signals · From the Google SRE book.
- 158 [medium] Distributed tracing basics · Spans, traces, context propagation.
- 159 [easy] Structured logs · JSON > free text.
- 160 [medium] Liveness vs readiness probes · K8s probes: what's the difference?
- 161 [easy] Test a /healthz endpoint · `curl -I` for headers only.
- 162 [hard] Alerting on symptoms vs causes · What should page you at 3 AM?
- 163 [medium] What is a service mesh? · Pick the best definition.
- 164 [medium] The sidecar pattern · Where does the proxy run?
- 165 [medium] Why mTLS between services? · Mutual TLS, not just TLS.
- 166 [medium] Install Istio (simulated) · Add the Istio Helm repo.
- 167 [medium] Enable sidecar injection on a namespace · Label the namespace for auto-injection.
- 168 [hard] Traffic splitting / canary · What primitive does the mesh use?
- 169 [medium] Retries and timeouts in the mesh · Why move them OUT of the app?
- 170 [medium] Free observability from the mesh · What do you get for free?
- 171 [hard] AuthorizationPolicy · Zero-trust between services.
- 172 [medium] Ingress gateway vs Service mesh · Edge vs east-west.
- 173 [easy] Hit an internal service · Use `curl` against a ClusterIP DNS name.
- 174 [medium] Linkerd vs Istio (high level) · When to pick which?
- 175 [hard] Sidecarless / ambient mesh · Why move away from sidecars?
- 176 [medium] When NOT to use a service mesh · Tradeoffs are real.
- 177 [hard] Gateway API · Successor to Ingress.
- 178 [easy] Shared responsibility (revisited) · Who patches the EC2 OS?
- 179 [easy] Least privilege · Pick the safer IAM policy.
- 180 [easy] Never commit AWS keys · What to do INSIDE EC2 / Lambda?
- 181 [easy] Secrets storage · Where do DB passwords belong?
- 182 [medium] Rotate an access key (drill) · List, then create; never delete first.
- 183 [easy] MFA on the root user · Pick the right answer.
- 184 [medium] Encryption at rest · EBS / RDS / S3.
- 185 [easy] TLS everywhere · In-transit encryption.
- 186 [medium] CloudTrail = the audit log · What does CloudTrail log?
- 187 [medium] S3 public bucket disaster · How do you prevent leaks?
- 188 [medium] Scan a container image · Why scan images in CI?
- 189 [medium] K8s NetworkPolicy · Default-deny pod traffic.
- 190 [medium] Pod security: non-root · securityContext basics.
- 191 [hard] Compliance frameworks 101 · Match the framework to the scope.
- 192 [hard] Incident response basics · First step on a confirmed breach.
- 193 [easy] What is FinOps? · Pick the best definition.
- 194 [medium] FinOps phases · Inform → Optimize → Operate.
- 195 [easy] Cost allocation tags · Without tags, you're flying blind.
- 196 [medium] Right-sizing · Pick the data-driven approach.
- 197 [hard] Reserved Instances vs Savings Plans vs Spot · Match the workload to the pricing model.
- 198 [hard] How much to commit? · Don't over-commit.
- 199 [medium] S3 storage class lifecycle · Cold data → cheaper class.
- 200 [easy] Audit your S3 buckets · Run `aws s3 ls`.
- 201 [medium] Hunt orphaned resources · What pays AWS for nothing?
- 202 [hard] Watch the egress bill · Where does data transfer hurt?
- 203 [medium] Why a CDN saves money · CloudFront / Cloudflare in front of S3.
- 204 [medium] Auto-scale instead of overprovisioning · Why does autoscaling save money?
- 205 [easy] Turn off dev at night · Easy 65% saving on non-prod.
- 206 [easy] Budgets and alerts · Don't get a $40,000 surprise.
- 207 [hard] Unit economics (final boss) · Cost per WHAT?
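
Task 15 asks how much downtime '99.9%' allows. The arithmetic can be sketched in a few lines of shell. `downtime_minutes` is a hypothetical helper, and the 365-day year is an assumption; an SLA may define its own measurement period, such as a calendar month.

```shell
#!/usr/bin/env bash
set -euo pipefail

# downtime_minutes AVAILABILITY_PERCENT [PERIOD_DAYS]
# Prints the downtime budget in minutes for the period.
# PERIOD_DAYS defaults to 365 (an assumption, not a standard).
downtime_minutes() {
  local availability="$1"
  local days="${2:-365}"
  # LC_ALL=C keeps "." as the decimal separator regardless of locale.
  LC_ALL=C awk -v a="$availability" -v d="$days" \
    'BEGIN { printf "%.2f\n", (100 - a) / 100 * d * 24 * 60 }'
}

for pct in 99 99.9 99.99 99.999; do
  printf '%s%% -> %s minutes per year\n' "$pct" "$(downtime_minutes "$pct")"
done
```

For 99.9% over a 365-day year this works out to 525.60 minutes, or about 8.76 hours. Each extra nine divides the budget by ten.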
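
Task 98 names the `sort | uniq -c | sort -n` pipeline. A minimal sketch on a made-up log follows. The log format (source address in the first field), the sample lines, and the `top_sources` function are assumptions for illustration only.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical sample log: one request per line, source address first.
log_file="$(mktemp)"
trap 'rm -f "$log_file"' EXIT
cat > "$log_file" <<'EOF'
10.0.0.1 GET /index.html 200
10.0.0.2 GET /login 200
10.0.0.1 GET /about 200
10.0.0.3 GET /index.html 404
10.0.0.1 POST /login 500
10.0.0.2 GET /index.html 200
EOF

# top_sources FILE
# Prints "COUNT SOURCE" lines, least frequent first.
top_sources() {
  awk '{print $1}' "$1" |    # keep only the first field (the source)
    sort |                   # group identical sources on adjacent lines
    uniq -c |                # collapse each group into "count source"
    sort -n |                # order by count, ascending
    awk '{print $1, $2}'     # normalize the padding uniq -c adds
}

top_sources "$log_file"
```

Because `sort -n` is ascending, the busiest source is the last line. Appending `| tail -n 1`, or switching to `sort -rn` with `head`, puts it in view first.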
