Cloud Engineering

From the NIST definition to AWS, Docker, K8s and IaC — built for the real job.

About Cloud Engineering

Cloud Engineering is the discipline of designing, deploying and operating systems on shared, on-demand infrastructure (AWS, Azure, GCP, private clouds). This course is structured like a real bootcamp: each module starts with concepts, backed by reading and quizzes that check your understanding, then moves to hands-on tasks where you write the commands and config files engineers use every day (docker, kubectl, aws, terraform, ssh, git, dig, curl) in a simulated terminal in the right pane.

Quick-reference cheat sheet
# Cloud Engineering — quick reference

# NIST essentials              # Service models             # Deployment models
on-demand self-service          IaaS  raw VMs                public    AWS/Azure/GCP
broad network access            PaaS  managed runtime        private   single-org
resource pooling                SaaS  finished app           hybrid    public + private
rapid elasticity                FaaS  functions on-demand    community shared concern
measured service

# docker                        # kubectl                    # terraform
docker pull nginx               kubectl get pods             terraform init
docker run -d -p 8080:80 nginx  kubectl create deploy ...    terraform plan
docker ps                       kubectl scale deploy/x ...   terraform apply
docker logs <id>                kubectl logs <pod>           terraform destroy
docker exec -it <id> sh         kubectl apply -f file.yaml
docker compose up               kubectl describe pod ...

# aws                           # networking
aws sts get-caller-identity     ping host
aws s3 mb s3://bucket           curl -I https://host
aws s3 ls                       dig host
aws ec2 describe-instances      ip addr
                                ss -ltn
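
Before running the commands above on your own machine, it can help to confirm which of the tools are actually on your PATH. The following is a minimal sketch and not part of the course material: the names `have_tool` and `check_tools` are illustrative assumptions, and the check relies only on the standard `command -v` builtin.

```shell
#!/usr/bin/env bash
# Preflight sketch: report which cheat-sheet tools are installed.
# `have_tool` and `check_tools` are illustrative names, not course-defined.

# Succeeds (exit status 0) if the named command can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Print one "ok"/"missing" line per tool name passed as an argument.
check_tools() {
  local tool
  for tool in "$@"; do
    if have_tool "$tool"; then
      printf 'ok       %s\n' "$tool"
    else
      printf 'missing  %s\n' "$tool"
    fi
  done
}

check_tools docker kubectl aws terraform ssh git dig curl
```

Any tool reported as missing needs installing only if you want to repeat the exercises outside the simulated terminal.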

Tasks

  1. 01
    What is 'the cloud', really?
    Read the definition, then answer the quiz.
    intro
  2. 02
    The 5 essential characteristics
    Identify the missing NIST characteristic.
    intro
  3. 03
    IaaS vs PaaS vs SaaS
    Match the service model to the example.
    intro
  4. 04
    Spot the SaaS
    Pick the SaaS product.
    intro
  5. 05
    Public, private, hybrid, community
    Read the four deployment models, then answer.
    intro
  6. 06
    What does 'multi-tenancy' mean?
    Pick the best definition.
    intro
  7. 07
    Elasticity vs scalability
    Two words people confuse all the time.
    intro
  8. 08
    CapEx vs OpEx
    Why finance teams love the cloud.
    intro
  9. 09
    The shared responsibility model
    Who patches the OS on an EC2 instance?
    easy
  10. 10
    Regions and Availability Zones
    Geography matters.
    easy
  11. 11
    The big three hyperscalers
    Name the AWS service for virtual machines.
    easy
  12. 12
    Serverless and FaaS
    There ARE servers. You just don't see them.
    easy
  13. 13
    Vendor lock-in
    The hidden cost of going all-in.
    easy
  14. 14
    Total Cost of Ownership
    The cloud bill is not the only cost.
    easy
  15. 15
    Reading 'the nines'
    How much downtime is '99.9%'?
    medium
  16. 16
    RTO vs RPO
    Two numbers every DR plan needs.
    medium
  17. 17
    Data residency and sovereignty
    Why a Swiss hospital can't just use 'us-east-1'.
    medium
  18. 18
    Putting it all together
    One scenario, one best-fit model.
    medium
  19. 19
    Check the AWS CLI version
    Run `aws --version` in the terminal.
    intro
  20. 20
    Configure the AWS CLI
    Run `aws configure` to set credentials.
    intro
  21. 21
    Who am I in AWS?
    Find your AWS account ID.
    intro
  22. 22
    List available regions
    Use `aws ec2 describe-regions`.
    easy
  23. 23
    IAM best practice
    Pick the BEST identity strategy.
    easy
  24. 24
    Generate an SSH key pair
    Run `ssh-keygen` to create a key.
    easy
  25. 25
    Launch an EC2 instance from the CLI
    Use `aws ec2 run-instances`.
    easy
  26. 26
    List your EC2 instances
    Use `aws ec2 describe-instances`.
    easy
  27. 27
    Create your first S3 bucket
    Use `aws s3 mb`.
    easy
  28. 28
    List S3 buckets
    Run `aws s3 ls` and find your bucket.
    easy
  29. 29
    Avoiding bill shock
    Pick the best 'don't bankrupt me' setup.
    easy
  30. 30
    What is CloudShell?
    Pick the right description.
    easy
  31. 31
    Tag everything from day one
    Why tags matter.
    easy
  32. 32
    MFA, immediately
    Multi-factor auth basics.
    easy
  33. 33
    Tear down what you created
    Create a bucket then remove it.
    medium
  34. 34
    OSI layers in 30 seconds
    Which layer is HTTP?
    easy
  35. 35
    Ping a host
    Use ping to check reachability.
    easy
  36. 36
    Inspect HTTP response headers
    Use `curl -I`.
    easy
  37. 37
    Resolve a name with dig
    Use dig to query DNS.
    easy
  38. 38
    Trace the path to a host
    Use traceroute.
    easy
  39. 39
    Show your network interfaces
    Use `ip addr`.
    easy
  40. 40
    Which ports are listening?
    Use `ss -ltn`.
    easy
  41. 41
    CIDR notation
    Read the slash.
    medium
  42. 42
    VPCs, subnets, AZs
    How AWS networks are structured.
    medium
  43. 43
    Security groups vs NACLs
    Stateful or stateless?
    medium
  44. 44
    Pick the right AWS load balancer
    ALB vs NLB.
    medium
  45. 45
    TLS in one minute
    What does HTTPS actually give you?
    medium
  46. 46
    curl over HTTPS
    Hit https://google.com and check headers.
    medium
  47. 47
    DNS record types
    A vs CNAME vs MX vs TXT.
    medium
  48. 48
    CDNs, in one breath
    Why put CloudFront/Cloudflare in front?
    medium
  49. 49
    Debugging 'I can't reach my service'
    Build the diagnosis chain.
    hard
  50. 50
    Verify Docker is installed
    Run `docker --version`.
    intro
  51. 51
    Hello, Docker
    Run the classic hello-world image.
    intro
  52. 52
    Pull the nginx image
    Use `docker pull`.
    intro
  53. 53
    List local images
    Pull nginx then list images.
    intro
  54. 54
    Run a container in the background
    Use -d (detach), --name, -p (port).
    easy
  55. 55
    List running containers
    Start one then `docker ps`.
    easy
  56. 56
    Read container logs
    Use `docker logs`.
    easy
  57. 57
    Exec into a container
    Run a shell command inside.
    easy
  58. 58
    Stop and remove a container
    Clean up after yourself.
    easy
  59. 59
    Write your first Dockerfile
    Create a Dockerfile in /home/user.
    easy
  60. 60
    Build an image from a Dockerfile
    Use `docker build -t`.
    easy
  61. 61
    Tag and push an image
    Re-tag for a registry.
    medium
  62. 62
    Create a named volume
    Use `docker volume create`.
    medium
  63. 63
    Create a user-defined network
    Use `docker network create`.
    medium
  64. 64
    Inspect a container
    Use `docker inspect`.
    medium
  65. 65
    Bring up a stack with compose
    Use `docker compose up`.
    medium
  66. 66
    How Docker images are stored
    Pick the right model.
    medium
  67. 67
    VMs vs containers
    Why containers feel so cheap.
    medium
  68. 68
    Multi-stage builds
    Why two FROMs in one Dockerfile?
    hard
  69. 69
    Cleanup: stop & remove everything
    Combine pipelines.
    hard
  70. 70
    kubectl version
    Verify kubectl is installed.
    intro
  71. 71
    Inspect the cluster
    Use `kubectl cluster-info`.
    intro
  72. 72
    Which cluster am I on?
    Use `kubectl config current-context`.
    intro
  73. 73
    Create a namespace
    Group resources logically.
    easy
  74. 74
    Create a deployment
    Use `kubectl create deployment`.
    easy
  75. 75
    List pods
    After creating a deployment, list its pods.
    easy
  76. 76
    Scale a deployment
    Use `kubectl scale`.
    easy
  77. 77
    Expose a deployment as a service
    Use `kubectl expose`.
    medium
  78. 78
    Service types: ClusterIP / NodePort / LoadBalancer
    Pick the right one for the job.
    medium
  79. 79
    Rolling restart
    Restart pods without downtime.
    medium
  80. 80
    Check a rollout
    Use `kubectl rollout status`.
    medium
  81. 81
    describe is your best friend
    Use `kubectl describe`.
    medium
  82. 82
    Create a ConfigMap
    Externalize config from images.
    medium
  83. 83
    Create a Secret
    Same as ConfigMap, but for sensitive values.
    medium
  84. 84
    Declarative: write YAML, apply
    Create a deployment YAML and apply it.
    medium
  85. 85
    Install a chart with Helm
    Use `helm install`.
    medium
  86. 86
    Why not just create Pods directly?
    Pick the right reason.
    medium
  87. 87
    A pod is CrashLoopBackOff
    First commands you reach for?
    hard
  88. 88
    Write a small bash script
    Create a hello.sh.
    easy
  89. 89
    Loop over numbers
    Use `seq` and a `for` loop.
    easy
  90. 90
    Pipes and filtering
    Pipe one command into another.
    easy
  91. 91
    Bulk-create with a loop (concept)
    Pick the safest approach.
    easy
  92. 92
    Cron schedule syntax
    What does '0 * * * *' mean?
    easy
  93. 93
    Always start scripts with `set -euo pipefail`
    Defensive bash.
    medium
  94. 94
    Variables and substitution
    Use a variable.
    medium
  95. 95
    Redirect stdout to a file
    Use `>` and `>>`.
    medium
  96. 96
    Find files by name
    Use `find` to locate logs.
    medium
  97. 97
    Grep through logs
    Find ERROR lines in a log file.
    medium
  98. 98
    Top sources from a log
    `sort | uniq -c | sort -n`.
    hard
  99. 99
    curl + jq pattern
    Why this combo?
    hard
  100. 100
    AWS + bash: clean up old buckets
    Pattern, not exact syntax.
    hard
  101. 101
    Makefile-style task runners
    Why a Makefile / Justfile in an infra repo?
    hard
  102. 102
    Idempotency
    What's an idempotent script?
    hard
  103. 103
    Why Infrastructure as Code?
    Pick the BEST motivation.
    easy
  104. 104
    Check Terraform version
    Run `terraform version`.
    easy
  105. 105
    Initialize a Terraform working dir
    Run `terraform init`.
    easy
  106. 106
    See what would change with `plan`
    init then plan.
    easy
  107. 107
    Apply the plan
    init → plan → apply.
    medium
  108. 108
    What's in my state?
    List managed resources.
    medium
  109. 109
    Tear it all down
    Use `terraform destroy`.
    medium
  110. 110
    Format and validate config
    fmt + validate.
    medium
  111. 111
    Where should state live?
    Local vs remote.
    medium
  112. 112
    Modules
    Why factor IaC into modules?
    medium
  113. 113
    Workspaces / environments
    Per-environment state.
    medium
  114. 114
    Drift
    What is configuration drift?
    hard
  115. 115
    Imperative vs declarative
    Which is Terraform?
    hard
  116. 116
    Secrets in Terraform
    Don't commit secrets.
    hard
  117. 117
    Capstone: end-to-end IaC flow
    Full lifecycle in one script.
    hard
  118. 118
    AWS account boundary
    Pick the right unit of isolation.
    easy
  119. 119
    Pick an EC2 instance family
    C, M, R, T... what?
    easy
  120. 120
    S3 storage classes
    Pick the cheapest for archival.
    easy
  121. 121
    An IAM policy in 4 lines
    Pick the right policy shape.
    medium
  122. 122
    Roles vs users
    Which is for machines?
    medium
  123. 123
    List IAM users
    Use `aws iam list-users`.
    medium
  124. 124
    Invoke a Lambda
    Use `aws lambda invoke`.
    medium
  125. 125
    Managed RDS vs self-hosted DB
    What does RDS take off your plate?
    medium
  126. 126
    Connecting VPCs
    Peering vs Transit Gateway.
    hard
  127. 127
    CloudWatch vs CloudTrail
    Two services people confuse all the time.
    hard
  128. 128
    Spot instances
    When are Spots a good idea?
    hard
  129. 129
    Well-Architected pillars
    Name a pillar that's NOT real.
    hard
  130. 130
    Event-driven on AWS
    S3 → Lambda is a pattern.
    hard
  131. 131
    Pick a region for an EU app
    Latency + residency.
    hard
  132. 132
    AWS cost levers (final)
    Pick the BIGGEST levers.
    hard
  133. 133
    What does CI actually mean?
    Continuous Integration vs Continuous Delivery vs Deployment.
    easy
  134. 134
    Initialize a git repo
    Run `git init` then check the status.
    intro
  135. 135
    Stage and commit a file
    Create README.md, add it, commit it.
    easy
  136. 136
    Trunk-based vs Git Flow
    Pick the model that suits modern CI/CD best.
    easy
  137. 137
    Write a GitHub Actions workflow
    Create .github/workflows/ci.yml with a build job.
    medium
  138. 138
    Order the pipeline stages
    Which order is correct?
    easy
  139. 139
    Build a CI image
    Create a Dockerfile, then `docker build` it.
    medium
  140. 140
    Tag the image with a version
    Use `docker tag` to create a versioned tag.
    medium
  141. 141
    Push to a registry
    `docker push` your image.
    medium
  142. 142
    Where do CI secrets belong?
    Pick the safe option.
    easy
  143. 143
    Why cache dependencies?
    Pick the main reason.
    easy
  144. 144
    Deployment strategies
    Match the description to the strategy.
    medium
  145. 145
    Trigger a Kubernetes rollout
    Update an image, watch the rollout.
    medium
  146. 146
    Roll back a bad deploy
    `kubectl rollout undo`.
    medium
  147. 147
    What is GitOps?
    Pick the best definition.
    hard
  148. 148
    The three pillars of observability
    Logs, metrics, traces.
    easy
  149. 149
    Monitoring vs Observability
    What's the real difference?
    easy
  150. 150
    Scrape a /metrics endpoint
    Use `curl` to hit a Prometheus endpoint.
    easy
  151. 151
    Prometheus metric types
    Counter, gauge, histogram, summary.
    medium
  152. 152
    Read a PromQL query
    What does this query mean?
    medium
  153. 153
    Tail pod logs
    `kubectl logs` is your friend.
    easy
  154. 154
    Describe a pod for events
    `kubectl describe` shows events too.
    easy
  155. 155
    SLI vs SLO vs SLA
    Pick the right definition pair.
    medium
  156. 156
    Error budgets
    What is an error budget?
    medium
  157. 157
    The four golden signals
    From the Google SRE book.
    easy
  158. 158
    Distributed tracing basics
    Spans, traces, context propagation.
    medium
  159. 159
    Structured logs
    JSON > free text.
    easy
  160. 160
    Liveness vs readiness probes
    K8s probes — what's the difference?
    medium
  161. 161
    Test a /healthz endpoint
    `curl -I` for headers only.
    easy
  162. 162
    Alerting on symptoms vs causes
    What should page you at 3 AM?
    hard
  163. 163
    What is a service mesh?
    Pick the best definition.
    medium
  164. 164
    The sidecar pattern
    Where does the proxy run?
    medium
  165. 165
    Why mTLS between services?
    Mutual TLS, not just TLS.
    medium
  166. 166
    Install Istio (simulated)
    Add the Istio Helm repo.
    medium
  167. 167
    Enable sidecar injection on a namespace
    Label the namespace for auto-injection.
    medium
  168. 168
    Traffic splitting / canary
    What primitive does the mesh use?
    hard
  169. 169
    Retries and timeouts in the mesh
    Why move them OUT of the app?
    medium
  170. 170
    Free observability from the mesh
    What do you get for free?
    medium
  171. 171
    AuthorizationPolicy
    Zero-trust between services.
    hard
  172. 172
    Ingress gateway vs Service mesh
    Edge vs east-west.
    medium
  173. 173
    Hit an internal service
    Use `curl` against a ClusterIP DNS name.
    easy
  174. 174
    Linkerd vs Istio (high level)
    When to pick which?
    medium
  175. 175
    Sidecarless / ambient mesh
    Why move away from sidecars?
    hard
  176. 176
    When NOT to use a service mesh
    Tradeoffs are real.
    medium
  177. 177
    Gateway API
    Successor to Ingress.
    hard
  178. 178
    Shared responsibility (revisited)
    Who patches the EC2 OS?
    easy
  179. 179
    Least privilege
    Pick the safer IAM policy.
    easy
  180. 180
    Never commit AWS keys
    What to do INSIDE EC2 / Lambda?
    easy
  181. 181
    Secrets storage
    Where do DB passwords belong?
    easy
  182. 182
    Rotate an access key (drill)
    List, then create — never delete first.
    medium
  183. 183
    MFA on the root user
    Pick the right answer.
    easy
  184. 184
    Encryption at rest
    EBS / RDS / S3.
    medium
  185. 185
    TLS everywhere
    In-transit encryption.
    easy
  186. 186
    CloudTrail = the audit log
    What does CloudTrail log?
    medium
  187. 187
    S3 public bucket disaster
    How do you prevent leaks?
    medium
  188. 188
    Scan a container image
    Why scan images in CI?
    medium
  189. 189
    K8s NetworkPolicy
    Default-deny pod traffic.
    medium
  190. 190
    Pod security: non-root
    securityContext basics.
    medium
  191. 191
    Compliance frameworks 101
    Match the framework to the scope.
    hard
  192. 192
    Incident response basics
    First step on a confirmed breach.
    hard
  193. 193
    What is FinOps?
    Pick the best definition.
    easy
  194. 194
    FinOps phases
    Inform → Optimize → Operate.
    medium
  195. 195
    Cost allocation tags
    Without tags, you're flying blind.
    easy
  196. 196
    Right-sizing
    Pick the data-driven approach.
    medium
  197. 197
    Reserved Instances vs Savings Plans vs Spot
    Match the workload to the pricing model.
    hard
  198. 198
    How much to commit?
    Don't over-commit.
    hard
  199. 199
    S3 storage class lifecycle
    Cold data → cheaper class.
    medium
  200. 200
    Audit your S3 buckets
    Run `aws s3 ls`.
    easy
  201. 201
    Hunt orphaned resources
    What pays AWS for nothing?
    medium
  202. 202
    Watch the egress bill
    Where does data transfer hurt?
    hard
  203. 203
    Why a CDN saves money
    CloudFront / Cloudflare in front of S3.
    medium
  204. 204
    Auto-scale instead of overprovisioning
    Why does autoscaling save money?
    medium
  205. 205
    Turn off dev at night
    Easy 65% saving on non-prod.
    easy
  206. 206
    Budgets and alerts
    Don't get a $40,000 surprise.
    easy
  207. 207
    Unit economics (final boss)
    Cost per WHAT?
    hard
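
Task 15 asks how much downtime '99.9%' allows. The arithmetic is the unavailable fraction multiplied by the length of the period. Here is a minimal sketch, assuming a 365-day year and using `awk` for the floating-point math; the function name `downtime_minutes` is hypothetical and not defined by the course.

```shell
#!/usr/bin/env bash
# Convert an availability percentage into allowed downtime per year.
# Assumes a 365-day year (525,600 minutes). `downtime_minutes` is an
# illustrative name, not something the course defines.

downtime_minutes() {
  # $1 = availability in percent, e.g. 99.9
  awk -v avail="$1" 'BEGIN {
    minutes_per_year = 365 * 24 * 60
    printf "%.1f\n", (100 - avail) / 100 * minutes_per_year
  }'
}

# Print a small table for a few common targets.
for a in 99 99.9 99.99; do
  printf '%-6s %s min/year\n' "$a%" "$(downtime_minutes "$a")"
done
```

For 99.9% this gives 525.6 minutes, or about 8.76 hours per year. Each extra nine cuts the allowance by a factor of ten.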