AWS Fundamentals: RDS + Aurora + ElastiCache


I.  Amazon RDS Overview

  • RDS stands for Relational Database Service
  • It's a managed database service for databases that use SQL as a query language
  • It allows you to create databases in the cloud that are managed by AWS
    • Postgres
    • MySQL
    • MariaDB
    • Oracle
    • Microsoft SQL Server
    • Aurora (AWS proprietary database)
Advantages of using RDS versus deploying a DB on EC2
  • RDS is a managed service:
    • Automated provisioning, OS patching
    • Continuous backups and restore to a specific timestamp (Point in Time Restore)
    • Monitoring dashboards
    • Read replicas for improved read performance
    • Multi AZ setup for DR (Disaster Recovery)
    • Maintenance windows for upgrades
    • Scaling capability (vertical and horizontal)
    • Storage backed by EBS (gp2 or io1)
  • BUT you can't SSH into your instances
RDS - Storage Auto Scaling
  • Helps you increase storage on your RDS DB instance dynamically
  • When RDS detects you are running out of free database storage, it scales automatically
  • Avoid manually scaling your database storage
  • You have to set a Maximum Storage Threshold (maximum limit for DB storage)
  • Automatically modify storage if:
    • Free storage is less than 10% of allocated storage
    • Low storage lasts at least 5 minutes
    • 6 hours have passed since last modification
  • Useful for applications with unpredictable workloads
  • Supports all RDS database engines (MariaDB, MySQL, PostgreSQL, SQL Server, Oracle)
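
The three trigger conditions above can be sketched as a simple check. This is only an illustration of the rule as stated in these notes, not how RDS implements it; the `StorageState` class and the `should_autoscale` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class StorageState:
    """Hypothetical snapshot of the storage metrics for one DB instance."""
    allocated_gib: float            # currently allocated storage
    free_gib: float                 # free storage reported by monitoring
    low_storage_minutes: float      # how long free storage has been low
    hours_since_last_change: float  # time since the last storage modification
    max_threshold_gib: float        # Maximum Storage Threshold set by the user


def should_autoscale(state: StorageState) -> bool:
    """Return True when all three conditions from the notes hold
    and the allocation is still below the maximum threshold."""
    low_on_space = state.free_gib < 0.10 * state.allocated_gib
    sustained = state.low_storage_minutes >= 5
    cooled_down = state.hours_since_last_change >= 6
    has_headroom = state.allocated_gib < state.max_threshold_gib
    return low_on_space and sustained and cooled_down and has_headroom
```

For example, an instance with 100 GiB allocated and 5 GiB free for 10 minutes, last modified 7 hours ago, with a 200 GiB threshold, would qualify. The same instance modified only 2 hours ago would not.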
II. RDS read replicas vs Multi AZ
1. RDS Read Replicas for read scalability
  • Up to 15 Read Replicas
  • Within AZ, Cross AZ or Cross Region
  • Replication is Async, so reads are eventually consistent
  • Replicas can be promoted to their own DB
  • Applications must update the connection string to leverage read replicas

2. Use cases
  • You have a production Database that is taking on normal load
  • You want to run a reporting application to do some analytics
  • You create a Read Replica to run the new workload there
  • The production application is unaffected
  • Read replicas are used for SELECT (=read) only kind of statements (not INSERT, UPDATE, DELETE)
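
Because the application has to pick the right connection string itself, a small routing helper is one way to picture it. This is a minimal sketch: the endpoint names are made up, and `pick_endpoint` is a hypothetical helper, not part of any AWS SDK.

```python
import itertools

# Hypothetical endpoints; real ones come from the RDS console or API.
PRIMARY_ENDPOINT = "mydb.example.internal"
REPLICA_ENDPOINTS = [
    "mydb-replica-1.example.internal",
    "mydb-replica-2.example.internal",
]

# Rotate through the replicas so reads are spread evenly.
_replica_cycle = itertools.cycle(REPLICA_ENDPOINTS)


def pick_endpoint(sql: str) -> str:
    """Send SELECT statements to a replica (round-robin),
    everything else (INSERT, UPDATE, DELETE, ...) to the primary."""
    words = sql.split()
    first_word = words[0].upper() if words else ""
    if first_word == "SELECT":
        return next(_replica_cycle)
    return PRIMARY_ENDPOINT
```

Since replication is asynchronous, a read that must see the latest write should still go to the primary endpoint.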
3. RDS Read Replicas - Network Cost
  • In AWS there's a network cost when data goes from one AZ to another
  • For RDS Read Replicas within the same region, you don't pay that fee
4. RDS Multi AZ (Disaster Recovery)
  • Sync replication
  • One DNS name - automatic app failover to standby
  • Increase availability
  • Failover in case of loss of AZ, loss of network, instance or storage failure
  • No manual intervention in apps 
  • Not used for scaling  
  • Note: Read Replicas can be set up as Multi AZ for Disaster Recovery (DR)
5. RDS - From Single-AZ to Multi-AZ
  • Zero downtime operation (no need to stop the DB)
  • Just click on "modify" for the database
  • The following happens internally:
    • A snapshot is taken
    • A new DB is restored from the snapshot in a new AZ
    • Synchronization is established between the two databases

III.  Amazon RDS Hands on

Setup in free tier









    IV.  Amazon Aurora

    • Aurora is a proprietary technology from AWS (not open source)
    • Postgres and MySQL are both supported as Aurora DB (that means your drivers will work as if Aurora was a Postgres or MySQL database)
    • Aurora is "AWS cloud optimized" and claims a 5x performance improvement over MySQL on RDS, and over 3x the performance of Postgres on RDS
    • Aurora storage automatically grows in increments of 10GB, up to 128TB
    • Aurora can have up to 15 replicas and the replication process is faster than MySQL (sub 10 ms replica lag)
    • Failover in Aurora is instantaneous. It's HA (High Availability) native
    • Aurora costs more than RDS (20% more) - but is more efficient
    Aurora high availability and read scaling
    • 6 copies of your data across 3 AZ:
      • 4 copies out of 6 needed for writes
      • 3 copies out of 6 needed for reads
      • Self healing with peer-to-peer replication
      • Storage is striped across 100s of volumes
    • One Aurora instance takes writes (master)
    • Automatic failover for master in less than 30 seconds
    • Master + up to 15 Aurora Read Replicas serve reads
    • Support for Cross Region Replication
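
The 4-of-6 write and 3-of-6 read rule above can be checked with a few lines of arithmetic. This is a sketch of the quorum rule only, assuming the 6 copies are spread evenly as 2 per AZ; the function names are hypothetical.

```python
TOTAL_COPIES = 6
WRITE_QUORUM = 4  # copies needed to acknowledge a write
READ_QUORUM = 3   # copies needed to serve a read


def can_write(healthy_copies: int) -> bool:
    """Writes succeed while at least 4 of the 6 copies are healthy."""
    return healthy_copies >= WRITE_QUORUM


def can_read(healthy_copies: int) -> bool:
    """Reads succeed while at least 3 of the 6 copies are healthy."""
    return healthy_copies >= READ_QUORUM


def copies_after_az_loss(lost_azs: int, copies_per_az: int = 2) -> int:
    """Healthy copies left after losing whole AZs (assumes 2 copies per AZ)."""
    return max(TOTAL_COPIES - lost_azs * copies_per_az, 0)
```

Losing one AZ leaves 4 copies, so both reads and writes continue. Losing one AZ plus one more copy leaves 3, so reads still work but writes do not.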
    Aurora DB Cluster
    Features of Aurora
    • Automatic fail-over
    • Backup and recovery
    • Isolation and security
    • Industry compliance
    • Push-button scaling
    • Automated patching with Zero Downtime
    • Advanced Monitoring
    • Routine Maintenance
    • Backtrack: restore data to any point in time without using backups

    V.  Amazon Aurora - Hands on

    Similar to other RDS

    VI.  RDS & Aurora Security

    • At-rest encryption:
      • Database master & replicas are encrypted using AWS KMS - must be defined at launch time
      • If the master is not encrypted, the read replicas cannot be encrypted
      • To encrypt an unencrypted database, go through a DB snapshot & restore as encrypted
    • In-flight encryption: TLS-ready by default, use the AWS TLS root certificates client-side
    • IAM authentication: IAM roles to connect to your database (instead of username/pw)
    • Security Groups: Control network access to your RDS / Aurora DB
    • No SSH available except on RDS Custom
    • Audit Logs can be enabled and sent to CloudWatch Logs for longer retention

    VII.  RDS Proxy

    • Fully managed database proxy for RDS
    • Allows apps to pool and share DB connections established with database
    • Improves database efficiency by reducing the stress on database resources (e.g., CPU, RAM) and minimizing open connections (and timeouts)
    • Serverless, auto scaling, highly available (multi AZ)
    • Reduces RDS & Aurora failover time by up to 66%
    • Supports RDS (MySQL, PostgreSQL, MariaDB, MS SQL Server) and Aurora (MySQL, PostgreSQL)
    • No code changes required for most apps
    • Enforce IAM Authentication for DB, and securely store credentials in AWS Secrets Manager
    • RDS Proxy is never publicly accessible (must be accessed from VPC)


        VIII.  ElastiCache Overview

        • The same way RDS is to get managed Relational Databases...
        • ElastiCache is to get managed Redis or Memcached
        • Caches are in-memory databases with really high performance, low latency
        • Helps reduce load on databases for read-intensive workloads
        • Helps make your application stateless
        • AWS takes care of OS maintenance / patching, optimizations, setup, configuration, monitoring, failure recovery and backups
        • Using ElastiCache involves heavy application code changes
        ElastiCache - Solution Architecture - DB Cache
        • The application queries ElastiCache; if the data is not available (cache miss), it gets it from RDS and stores it in ElastiCache
        • Helps relieve load in RDS
        • Cache must have an invalidation strategy to make sure only the most current data is used in there
        ElastiCache - Solution Architecture - User Session Store
          • User logs into any of the application instances
          • The application writes the session data into ElastiCache
          • The user hits another instance of our application
          • The instance retrieves the session data and the user is already logged in
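
The session store flow above can be sketched with a shared dictionary standing in for ElastiCache. The `AppInstance` class and its methods are hypothetical; a real setup would use a Redis or Memcached client instead of a dict.

```python
import uuid


class AppInstance:
    """One instance of the application; it keeps no session state itself."""

    def __init__(self, name, session_store):
        self.name = name
        # Shared store, a stand-in for ElastiCache in this sketch.
        self.sessions = session_store

    def login(self, username):
        """Write the session data to the shared store and return its id."""
        session_id = str(uuid.uuid4())
        self.sessions[session_id] = {"user": username}
        return session_id

    def current_user(self, session_id):
        """Look the session up in the shared store, whichever instance wrote it."""
        session = self.sessions.get(session_id)
        return session["user"] if session else None


store = {}
instance_a = AppInstance("a", store)
instance_b = AppInstance("b", store)
sid = instance_a.login("alice")       # user logs in on instance a
user = instance_b.current_user(sid)   # next request lands on instance b
```

Because both instances read from the same store, the user stays logged in no matter which instance serves the request, which is what makes the application stateless.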
          ElastiCache - Redis vs Memcached

          IX.  ElastiCache - Hands on

          Serverless



          Design your own cache






          X.  ElastiCache - Strategies

          1. Caching Implementation Considerations

          • AWS documentation: Caching Best Practices
          • Is it safe to cache data? Data may be out of date, eventually consistent
          • Is caching effective for that data?
            • Pattern: data changes slowly, few keys are frequently needed
            • Anti-patterns: data changes rapidly, or all of a large key space is frequently needed
          • Is data structured well for caching?
            • Example: key-value caching, or caching of aggregation results
          • Which caching design pattern is the most appropriate?
          2. Lazy Loading / Cache-Aside / Lazy Population

          • Pros:
            • Only requested data is cached (the cache isn't filled up with unused data)
            • Node failures are not fatal (just increased latency to warm the cache. It means that all the reads have to go to RDS and then be cached)
          • Cons:
            • Cache miss penalty that results in 3 round trips, noticeable delay for that request
            • Stale data: data can be updated in the database and outdated in the cache
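
A minimal sketch of lazy loading, with a plain dict as the cache and a small fake class in place of RDS. `FakeDatabase` and `get_user` are hypothetical names for illustration only.

```python
class FakeDatabase:
    """Stand-in for RDS that counts how many reads reach it."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)


def get_user(user_id, cache, db):
    """Cache-aside read: try the cache first, fall back to the database."""
    value = cache.get(user_id)      # round trip 1: ask the cache
    if value is None:               # cache miss
        value = db.get(user_id)     # round trip 2: read from the database
        if value is not None:
            cache[user_id] = value  # round trip 3: populate the cache
    return value
```

The first call for a key pays all three round trips; later calls are served from the cache. If the row changes in the database afterwards, the cache keeps returning the old value until it is evicted or expires, which is the stale data problem listed above.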
          3. Write Through - Add or Update cache when database is updated
          • Pros:
            • Data in cache is never stale, reads are quick
            • Write penalty instead of a read penalty (each write requires 2 calls)
          • Cons: 
            • Missing data until it is added / updated in the DB. Mitigation is to implement the Lazy Loading strategy as well
            • Cache churn - a lot of the data will never be read
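
A minimal sketch of write-through combined with the lazy loading fallback mentioned above. Plain dicts stand in for both the cache and the database, and `save_user` / `read_user` are hypothetical names.

```python
def save_user(user_id, value, cache, db):
    """Write-through: every write updates the database and the cache."""
    db[user_id] = value      # call 1: write to the database
    cache[user_id] = value   # call 2: update the cache


def read_user(user_id, cache, db):
    """Read from the cache, with lazy loading for data written
    before the cache existed (the missing data case)."""
    if user_id in cache:
        return cache[user_id]
    value = db.get(user_id)
    if value is not None:
        cache[user_id] = value
    return value
```

Every key that goes through `save_user` ends up in the cache whether or not it is ever read, which is the cache churn drawback.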
          4. Cache Evictions and Time-to-live (TTL)
          • Cache eviction can occur in three ways:
            • You delete the item explicitly in the cache
            • Item is evicted because the memory is full and it's not recently used (least recently used, LRU)
            • You set an item time-to-live (TTL)
          • TTL is helpful for any kind of data, for example:
            • Leaderboards
            • Comments
            • Activity streams
          • TTL can range from a few seconds to hours or days
          • If too many evictions happen due to memory, you should scale up or out
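
The three eviction paths above (explicit delete, LRU when memory is full, TTL expiry) can be sketched in one small class. This is a toy model, not how Redis or Memcached work internally; `TTLLRUCache` is a hypothetical name and the injectable `clock` parameter is only there to make the behavior easy to test.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Toy cache with a size limit (LRU eviction) and per-item TTL."""

    def __init__(self, max_items, clock=time.monotonic):
        self.max_items = max_items
        self.clock = clock
        self._items = OrderedDict()  # key -> (value, expires_at), oldest first
        self.memory_evictions = 0    # a high count suggests scaling up or out

    def set(self, key, value, ttl_seconds):
        if key in self._items:
            del self._items[key]
        elif len(self._items) >= self.max_items:
            self._items.popitem(last=False)  # evict the least recently used
            self.memory_evictions += 1
        self._items[key] = (value, self.clock() + ttl_seconds)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:  # TTL has passed: drop the item
            del self._items[key]
            return None
        self._items.move_to_end(key)    # mark as most recently used
        return value

    def delete(self, key):
        """Explicit deletion by the application."""
        self._items.pop(key, None)
```

Tracking `memory_evictions` mirrors the last bullet: if that counter keeps growing, the cache is too small for the working set.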
          5. Final words of wisdom
          • Lazy loading / Cache aside is easy to implement and works for many situations as a foundation, especially on the read side
          • Write-through is usually combined with Lazy loading, as an optimization targeted at the queries or workloads that benefit from it
          • Setting a TTL is usually not a bad idea, except when you are using Write-through. Set it to a sensible value for your application
          • Only cache the data that makes sense (user profiles, blogs, etc.)
          • Quote: There are only two hard things in Computer Science: cache invalidation and naming things

          XI. Amazon MemoryDB for Redis - Overview

          • Redis-compatible, durable, in-memory database service
          • Ultra-fast performance with over 160 million requests per second (RPS)
          • Durable in-memory data storage with Multi-AZ transactional log
          • Scales seamlessly from 10s of GBs to 100s of TBs of storage
          • Use cases: web and mobile apps, online gaming, media streaming, ...


