I currently have a use case where API Gateway writes processing jobs to an SQS queue, an ASG scales up, and workers pull, process, and delete messages from the queue and write results to an output queue. I needed to use an ASG here since we're processing with an older legacy app that needs Windows and high CPU, so Fargate wouldn't work.
As the system has started getting much more use, I want to log metadata somewhere: just general things like process ID, input parameters, timestamps, and user info.
I already implemented this with DynamoDB since it's fast and serverless, with a Lambda on both the pre- and post-processing queues that writes metadata to the table. DynamoDB was very quick to get going, and the only negative I see is that it'll be harder to change the partition and sort keys if we decide to change them down the line.
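For context, the Lambda is basically just a thin put_item wrapper along these lines (table name, key names, and attribute names here are placeholders, not the real ones):

```python
import json
import os
import time

import boto3

# Placeholder table name; the real one comes from the Lambda environment.
TABLE = boto3.resource("dynamodb").Table(os.environ.get("METADATA_TABLE", "job-metadata"))

def handler(event, context):
    """Triggered by the pre/post-processing SQS queues; logs one item per message."""
    for record in event["Records"]:
        body = json.loads(record["body"])
        TABLE.put_item(
            Item={
                # The partition/sort key choice is the part that's hard to change later.
                "pk": body["process_id"],            # partition key: process ID
                "sk": str(int(time.time() * 1000)),  # sort key: event timestamp (ms)
                "stage": body.get("stage", "pre"),   # pre- or post-processing
                "user": body.get("user_info", {}),
                "input_params": body.get("input_parameters", {}),
            }
        )
    return {"logged": len(event["Records"])}
```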
However, it got me thinking: when should I lean towards RDS and when should I lean towards DynamoDB? Especially if a coworker asks for justification for either architecture, it would be good to be able to weigh the pros and cons.
Which service fits best here?
IMHO you could do a PoC to see which service is more feasible for you. It really depends on how much data you have and what queries and load you plan to run.
AWS Redshift is intended for OLAP at petabyte or exabyte scale, handling heavy parallel workloads. Redshift can also aggregate data from other data sources (JDBC, S3, ...). However, Redshift is not an OLTP database; it carries more static server overhead and requires extra skills to manage the deployment.
So without more numbers and concrete use cases, one cannot advise anything. The cloud is great in that you can simply try things out and see what fits you.
AWS Redshift is really great when you mostly want to read data from the database. Under the hood, Redshift is a column-oriented database that is better suited for analytics. You can transfer all your existing data to Redshift using AWS DMS (Database Migration Service), which reads the binary logs of your existing database and replicates the data automatically, so you don't have to do much yourself. From my personal experience, Redshift is really great.
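If you go that route, kicking off a DMS task is a couple of API calls once the source/target endpoints and a replication instance exist. This is only a sketch of the shape of the call; all the ARNs below are placeholders:

```python
import json

import boto3

dms = boto3.client("dms")

# Placeholder ARNs: you would create the source endpoint, the Redshift target
# endpoint, and a replication instance first, then plug them in here.
response = dms.create_replication_task(
    ReplicationTaskIdentifier="mydb-to-redshift",
    SourceEndpointArn="arn:aws:dms:...:endpoint:SOURCE",
    TargetEndpointArn="arn:aws:dms:...:endpoint:REDSHIFT-TARGET",
    ReplicationInstanceArn="arn:aws:dms:...:rep:INSTANCE",
    # full-load-and-cdc = copy existing rows, then keep replaying the bin logs.
    MigrationType="full-load-and-cdc",
    TableMappings=json.dumps({
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-everything",
            "object-locator": {"schema-name": "%", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
)

# In practice you'd wait for the task to reach the "ready" status before starting it.
dms.start_replication_task(
    ReplicationTaskArn=response["ReplicationTask"]["ReplicationTaskArn"],
    StartReplicationTaskType="start-replication",
)
```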
Redshift is a data warehouse and is generally used for OLAP (analytical) workloads. Analytical databases are too slow for transactional processes and generally do not enforce primary key / foreign key constraints. Aurora and DynamoDB, on the other hand, are OLTP (transactional) databases. In your case, if you plan to keep all the data in a single JSON entry, DynamoDB would be the better fit, but I would suggest Aurora since it is an RDBMS with a fixed schema. You will have to keep the multiple entries per user in a separate table, although retrieving them is still just a single join query (see the sketch below).
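To make that concrete, here is a rough sketch of the two-table layout I mean. Table and column names are invented for illustration, and this assumes Aurora MySQL with the PyMySQL driver:

```python
import pymysql

# Placeholder connection parameters.
conn = pymysql.connect(host="my-aurora-endpoint", user="app", password="...", database="app")

with conn.cursor() as cur:
    # One row per user...
    cur.execute("""
        CREATE TABLE IF NOT EXISTS users (
            user_id BIGINT PRIMARY KEY,
            name    VARCHAR(64) NOT NULL
        )""")
    # ...plus a child table holding the multiple entries per user.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS user_items (
            item_id  BIGINT AUTO_INCREMENT PRIMARY KEY,
            user_id  BIGINT NOT NULL,
            payload  JSON,
            FOREIGN KEY (user_id) REFERENCES users (user_id)
        )""")

    # Retrieving all of a user's entries is a single join query.
    cur.execute("""
        SELECT u.name, i.item_id, i.payload
        FROM users u
        JOIN user_items i ON i.user_id = u.user_id
        WHERE u.user_id = %s
        """, (42,))
    rows = cur.fetchall()

conn.commit()
```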
Redshift will not meet your needs. It's an OLAP database designed to scan huge amounts of data in parallel. As a very basic example, you might export your live database to Redshift and query it to see if any players have an extreme amount of money or lots of duplicate items, and look for cheaters that way. It's terrible at querying and updating single records.
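For instance, the kind of "find the cheaters" scan I mean would look something like this. The schema and thresholds are invented, and since Redshift speaks the PostgreSQL wire protocol, plain psycopg2 works as a client:

```python
import psycopg2

# Placeholder connection details; Redshift listens on port 5439 by default.
conn = psycopg2.connect(
    host="my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="analytics",
    user="analyst",
    password="...",
)

with conn.cursor() as cur:
    # Full scan over the exported player data: fine for Redshift,
    # painful for an OLTP database.
    cur.execute("""
        SELECT p.player_id, p.money, COUNT(i.item_id) AS item_count
        FROM players p
        LEFT JOIN items i ON i.player_id = p.player_id
        GROUP BY p.player_id, p.money
        HAVING p.money > 1000000000 OR COUNT(i.item_id) > 10000
        ORDER BY p.money DESC
    """)
    suspicious = cur.fetchall()
```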
Aurora and DynamoDB are both OLTP databases that are designed to handle tasks just like you have in mind. From personal experience I can say that Aurora would have no trouble scaling up: I work with a mid-range Aurora instance that consistently provides ~2500 QPS over multiple billion-record tables. If anything, DynamoDB is more scalable than Aurora at a similar price point, so I wouldn't worry about scaling. :)
For the simple schema you describe, there isn't a hugely compelling reason to choose one of Aurora or DynamoDB over the other. AWS has serverless Aurora in preview, which would be the lowest-cost choice for light usage if it were available right now. Perhaps use a t2.small with Aurora for now and migrate to serverless when you can? DynamoDB is also quite cheap at the low end, though.
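For completeness, standing up the DynamoDB side of that comparison is a single call. Table and key names here are placeholders for whatever simple schema you land on, and a handful of provisioned capacity units is plenty for light usage (and sits inside the free tier):

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Hypothetical table; swap in your own key design before relying on it,
# since the partition/sort keys are the hard-to-change part.
dynamodb.create_table(
    TableName="user-data",
    AttributeDefinitions=[
        {"AttributeName": "pk", "AttributeType": "S"},
        {"AttributeName": "sk", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "pk", "KeyType": "HASH"},   # partition key
        {"AttributeName": "sk", "KeyType": "RANGE"},  # sort key
    ],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)
```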