Secondary indexes don't guarantee uniqueness. From the docs:
In a DynamoDB table, each key value must be unique. However, the key values in a global secondary index do not need to be unique.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html#GSI.scenario
Answer from Mike Hornblade on Stack Overflow: amazon web services - Uniqueness in DynamoDB secondary index
No, they don't. Global secondary indexes are updated asynchronously, so they are eventually consistent. That also means DynamoDB cannot enforce uniqueness at the time you make the update call: it doesn't check secondary indexes for uniqueness, because that is an asynchronous operation, and even if it did, it would have no way to return a failure once the real-time call had already finished.
On a side note, that's also why you can only perform Scan or Query on a GSI, but not GetItem: GetItem is expected to return one item, but without a uniqueness constraint many items can match a given secondary index key.
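To make the difference concrete, here is a toy in-memory model of a table with one secondary index. It is only a sketch, not how DynamoDB is implemented, and the names `ToyTable`, `put`, `getItem`, and `queryIndex` are made up for illustration.

```javascript
// Toy model only: real DynamoDB is a distributed service and updates GSIs asynchronously.
class ToyTable {
  constructor(indexAttribute) {
    this.indexAttribute = indexAttribute // attribute used as the index key
    this.items = new Map()               // primary key -> item (keys are unique)
  }

  // Writing with an existing primary key replaces the old item.
  put(item) {
    this.items.set(item.pk, item)
  }

  // A primary-key lookup can return at most one item.
  getItem(pk) {
    return this.items.get(pk)
  }

  // An index lookup returns every item that shares the index key value.
  queryIndex(value) {
    return Array.from(this.items.values()).filter(
      (item) => item[this.indexAttribute] === value
    )
  }
}

const table = new ToyTable('email')
table.put({ pk: 'user-1', email: 'a@example.com' })
table.put({ pk: 'user-2', email: 'a@example.com' }) // same index key, still accepted

console.log(table.queryIndex('a@example.com').length) // 2
```

Because the index lookup returns a list, a single-item operation like GetItem has no well-defined answer on the index.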
When I look at the documentation for querying a DynamoDB index, it suggests that you can get single-digit-millisecond latency when querying for a DynamoDB item by its primary key.
What's the latency difference when querying for a DynamoDB item by the partition (or partition+sort) key specified by the table's global secondary index? I assume that the latency would be slightly higher because internally DynamoDB would have to first traverse the GSI's tree to find the primary key of the correct item, and then query for that item by its primary key, but I don't know whether "slightly" means 1 ms, 10 ms, 100 ms, etc.
A simple solution to your issue, assuming you do not write more than 1,000 WCU per second, is to use a static value as your GSI PK. Let's use the value 1 as an example.
| GsiPk | GsiSk | data |
|---|---|---|
| 1 | 2023-03-15T19:00:59.000Z | data |
| 1 | 2023-03-13T15:00:59.000Z | data |
| 1 | 2023-03-11T12:00:59.000Z | data |
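// Assumes the AWS SDK for JavaScript v2 low-level client; the original omits this setup.
const AWS = require('aws-sdk')
const dynamodb = new AWS.DynamoDB()

// ISO 8601 UTC timestamps sort lexicographically, so a string comparison works.
const lastweek = '2023-03-10T12:00:59.000Z'
dynamodb.query({
  TableName: 'devices',
  IndexName: 'LastAccess-index',
  KeyConditionExpression: 'GsiPk = :v AND GsiSk > :d',
  ExpressionAttributeValues: {
    // Pass the raw string; wrapping it in extra quotes would make them part of the value.
    ':d': { S: lastweek },
    ':v': { S: '1' },
  },
}).promise().then((result) => console.log(result.Items))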
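The cutoff in the query above is hard-coded. A minimal sketch of computing it at request time follows; `cutoffIso` and `lastAccessParams` are hypothetical helper names, and the table, index, and attribute names are copied from the example. It relies on `Date.prototype.toISOString()` producing fixed-width UTC strings, which is what makes the string comparison on `GsiSk` match chronological order.

```javascript
// Hypothetical helper: ISO 8601 timestamp `days` days before `now`.
function cutoffIso(days, now = new Date()) {
  const msPerDay = 24 * 60 * 60 * 1000
  return new Date(now.getTime() - days * msPerDay).toISOString()
}

// Hypothetical helper: builds the same query parameters with a computed cutoff.
function lastAccessParams(days, now = new Date()) {
  return {
    TableName: 'devices',
    IndexName: 'LastAccess-index',
    KeyConditionExpression: 'GsiPk = :v AND GsiSk > :d',
    ExpressionAttributeValues: {
      ':d': { S: cutoffIso(days, now) },
      ':v': { S: '1' },
    },
  }
}

const params = lastAccessParams(7, new Date('2023-03-17T12:00:59.000Z'))
console.log(params.ExpressionAttributeValues[':d'].S) // 2023-03-10T12:00:59.000Z
```

The resulting object can be passed to `dynamodb.query` as in the example.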
As mentioned, be aware of the scalability limits of this strategy. If you need more throughput, you can shard the GSI PK to handle as much as you need, but you will then also have to read from every shard. Have a look at this if you need to.
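A rough sketch of the sharding idea, under stated assumptions: the shard count of 4, the simple string hash, and the helper names `shardFor`, `shardedQueries`, and `mergeByGsiSk` are all illustrative choices, not something prescribed by DynamoDB. Writes spread items over several GSI PK values, and reads issue one query per shard and merge the results.

```javascript
const SHARD_COUNT = 4 // assumed value; size it to your write throughput

// Pick the GSI PK for a write. Hashing the item's id keeps the choice deterministic.
function shardFor(id, shardCount = SHARD_COUNT) {
  let hash = 0
  for (const ch of id) {
    hash = (hash * 31 + ch.codePointAt(0)) % 1000003
  }
  return String(hash % shardCount)
}

// Build one set of query parameters per shard.
function shardedQueries(cutoff, shardCount = SHARD_COUNT) {
  return Array.from({ length: shardCount }, (_, shard) => ({
    TableName: 'devices',
    IndexName: 'LastAccess-index',
    KeyConditionExpression: 'GsiPk = :v AND GsiSk > :d',
    ExpressionAttributeValues: {
      ':d': { S: cutoff },
      ':v': { S: String(shard) },
    },
  }))
}

// Merge the per-shard item lists into one list, newest first.
function mergeByGsiSk(pages) {
  return pages.flat().sort((a, b) => {
    if (a.GsiSk.S < b.GsiSk.S) return 1
    if (a.GsiSk.S > b.GsiSk.S) return -1
    return 0
  })
}

console.log(shardedQueries('2023-03-10T12:00:59.000Z').length) // 4
```

Each parameter object would be sent as its own query, so read cost grows with the shard count.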
Such a query would work so much more naturally in an SQL database!
If you need to make very frequent queries in a NoSQL DB for "Accesses Last Week" and "Accesses Last Month", it would make sense to run an overnight reindexing Lambda that updates one or more separate attributes to 'this_week', 'last_week', etc. Otherwise, it looks like you are stuck with a scan.
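As a sketch of what such a reindexing job might compute per item, assuming rolling 7-day windows counted back from the time the job runs (the function name `accessBucket`, the window sizes, and the 'last_month' and 'older' labels are assumptions, not part of the answer):

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000

// Hypothetical bucketing rule: rolling windows measured back from `now`.
function accessBucket(lastAccessIso, now = new Date()) {
  const ageDays = (now.getTime() - Date.parse(lastAccessIso)) / MS_PER_DAY
  if (ageDays < 7) return 'this_week'
  if (ageDays < 14) return 'last_week'
  if (ageDays < 30) return 'last_month'
  return 'older'
}

const runTime = new Date('2023-03-17T12:00:00.000Z')
console.log(accessBucket('2023-03-15T19:00:59.000Z', runTime)) // this_week
console.log(accessBucket('2023-03-05T12:00:00.000Z', runTime)) // last_week
```

The job would write the returned label to an attribute on each item, and that attribute could then serve as the key of an index.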