amazon web services - DynamoDB create index on map or list type - Stack Overflow
How to create an item with an index in DynamoDB?
amazon web services - Can you add a global secondary index to dynamodb after table has been created? - Stack Overflow
Suggestion for DynamoDB table index usage
Indexes can be built only on top-level JSON attributes. In addition, key attributes (including range keys) must be scalar values in DynamoDB, of type String, Number, or Binary.
From https://docs.aws.amazon.com/whitepapers/latest/comparing-dynamodb-and-hbase-for-nosql/indexing.html:
Q: Is querying JSON data in DynamoDB any different?
No. You can create a Global Secondary Index or Local Secondary Index on any top-level JSON element. For example, suppose you stored a JSON document that contained the following information about a person: First Name, Last Name, Zip Code, and a list of all of their friends. First Name, Last Name and Zip code would be top-level JSON elements. You could create an index to let you query based on First Name, Last Name, or Zip Code. The list of friends is not a top-level element, therefore you cannot index the list of friends. For more information on Global Secondary Indexing and its query capabilities, see the Secondary Indexes section in this FAQ.
Q: What data types can be indexed?
All scalar data types (Number, String, Binary, and Boolean) can be used for the range key element of the local secondary index key. Set, list, and map types cannot be indexed.
I have tried computing hash(str(object)) and storing it alongside the object itself. The hash gives me an integer (Number), so I can create a secondary index on it. Below is a sample in Python. It is important to use a hash function that generates the same hash every time for a given value, so I am using SHA-1.
# Generate a small integer hash:
import hashlib

def hash_8_digits(source):
    return int(hashlib.sha1(source.encode()).hexdigest(), 16) % (10 ** 8)
The idea is to keep the indexed attribute small while keeping the entity intact. That is, rather than serializing the object to a string and changing the whole way the object is used, I store a small hash value alongside the actual list or map.
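As a rough illustration of that approach, here is a minimal sketch that hashes a list or map deterministically and stores the result as a top-level Number attribute. The attribute names (id, friends, friends_hash) and the JSON canonicalization step are assumptions for the example, not part of the original answer.

```python
import hashlib
import json


def stable_hash_8_digits(value):
    """Return a deterministic integer below 10**8 for a JSON-serializable value."""
    # sort_keys and fixed separators make the serialized form independent of
    # dict insertion order, so equal maps always produce the same hash.
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return int(digest, 16) % (10 ** 8)


def build_item(item_id, friends):
    """Build an item that stores the list plus a top-level hash of it."""
    return {
        "id": item_id,                                   # hypothetical partition key
        "friends": friends,                              # the list itself cannot be an index key
        "friends_hash": stable_hash_8_digits(friends),   # scalar Number, usable as an index key
    }


item = build_item("user-1", ["alice", "bob"])
```

Because the hash is reduced to 8 digits, two different values can collide, so anything fetched by friends_hash should still be compared against the full list or map in application code.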
You've got some fundamental misunderstanding going on. You don't give enough code or examples for me to guess what you're really attempting. For example, I don't know what your table's keys are. So here's a primer:
You only write items to the base table (never directly to an index). Items can have a variety of attributes. Each item must have unique key attributes in the base table.
You can create a GSI against the table, including after the table has data. When constructing the GSI you select what its key attributes will be.
When you want to use the GSI, you must name it as the target of your Scan or Query.
Are you trying to write to the index? You can't.
Are you trying to query the index by pointing at the base table? You can't.
Are you trying to write an item to the base table without specifying its primary keys? You can't.
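To make the second point concrete, here is a hedged sketch of the request parameters for a Query that targets a GSI. The table name "Events", the index name "date-index", and the key attribute "date" are hypothetical; the parameter names are those of the low-level DynamoDB Query operation.

```python
def build_gsi_query(table_name, index_name, key_attr, key_value):
    """Build keyword arguments for the low-level DynamoDB Query operation."""
    return {
        "TableName": table_name,
        # Without IndexName the query runs against the base table's keys.
        "IndexName": index_name,
        # "#k" is a placeholder because names such as "date" are reserved words.
        "KeyConditionExpression": "#k = :v",
        "ExpressionAttributeNames": {"#k": key_attr},
        "ExpressionAttributeValues": {":v": {"S": key_value}},
    }


# Hypothetical table, index, and attribute names.
params = build_gsi_query("Events", "date-index", "date", "2015-01-20")

# With boto3 installed and AWS credentials configured, the call would be:
#   import boto3
#   response = boto3.client("dynamodb").query(**params)
```

Writes still go through PutItem on the base table with its full primary key; DynamoDB propagates them to the index.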
How to create an item with an index in DynamoDB?
You cannot create an item without a primary key in DynamoDB; the primary key acts as the table's index.
When you create a table, you specify the Primary Key which is your index.
When you add an item, you have to provide the Primary Key.
You can also make use of Global Secondary Indexes which technically create a new table with that index under the hood.
But what ended up happening is date and timestamp were simply added as normal attributes that aren't able to be queried.
If you want to be able to query on an attribute, that attribute has to be part of the Primary Key (Partition or Composite) or part of the key of a secondary index such as a Global Secondary Index.
Edit (January 2015):
Yes, you can add a global secondary index to a DynamoDB table after its creation; see here, under "Global Secondary Indexes on the Fly".
Old Answer (no longer strictly correct):
No, the hash key, range key, and indexes of the table cannot be modified after the table has been created. You can easily add elements that are not hash keys, range keys, or indexed elements after table creation, though.
From the UpdateTable API docs:
You cannot add, modify or delete indexes using UpdateTable. Indexes can only be defined at table creation time.
To the extent possible, you should really try to anticipate current and future query requirements and design the table and indexes accordingly.
You could always migrate the data to a new table if need be.
Just got an email from Amazon:
Dear Amazon DynamoDB Customer,
Global Secondary Indexes (GSI) enable you to perform more efficient queries. Now, you can add or delete GSIs from your table at any time, instead of just during table creation. GSIs can be added via the DynamoDB console or a simple API call. While the GSI is being added or deleted, the DynamoDB table can still handle live traffic and provide continuous service at the provisioned throughput level. To learn more about Online Indexing, please read our blog or visit the documentation page for more technical and operational details.
If you have any questions or feedback about Online Indexing, please email us.
Sincerely, The Amazon DynamoDB Team
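For reference, adding a GSI to an existing table goes through the UpdateTable operation with a "Create" entry in GlobalSecondaryIndexUpdates. The sketch below only builds the request parameters; the table, index, and attribute names and the capacity values are assumptions for illustration.

```python
def build_add_gsi_request(table_name, index_name, attr_name, attr_type="S",
                          read_units=5, write_units=5):
    """Build keyword arguments for UpdateTable that add one hash-key-only GSI."""
    return {
        "TableName": table_name,
        # Every attribute used in the new index's key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": attr_name, "AttributeType": attr_type},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [
                        {"AttributeName": attr_name, "KeyType": "HASH"},
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                    # Applies to provisioned-capacity tables; on-demand tables
                    # omit this block.
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": read_units,
                        "WriteCapacityUnits": write_units,
                    },
                }
            }
        ],
    }


# Hypothetical names throughout.
request = build_add_gsi_request("People", "zip-index", "zip_code")

# With boto3 installed and AWS credentials configured, the call would be:
#   import boto3
#   boto3.client("dynamodb").update_table(**request)
```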
Hi, I have loaded a table with the following structure: ID, Name, last name, location, score and some other attributes not relevant to the case.
The problem is to create the most efficient table in terms of cost and read speed (not many writes will be done to this table). The table is expected to be queried many times against the attributes I mentioned earlier, most likely by one of them or by a combination of several (e.g., name + last name + location).
In the beginning, I thought it would be good to make the ID the partition key and then create a global secondary index for each of the other attributes. However, now that I have loaded the data (10 GB), I think I'm going to murder the project's budget with that approach.
Can you suggest a better way to achieve this, please?
- DynamoDB is not designed to optimize indexing on set values. Below is a copy of the relevant Amazon documentation (from Improving Data Access with Secondary Indexes in DynamoDB).
The key schema for the index. Every attribute in the index key schema must be a top-level attribute of type String, Number, or Binary. Nested attributes and multi-valued sets are not allowed. Other requirements for the key schema depend on the type of index: For a global secondary index, the hash attribute can be any scalar table attribute. A range attribute is optional, and it too can be any scalar table attribute. For a local secondary index, the hash attribute must be the same as the table's hash attribute, and the range attribute must be a non-key table attribute.
- Amazon recommends creating a separate one-to-many table for this kind of problem. More info here: Use one to many tables
This is a really old post, sorry to revive it, but I'd take a look at "Single Table Design"
Basically, stop thinking about your data as structured data - embrace denormalization
id (Number - primary key)
title (String)
created_at (Number - long)
tags (StringSet - contains a set of tags, say android, ios, etc.)
Instead of a nosql table with a "header" of this:
id|title|created_at|tags
think of it like this:
pk|sk    |data....
id|id    |{title, created_at}
id|id+tag|{id, tag} <- create one record per tag
You can still return everything by querying for pk = id and sk begins_with id, then joining the tags to the id records in your app logic.
You can also use a GSI to project id|id+tag into tag|id. That still requires two queries to get the items with a given tag (get the ids, then get the items), but you won't have to duplicate your data, you won't have to scan, and you'll still be able to get your items in one query when your access pattern doesn't rely on tags.
FWIW I'd start by thinking about all of your access patterns, and from there think about how you can structure composite keys and/or GSIs
cheers
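The layout described above can be sketched in plain Python, with a list standing in for the table. This is only an in-memory illustration under stated assumptions: the "#" separator in the sort key, the helper names, and the sample posts are all hypothetical, and the two query helpers merely imitate what a Query on the base table and on the tag|id GSI would return.

```python
def items_for_post(post_id, title, created_at, tags):
    """Return one main item plus one item per tag, all sharing the same pk."""
    pk = str(post_id)
    items = [{"pk": pk, "sk": pk, "title": title, "created_at": created_at}]
    for tag in sorted(tags):
        # "id+tag" from the text is written here as "<id>#<tag>" (assumed separator).
        items.append({"pk": pk, "sk": f"{pk}#{tag}", "id": pk, "tag": tag})
    return items


def query_base(table, pk):
    """Imitate: pk = :id AND begins_with(sk, :id) on the base table."""
    return [i for i in table if i["pk"] == pk and i["sk"].startswith(pk)]


def query_tag_index(table, tag):
    """Imitate a GSI keyed on tag|id: only items that carry a tag appear in it."""
    return sorted(i["id"] for i in table if i.get("tag") == tag)


table = items_for_post(1, "Intro", 1420070400, {"android", "ios"})
table += items_for_post(2, "Advanced", 1420156800, {"ios"})

# Step 1: ids for a tag via the index. Step 2: fetch each post by pk.
ios_ids = query_tag_index(table, "ios")
ios_posts = [query_base(table, post_id)[0] for post_id in ios_ids]
```

The two-step lookup at the end mirrors the "get the ids then get the items" pattern, while query_base alone covers access patterns that do not involve tags.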