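This article compares two secondary-index models for Azure Cosmos DB containers: Global Secondary Indexes (GSIs) and Local Secondary Indexes (LSIs). The GSI/LSI and sort-key terminology is borrowed from Amazon DynamoDB. In Cosmos DB, the closest counterpart to an LSI is a composite index in the container's indexing policy, and the closest counterpart to a GSI is a separate container that holds a copy of the data under a different partition key.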
Key Differences
Feature | Global Secondary Index (GSI) | Local Secondary Index (LSI)
---|---|---|
Partition Key | Can be different from the base container’s partition key | Must be the same as the base container’s partition key |
Sort Key | Can be different from the base container’s sort key | Can be different from the base container’s sort key |
Provisioned Throughput | Has its own provisioned throughput | Shares the base container’s provisioned throughput |
Storage | Stored separately from the base container | Stored within the same partition as the base container |
Query Scope | Can query data across partitions | Can only query data within the same partition |
Creation | Can be created at any time | Must be created when the container is created |
Consistency | Supports eventual consistency | Supports strong consistency within the partition |
Use Cases | Best for queries that span multiple partitions or use a different partition key | Best for queries within a single partition that use a different sort key |
Number of Indexes | Limited to 500 per container | Limited to 5 per container partition key value |
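The partition-scope difference in the table can be illustrated with a small in-memory model. This is only a sketch, not Cosmos DB code: the sample items, the attribute names, and the `build_index` helper are all hypothetical, and a plain dictionary stands in for the index storage.

```python
from collections import defaultdict

# Hypothetical sample items; attribute names are illustrative only.
ORDERS = [
    {"customerId": "c1", "orderId": "o1", "orderDate": "2024-03-01", "status": "shipped"},
    {"customerId": "c1", "orderId": "o2", "orderDate": "2024-01-15", "status": "open"},
    {"customerId": "c2", "orderId": "o3", "orderDate": "2024-02-10", "status": "open"},
]


def build_index(items, partition_key, sort_key):
    """Group items by partition_key and keep each group ordered by sort_key."""
    index = defaultdict(list)
    for item in items:
        index[item[partition_key]].append(item)
    for group in index.values():
        group.sort(key=lambda it: it[sort_key])
    return dict(index)


# LSI-style: same partition key as the base container, alternate sort key.
lsi = build_index(ORDERS, "customerId", "orderDate")

# GSI-style: a different partition key, so items from several base
# partitions (customers) end up grouped together.
gsi = build_index(ORDERS, "status", "orderDate")

lsi_c1 = [o["orderId"] for o in lsi["c1"]]
gsi_open = [o["orderId"] for o in gsi["open"]]
print(lsi_c1)    # ['o2', 'o1']
print(gsi_open)  # ['o2', 'o3']
```

The LSI-style lookup never leaves customer `c1`, while the GSI-style lookup for `open` returns orders that belong to two different customers.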
Benefits of Global Secondary Indexes (GSIs)
- Flexible querying: GSIs allow you to query data using attributes other than the primary key, enabling diverse query patterns.
- Scalability: GSIs have their own provisioned throughput, allowing you to scale read/write operations on the index independently of the base container.
- Performance: GSIs can significantly improve query performance for non-key attributes, as they are optimized for specific query patterns.
- Schema flexibility: You can add GSIs to an existing container without having to recreate it.
Benefits of Local Secondary Indexes (LSIs)
- Strong consistency: LSIs support strong consistency within the partition, ensuring you get the most up-to-date data for queries within the same partition.
- Cost-effective for specific use cases: LSIs can be more cost-effective than GSIs if your query patterns align with their limitations (same partition key).
- Performance: LSIs can offer very fast read performance for queries that use the container’s partition key and an alternate sort key.
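The consistency difference between the two index types can be sketched with a toy model in which the local index is updated together with the write and the global index is updated later. The `ToyContainer` class and its method names are hypothetical and exist only for this illustration; real replication is handled by the service.

```python
class ToyContainer:
    """In-memory stand-in for a container with one local and one global index."""

    def __init__(self):
        self.local_index = {}    # updated in the same step as the write
        self.global_index = {}   # updated only when pending changes are applied
        self._pending = []

    def write(self, item_id, value):
        self.local_index[item_id] = value
        self._pending.append((item_id, value))

    def propagate(self):
        """Apply queued changes to the global index, as an asynchronous replicator would."""
        for item_id, value in self._pending:
            self.global_index[item_id] = value
        self._pending.clear()


c = ToyContainer()
c.write("order-1", "open")
c.propagate()
c.write("order-1", "shipped")

local_read = c.local_index["order-1"]          # sees the latest write
stale_global_read = c.global_index["order-1"]  # still the previous value
c.propagate()
fresh_global_read = c.global_index["order-1"]  # caught up after propagation
print(local_read, stale_global_read, fresh_global_read)  # shipped open shipped
```

A reader of the global index can briefly observe the older value, which is what eventual consistency means in practice.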
Real-Life Use Cases
Global Secondary Index (GSI)
- E-commerce product catalog:
  - Container: Products (Partition Key: categoryId, Sort Key: productId)
  - GSI: name-price-index (Partition Key: name, Sort Key: price)
  - Use case: Querying products by name, allowing users to find all products with a given name sorted by price.
  - Code example (Azure CLI):
    # Create the base container. Cosmos DB containers take a partition key only; there is no sort-key flag.
    az cosmosdb sql container create --resource-group <resource_group> --account-name <account_name> --database-name <database_name> --name Products --partition-key-path "/categoryId" --idx '{"indexingMode": "consistent", "includedPaths": [{"path": "/*"}], "excludedPaths": [{"path": "/\"_etag\"/?"}]}'
    # The CLI has no --gsi-name or --sort-key-path flags. As an approximation, the GSI is modeled as a second container partitioned by /name. It must be kept in sync with Products separately (for example, from the change feed); ordering by price is done with ORDER BY at query time.
    az cosmosdb sql container create --resource-group <resource_group> --account-name <account_name> --database-name <database_name> --name name-price-index --partition-key-path "/name"
- Social media connections:
  - Container: Connections (Partition Key: userId1, Sort Key: userId2)
  - GSI: user1-connectionDate-index (Partition Key: userId1, Sort Key: connectionDate)
  - Use case: Finding all connections for a given user, ordered by the date they were established.
  - Code example (Azure CLI):
    # Create the base container, partitioned by userId1.
    az cosmosdb sql container create --resource-group <resource_group> --account-name <account_name> --database-name <database_name> --name Connections --partition-key-path "/userId1"
    # The CLI has no --gsi-name or --sort-key-path flags. Because this index keeps the base partition key, a composite index on (userId1, connectionDate) in the container's own indexing policy is used as an approximation.
    az cosmosdb sql container update --resource-group <resource_group> --account-name <account_name> --database-name <database_name> --name Connections --idx '{"indexingMode": "consistent", "includedPaths": [{"path": "/*"}], "excludedPaths": [{"path": "/\"_etag\"/?"}], "compositeIndexes": [[{"path": "/userId1", "order": "ascending"}, {"path": "/connectionDate", "order": "ascending"}]]}'
- Order management:
  - Container: Orders (Partition Key: customerId, Sort Key: orderId)
  - GSI: customer-orderDate-index (Partition Key: customerId, Sort Key: orderDate)
  - Use case: Retrieving all orders for a specific customer, sorted by the order date.
  - Code example (Azure CLI):
    # Create the base container, partitioned by customerId.
    az cosmosdb sql container create --resource-group <resource_group> --account-name <account_name> --database-name <database_name> --name Orders --partition-key-path "/customerId"
    # The CLI has no --gsi-name or --sort-key-path flags. Because this index keeps the base partition key, a composite index on (customerId, orderDate) in the container's own indexing policy is used as an approximation.
    az cosmosdb sql container update --resource-group <resource_group> --account-name <account_name> --database-name <database_name> --name Orders --idx '{"indexingMode": "consistent", "includedPaths": [{"path": "/*"}], "excludedPaths": [{"path": "/\"_etag\"/?"}], "compositeIndexes": [[{"path": "/customerId", "order": "ascending"}, {"path": "/orderDate", "order": "ascending"}]]}'
- Game leaderboards:
  - Container: GameScores (Partition Key: gameId, Sort Key: score)
  - GSI: playerId-score-index (Partition Key: playerId, Sort Key: score)
  - Use case: Fetching all scores for a specific player across different games, ordered by score.
  - Code example (Azure CLI):
    # Create the base container, partitioned by gameId.
    az cosmosdb sql container create --resource-group <resource_group> --account-name <account_name> --database-name <database_name> --name GameScores --partition-key-path "/gameId" --idx '{"indexingMode": "consistent", "includedPaths": [{"path": "/*"}], "excludedPaths": [{"path": "/\"_etag\"/?"}]}'
    # The CLI has no --gsi-name or --sort-key-path flags. As an approximation, the GSI is modeled as a second container partitioned by /playerId. It must be kept in sync with GameScores separately (for example, from the change feed); ordering by score is done with ORDER BY at query time.
    az cosmosdb sql container create --resource-group <resource_group> --account-name <account_name> --database-name <database_name> --name playerId-score-index --partition-key-path "/playerId"
Local Secondary Index (LSI)
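- Device events within a time range:
  - Container: DeviceEvents (Partition Key: deviceId, Sort Key: timestamp)
  - LSI: event-type-index (Sort Key: eventType)
  - Use case: Retrieving events for a specific device within a time range, sorted by event type.
  - Code example (Azure CLI):
    # The CLI has no --sort-key-path or --lsi-* flags. As an approximation, the LSI is expressed as a composite index on (eventType, timestamp), set when the container is created.
    az cosmosdb sql container create --resource-group <resource_group> --account-name <account_name> --database-name <database_name> --name DeviceEvents --partition-key-path "/deviceId" --idx '{"indexingMode": "consistent", "includedPaths": [{"path": "/*"}], "excludedPaths": [{"path": "/\"_etag\"/?"}], "compositeIndexes": [[{"path": "/eventType", "order": "ascending"}, {"path": "/timestamp", "order": "ascending"}]]}'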
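- E-commerce order history within a date range:
  - Container: Orders (Partition Key: customerId, Sort Key: orderDate)
  - LSI: orderStatus-index (Sort Key: orderStatus)
  - Use case: Retrieving a customer’s orders, filtered by order status, within a specific date range.
  - Code example (Azure CLI):
    # The CLI has no --sort-key-path or --lsi-* flags. As an approximation, the LSI is expressed as a composite index on (orderStatus, orderDate), set when the container is created.
    az cosmosdb sql container create --resource-group <resource_group> --account-name <account_name> --database-name <database_name> --name Orders --partition-key-path "/customerId" --idx '{"indexingMode": "consistent", "includedPaths": [{"path": "/*"}], "excludedPaths": [{"path": "/\"_etag\"/?"}], "compositeIndexes": [[{"path": "/orderStatus", "order": "ascending"}, {"path": "/orderDate", "order": "ascending"}]]}'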
Choosing the Right Index
- Use GSIs when your queries need to retrieve data from multiple partitions or use a different partition key than the base container.
- Use LSIs when your queries are limited to a single partition but need to use a different sort key.
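The two rules of thumb above, together with the creation-time constraint from the comparison table, can be written as a small decision helper. This is a hedged sketch: the `choose_index` function and its parameter names are hypothetical, and it encodes only the guidance given in this article, not any service behavior.

```python
def choose_index(base_partition_key, query_partition_key, container_exists):
    """Return 'GSI' or 'LSI' following the rules of thumb in this article."""
    if query_partition_key != base_partition_key:
        # The query uses a different partition key or spans partitions.
        return "GSI"
    if container_exists:
        # Per the comparison table, an LSI must be defined at container creation.
        return "GSI"
    # Same partition key, alternate sort key, container not yet created.
    return "LSI"


new_orders_choice = choose_index("customerId", "customerId", container_exists=False)
leaderboard_choice = choose_index("gameId", "playerId", container_exists=False)
existing_orders_choice = choose_index("customerId", "customerId", container_exists=True)
print(new_orders_choice, leaderboard_choice, existing_orders_choice)  # LSI GSI GSI
```

Cost and consistency requirements are deliberately left out of the helper; they would need to be weighed separately for a real design.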
By understanding the differences and use cases of GSIs and LSIs, you can effectively design your Azure Cosmos DB containers and optimize your queries for performance and scalability.