DynamoDB Index Comparison: GSI vs. LSI


DynamoDB offers two types of secondary indexes to enhance query flexibility: Global Secondary Indexes (GSIs) and Local Secondary Indexes (LSIs). Here’s a detailed comparison:

Key Differences

| Feature | Global Secondary Index (GSI) | Local Secondary Index (LSI) |
| --- | --- | --- |
| Partition and Sort Keys | Can have a different partition key and sort key than the base table. | Has the same partition key as the base table but a different sort key. |
| Table Creation | Can be created at any time (table creation or later). | Must be created when the table is created. |
| Read/Write Capacity | Has its own provisioned read/write capacity. | Shares the base table’s read/write capacity. |
| Data Location | Stored in a different partition than the base table. | Stored in the same partition as the base table. |
| Read Consistency | Eventually consistent reads only. | Supports strongly consistent reads. |
| Size Limitations | No size limitations beyond the table limits. | Limited to 10 GB per base table partition key value. |
| Number of Indexes | Up to 20 per table (soft limit). | Up to 5 per table. |
| Flexibility | More flexible. | Less flexible. |
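
To make the differences above concrete, here is a minimal sketch of a table definition that declares one LSI and one GSI together. The Orders table and OrderStatusIndex follow the naming used in the use cases in this article; the ProductID attribute, the ProductIDOrderDateIndex name, the string attribute types, and the capacity values are assumptions chosen only for illustration.

```python
def build_orders_table_definition():
    """Return keyword arguments for a DynamoDB CreateTable request.

    Names, attribute types, and capacity numbers are illustrative assumptions.
    """
    return {
        "TableName": "Orders",
        # Every attribute used in any key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": "CustomerID", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
            {"AttributeName": "OrderStatus", "AttributeType": "S"},
            {"AttributeName": "ProductID", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "CustomerID", "KeyType": "HASH"},
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},
        ],
        # LSI: same partition key as the table, different sort key,
        # and no capacity settings of its own.
        "LocalSecondaryIndexes": [
            {
                "IndexName": "OrderStatusIndex",
                "KeySchema": [
                    {"AttributeName": "CustomerID", "KeyType": "HASH"},
                    {"AttributeName": "OrderStatus", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        # GSI: its own partition key and its own provisioned capacity.
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "ProductIDOrderDateIndex",
                "KeySchema": [
                    {"AttributeName": "ProductID", "KeyType": "HASH"},
                    {"AttributeName": "OrderDate", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "KEYS_ONLY"},
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": 5,
                    "WriteCapacityUnits": 5,
                },
            }
        ],
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
    }
```

With boto3 installed and credentials configured, the dictionary could be passed along as boto3.client('dynamodb').create_table(**build_orders_table_definition()). Keeping the definition in a plain function makes it easy to inspect before any table is created.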

Benefits of GSIs

  • Flexible Querying: GSIs allow you to query the table using attributes other than the primary key. This enables diverse query patterns.
  • Scalability: GSIs have their own read/write capacity, allowing you to scale read/write operations on the index independently of the base table.
  • Performance: GSIs can significantly improve query performance for non-key attributes, because a lookup on those attributes can use a Query against the index instead of a Scan of the whole table.
  • Adding indexes to existing tables: You can add GSIs to an existing table.
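
As a rough sketch of the last point, a GSI can be added to an existing table through an UpdateTable request. The helper name build_add_gsi_request is hypothetical, and the string key types and capacity values are assumptions; the ProvisionedThroughput entry applies to tables in provisioned mode.

```python
def build_add_gsi_request(table_name, index_name, partition_key, sort_key):
    """Return keyword arguments for an UpdateTable request that creates a GSI.

    Hypothetical helper: assumes both key attributes are strings ("S").
    """
    return {
        "TableName": table_name,
        # Only the attributes used by the new index keys need to be declared.
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},
            {"AttributeName": sort_key, "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [
                        {"AttributeName": partition_key, "KeyType": "HASH"},
                        {"AttributeName": sort_key, "KeyType": "RANGE"},
                    ],
                    "Projection": {"ProjectionType": "ALL"},
                    # Assumed capacity values for a provisioned-mode table.
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": 5,
                        "WriteCapacityUnits": 5,
                    },
                }
            }
        ],
    }


request = build_add_gsi_request(
    "Orders", "CustomerIDOrderDateIndex", "CustomerID", "OrderDate"
)
```

The request could then be sent with boto3.client('dynamodb').update_table(**request). There is no equivalent request for LSIs, since they can only be defined when the table is created.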

Benefits of LSIs

  • Strongly Consistent Reads: LSIs support strongly consistent reads, ensuring you get the most up-to-date data.
  • Cost-Effective for Specific Use Cases: LSIs can be more cost-effective than GSIs if your query patterns fit their constraints (same partition key as the base table, 10 GB limit per partition key value).
  • Read Performance: LSIs can offer very fast reads for queries that use the table’s partition key and an alternate sort key.
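
A strongly consistent read against an LSI is requested with the ConsistentRead parameter of a Query. The sketch below builds such a request for the low-level client, assuming the GameScores table and LevelIndex LSI used later in this article; the helper name build_lsi_query is hypothetical.

```python
def build_lsi_query(user_id, consistent=True):
    """Return keyword arguments for a Query against the LevelIndex LSI.

    Hypothetical helper: table and index names follow the GameScores example.
    """
    return {
        "TableName": "GameScores",
        "IndexName": "LevelIndex",
        # A placeholder name avoids any clash with DynamoDB reserved words.
        "KeyConditionExpression": "#uid = :uid",
        "ExpressionAttributeNames": {"#uid": "UserID"},
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
        # Strongly consistent reads are supported on LSIs but not on GSIs.
        "ConsistentRead": consistent,
    }


query = build_lsi_query("user789")
```

The dictionary could be passed to boto3.client('dynamodb').query(**query). Setting ConsistentRead to True on a query that targets a GSI is rejected by DynamoDB, which is the practical meaning of the read consistency row in the comparison.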

Real-World Use Cases

Global Secondary Index (GSI)

  • E-commerce Product Catalog:
    • Table: Products (Partition Key: ProductID, Sort Key: SKU)
    • GSI: CategoryBrandIndex (Partition Key: Category, Sort Key: Brand)
    • Use Case: Querying products by category and brand, allowing users to find products like “Electronics” and “Sony.”
    • Code Example (Python – boto3):
    
    import boto3
    from boto3.dynamodb.conditions import Key
    
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('Products')
    
    # Querying using the GSI
    response = table.query(
        IndexName='CategoryBrandIndex',
        KeyConditionExpression=Key('Category').eq('Electronics') & Key('Brand').eq('Sony')
    )
    
    items = response['Items']
    for item in items:
        print(item)
                        
  • Social Media Connections:
    • Table: Connections (Partition Key: UserID1, Sort Key: UserID2)
    • GSI: User1ConnectionDateIndex (Partition Key: UserID1, Sort Key: ConnectionDate)
    • Use Case: Finding all connections for a given user, ordered by the date they were established.
    • Code Example (Python – boto3):
    
    import boto3
    from boto3.dynamodb.conditions import Key
    
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('Connections')
    
    user_id = "user123"
    response = table.query(
        IndexName='User1ConnectionDateIndex',
        KeyConditionExpression=Key('UserID1').eq(user_id),
        ScanIndexForward=False  # Sort by ConnectionDate descending
    )
    
    connections = response['Items']
    for connection in connections:
        print(connection)
                        
  • Order Management:
    • Table: Orders (Primary Key: OrderID)
    • GSI: CustomerIDOrderDateIndex (Partition Key: CustomerID, Sort Key: OrderDate)
    • Use Case: Retrieving all orders for a specific customer, sorted by the order date.
    • Code Example (Python – boto3):
    
    import boto3
    from boto3.dynamodb.conditions import Key
    
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('Orders')
    
    customer_id = "cust101"
    response = table.query(
        IndexName='CustomerIDOrderDateIndex',
        KeyConditionExpression=Key('CustomerID').eq(customer_id),
        ScanIndexForward=False
    )
    orders = response['Items']
    for order in orders:
        print(order)
                        
  • Game Leaderboards:
    • Table: GameScores (Partition Key: GameID, Sort Key: Score)
    • GSI: PlayerIDScoreIndex (Partition Key: PlayerID, Sort Key: Score)
    • Use Case: Fetching the leaderboard for a specific player, showing their scores across different games.
    • Code Example (Python – boto3):
    
    import boto3
    from boto3.dynamodb.conditions import Key
    
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('GameScores')
    
    player_id = "player456"
    response = table.query(
        IndexName='PlayerIDScoreIndex',
        KeyConditionExpression=Key('PlayerID').eq(player_id),
        ScanIndexForward=False  # Sort by Score descending
    )
    
    scores = response['Items']
    for score in scores:
        print(score)
                        
Local Secondary Index (LSI)

  • Game Scores within a User:
    • Table: GameScores (Partition Key: UserID, Sort Key: Timestamp)
    • LSI: LevelIndex (Sort Key: Level)
    • Use Case: Retrieving a user’s game scores, sorted by level. All scores for a single user are stored in the same partition, making an LSI suitable.
    • Code Example (Python – boto3):
    
    import boto3
    from boto3.dynamodb.conditions import Key
    
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('GameScores')
    
    user_id = "user789"
    response = table.query(
        KeyConditionExpression=Key('UserID').eq(user_id),  # Query by partition key
        IndexName='LevelIndex',  # Use the LSI
        ScanIndexForward=True  # Sort by Level ascending
    )
    scores = response['Items']
    for score in scores:
        print(score)
                          
  • Event Logs for a Device:
    • Table: DeviceEvents (Partition Key: DeviceID, Sort Key: Timestamp)
    • LSI: EventTypeIndex (Sort Key: EventType)
    • Use Case: Fetching events for a specific device, sorted by event type.
    • Code Example (Python – boto3):
    
    import boto3
    from boto3.dynamodb.conditions import Key
    
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('DeviceEvents')
    
    device_id = "device001"
    response = table.query(
        IndexName='EventTypeIndex',  # Use the LSI
        # A single key condition on the partition key; results come back ordered by EventType
        KeyConditionExpression=Key('DeviceID').eq(device_id)
    )
    events = response['Items']
    for event in events:
        print(event)
                        
  • E-commerce Order History within a Date Range:
    • Table: Orders (Partition Key: CustomerID, Sort Key: OrderDate)
    • LSI: OrderStatusIndex (Sort Key: OrderStatus)
    • Use Case: Retrieving a customer’s orders, filtered by order status, within a specific date range.
    • Code Example (Python – boto3):
    
    import boto3
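    from boto3.dynamodb.conditions import Key, Attr
    
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('Orders')
    
    customer_id = "customer555"
    response = table.query(
        IndexName='OrderStatusIndex',  # Use the LSI
        # Key condition: partition key plus the LSI sort key (example status value)
        KeyConditionExpression=Key('CustomerID').eq(customer_id) & Key('OrderStatus').eq('SHIPPED'),
        # OrderDate is not a key of this index, so the date range is applied as a filter
        # (example dates; assumes OrderDate is stored as an ISO 8601 string)
        FilterExpression=Attr('OrderDate').between('2024-01-01', '2024-12-31')
    )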
    orders = response['Items']
    for order in orders:
        print(order)
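
The query examples above each read a single response. A Query call returns at most 1 MB of data, and when more results remain the response includes a LastEvaluatedKey. A minimal sketch of collecting every page follows; the helper name query_all_pages is hypothetical, and table is assumed to be a boto3 Table resource or any object with a compatible query method.

```python
def query_all_pages(table, **query_kwargs):
    """Run a Query repeatedly until no LastEvaluatedKey is returned.

    Hypothetical helper: `table` is assumed to expose query(**kwargs)
    in the same way as a boto3 Table resource.
    """
    items = []
    kwargs = dict(query_kwargs)  # Copy so the caller's arguments stay untouched
    while True:
        response = table.query(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Resume the next page where the previous one stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

Any of the calls above could be routed through it, for example query_all_pages(table, IndexName='OrderStatusIndex', KeyConditionExpression=Key('CustomerID').eq(customer_id)). For large result sets it may be preferable to process each page as it arrives rather than hold every item in memory.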
                        
By understanding the differences between GSIs and LSIs and their use cases, you can design your DynamoDB tables effectively and optimize your queries for performance and scalability.
