Author: Admin

  • Medallion Architecture

    The Medallion Architecture is a data lakehouse architecture pattern popularized by Databricks. It’s designed to progressively refine data through a series of layers, ensuring data quality and suitability for various downstream consumption needs. The name “Medallion” refers to the distinct quality levels achieved at each layer, similar to how medals signify different levels of achievement.… Read more

  • Data Lake vs. Data Lakehouse: Understanding Modern Data Architectures

    Organizations today grapple with ever-increasing volumes and varieties of data. To effectively store, manage, and analyze this data, different architectural approaches have emerged. Two prominent concepts in this landscape are the data lake and the data lakehouse. While both aim to provide a centralized data repository, they differ significantly in their design principles and capabilities.… Read more

  • Loading documents into OpenSearch for vector search

    Here’s how you can load documents into OpenSearch for vector search: 1. Create a k-NN Index First, you need to create an index in OpenSearch that is configured for k-Nearest Neighbors (k-NN) search. This involves setting index.knn to true and defining the field that will store your vector embeddings as type knn_vector. You also need… Read more

  • k-NN (k-Nearest Neighbors) search in OpenSearch

    To perform a k-NN (k-Nearest Neighbors) search in OpenSearch after loading your manuals (or any documents) as vector embeddings, you’ll use the knn query within the OpenSearch search API. Here’s how you can do it: Understanding the knn Query The knn query in OpenSearch allows you to find the k most similar vectors to a… Read more

  • Loading manuals into a vector database

    Here’s a breakdown of how to load manuals into a vector database, focusing on the key steps and considerations: 1. Choose a Vector Database: Several vector databases are available, each with its own strengths and weaknesses.1 Some popular options include: Consider factors like scalability, ease of use, cost, integration with your existing stack, and specific… Read more

  • Building a Product Manual Chatbot with Amazon OpenSearch and Open-Source LLMs

    This article guides you through building an intelligent chatbot that can answer questions based on your product manuals, leveraging the power of Amazon OpenSearch for semantic search and open-source Large Language Models (LLMs) for generating informative responses. This approach provides a cost-effective and customizable solution without relying on Amazon Bedrock. The Challenge: Navigating through lengthy… Read more

  • Integrating Documentum with an Amazon Bedrock Chatbot API for Product Manuals

    This article outlines the process of building a product manual chatbot API using Amazon Bedrock, with a specific focus on integrating content sourced from a Documentum repository. By leveraging the power of vector embeddings and Large Language Models (LLMs) within Bedrock, we can create an intelligent and accessible way for users to find information within… Read more

  • Distinguish the use cases for the primary vector database options on AWS:

    Here we try to distinguish the use cases for the primary vector database options on AWS: 1. Amazon OpenSearch Service (with Vector Engine): 2. Amazon Bedrock Knowledge Bases (with underlying vector store choices): 3. Amazon Aurora PostgreSQL/RDS for PostgreSQL (with pgvector): 4. Amazon Neptune Analytics (with Vector Search): 5. Vector Search for Amazon MemoryDB for… Read more

  • Scaling a vector database

    Scaling a vector database is a crucial consideration as your data grows and your query demands increase. Here’s a breakdown of the common strategies and factors involved in scaling vector databases: Why Scaling is Important: Common Scaling Strategies: Techniques for Horizontal Scaling: Factors to Consider When Scaling: Choosing the Right Scaling Strategy: The best scaling… Read more

  • Language Models vs Embedding Models

    In the ever-evolving landscape of Artificial Intelligence, two types of models stand out as fundamental building blocks for a vast array of applications: Language Models (LLMs) and Embedding Models. While both deal with text, their core functions, outputs, and applications differ significantly. Understanding these distinctions is crucial for anyone venturing into the world of natural… Read more