Tag: Chatbot

  • k-NN (k-Nearest Neighbors) search in OpenSearch

    To perform a k-NN (k-Nearest Neighbors) search in OpenSearch after loading your manuals (or any documents) as vector embeddings, you’ll use the knn query within the OpenSearch _search API. Here’s how you can do it:

    Understanding the knn Query

    The knn query in OpenSearch allows you to find the k most similar vectors to a query vector based on a defined distance metric (like Euclidean distance or cosine similarity).

    Steps to Perform a k-NN Search:

    1. Identify the Vector Field: You need to know the name of the field in your OpenSearch index that contains the vector embeddings of your manual chunks (e.g., "embedding" as used in the previous examples); a mapping sketch for such a field follows these steps.
    2. Construct the Search Query: You’ll create a JSON request to the OpenSearch _search endpoint, using the knn query type.
    3. Specify the Query Vector: Within the knn query, you’ll provide the vector you want to find similar vectors to. This query vector should have the same dimensionality as the vectors in your index. You’ll likely generate this query vector by embedding the user’s search query using the same embedding model you used for your manuals.
    4. Define k: You need to specify the number of nearest neighbors (k) you want OpenSearch to return.
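
    If you have not yet created the index, here is a minimal mapping sketch for the vector field. The details are assumptions for illustration: the field name "embedding", the 768-dimensional output of sentence-transformers/all-mpnet-base-v2, and cosine similarity as the space_type. Adjust all three to match how you actually indexed your manuals.

    Python

    # Hypothetical index creation -- field name, dimension, and space_type are
    # assumptions; align them with the embedding model and mapping you used
    # when loading the manuals.
    index_body = {
        "settings": {"index": {"knn": True}},      # enable approximate k-NN for this index
        "mappings": {
            "properties": {
                "content": {"type": "text"},       # the raw text of each manual chunk
                "embedding": {                     # the vector field referenced by the knn query
                    "type": "knn_vector",
                    "dimension": 768,              # all-mpnet-base-v2 produces 768-dimensional vectors
                    "method": {
                        "name": "hnsw",
                        "space_type": "cosinesimil",
                        "engine": "lucene"
                    }
                }
            }
        }
    }

    os_client.indices.create(index=OPENSEARCH_INDEX_NAME, body=index_body)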

    Example using the OpenSearch Client:

    Assuming you have the OpenSearch Python client initialized (os_client) as in the previous code snippets, here’s how you can perform a k-NN search:

    Python

    def perform_knn_search(index_name, query_vector, k=3):
        """
        Performs a k-NN search on the specified OpenSearch index.
    
        Args:
            index_name (str): The name of the OpenSearch index.
            query_vector (list): The vector to search for nearest neighbors of.
            k (int): The number of nearest neighbors to return.
    
        Returns:
            list: A list of the top k matching documents (hits).
        """
        search_query = {
            "size": k,  # Limit the number of results to k (can be different from k in knn)
            "query": {
                "knn": {
                    "embedding": {  # Replace "embedding" with the actual name of your vector field
                        "vector": query_vector,
                        "k": k
                    }
                }
            }
        }
    
        try:
            response = os_client.search(index=index_name, body=search_query)
            hits = response['hits']['hits']
            return hits
        except Exception as e:
            print(f"Error performing k-NN search: {e}")
            return []
    
    # --- Example Usage ---
    if __name__ == "__main__":
        # Assuming you have a user query
        user_query = "How do I troubleshoot a connection issue?"
    
        # Generate the embedding for the user query using the same model
        from transformers import AutoTokenizer, AutoModel
        embedding_model_name = "sentence-transformers/all-mpnet-base-v2"
        embedding_tokenizer = AutoTokenizer.from_pretrained(embedding_model_name)
        embedding_model = AutoModel.from_pretrained(embedding_model_name)
    
        def get_query_embedding(text, tokenizer, model):
            inputs = tokenizer(text, padding=True, truncation=True, return_tensors="pt")
            outputs = model(**inputs)
            return outputs.last_hidden_state.mean(dim=1).detach().numpy().tolist()[0]
    
        query_embedding = get_query_embedding(user_query, embedding_tokenizer, embedding_model)
    
        # Perform the k-NN search
        search_results = perform_knn_search(OPENSEARCH_INDEX_NAME, query_embedding, k=3)
    
        if search_results:
            print(f"Top {len(search_results)} most relevant manual snippets for query: '{user_query}'")
            for hit in search_results:
                print(f"  Score: {hit['_score']}")
                print(f"  Content: {hit['_source']['content'][:200]}...") # Display first 200 characters
                print("-" * 20)
        else:
            print("No relevant manual snippets found.")
    

    Explanation of the Code:

    1. perform_knn_search Function:
      • Takes the index_name, query_vector, and the desired number of neighbors k as input.
      • Constructs the OpenSearch search query with the knn clause.
      • The vector field within knn specifies the query vector.
      • The k field within knn specifies the number of nearest neighbors to retrieve.
      • The size parameter at the top level of the request body controls the total number of hits returned by the search (it’s good practice to set it to at least k).
      • Executes the search using os_client.search().
      • Returns the hits array from the response, which contains the matching documents.
    2. Example Usage (if __name__ == "__main__":)
      • Defines a sample user_query.
      • Loads the same Sentence Transformer model used for embedding the manuals to generate an embedding for the user_query.
      • Calls the perform_knn_search function with the index name, the generated query embedding, and the desired number of results (k=3).
      • Prints the retrieved search results, including their score and a snippet of the content.

    Key Considerations:

    • Embedding Model Consistency: Ensure that you use the same embedding model to generate the query embeddings as you used to embed your manuals. Inconsistent models will result in poor search results.
    • Vector Field Name: Replace "embedding" in the knn query with the actual name of your vector field in the OpenSearch index.
    • k Value: Experiment with different values of k to find the optimal number of relevant results for your application.
    • Distance Metric (Optional): OpenSearch uses the space_type defined in your index mapping (when you created the knn_vector field) as the distance metric for approximate k-NN searches. If you need a different metric for a particular search, you can run an exact k-NN search via a script_score query and specify the space_type there (though this is less common).
    • Filtering (Optional): You can combine the knn query with other OpenSearch query clauses (like bool, filter, term, etc.) to further refine your search based on metadata (e.g., search within a specific manual or product); a sketch of this pattern follows below.
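
    As a sketch of the filtering idea above, the knn clause can be nested inside a bool query together with metadata filters. The "manual_id" field and its value are hypothetical; use whatever metadata fields exist in your own index.

    Python

    # Hypothetical filtered k-NN search: restrict results to a single manual.
    # "manual_id" is an illustrative metadata field, not part of the earlier examples.
    filtered_query = {
        "size": 3,
        "query": {
            "bool": {
                "must": [
                    {
                        "knn": {
                            "embedding": {          # same vector field as before
                                "vector": query_embedding,
                                "k": 3
                            }
                        }
                    }
                ],
                "filter": [
                    {"term": {"manual_id": "router-x200"}}
                ]
            }
        }
    }

    response = os_client.search(index=OPENSEARCH_INDEX_NAME, body=filtered_query)
    hits = response['hits']['hits']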

    This comprehensive example demonstrates how to perform a k-NN search in OpenSearch using the Python client, which is the core of how your API would retrieve relevant manual snippets based on a user’s question.

  • Building a Hilariously Insightful Image Recognition Chatbot with Spring AI

    Building a Hilariously Insightful Image Recognition Chatbot with Spring (and a Touch of Sass)
    While Spring AI’s current spotlight shines on language models, the underlying principles of integration and modularity allow us to construct fascinating applications that extend beyond text. In this article, we’ll embark on a whimsical journey to build an image recognition chatbot powered by a cloud vision API and infused with a healthy dose of humor, courtesy of our very own witty “chat client.”
    Core Concepts Revisited:

    • Image Recognition API: The workhorse of our chatbot, a cloud-based service (like Google Cloud Vision AI, Amazon Rekognition, or Azure Computer Vision) capable of analyzing images for object detection, classification, captioning, and more.
    • Spring Integration: We’ll leverage the Spring framework to manage components, handle API interactions, and serve our humorous chatbot.
    • Humorous Response Generation: A dedicated component that takes the raw analysis results and transforms them into witty, sarcastic, or otherwise amusing commentary.
    Setting Up Our Spring Boot Project:
    As before, let’s start with a new Spring Boot project. Include dependencies for web handling, file uploads (if needed), and the client library for your chosen cloud vision API. For this example, we’ll use the Google Cloud Vision API. Add the following dependencies to your pom.xml:

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-tomcat</artifactId>
    </dependency>
    <dependency>
        <groupId>org.apache.tomcat.embed</groupId>
        <artifactId>tomcat-embed-jasper</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-thymeleaf</artifactId>
    </dependency>
    <dependency>
        <groupId>com.google.cloud</groupId>
        <artifactId>google-cloud-vision</artifactId>
        <version>3.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>

    Integrating with the Google Cloud Vision API:
    First, ensure you have a Google Cloud project set up with the Cloud Vision API enabled and have downloaded your service account key JSON file.
    Now, let’s create the ImageRecognitionClient to interact with the Google Cloud Vision API:
    package com.example.imagechatbot;

    import com.google.cloud.vision.v1.*;
    import com.google.protobuf.ByteString;
    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.core.io.Resource;
    import org.springframework.stereotype.Service;

    import javax.annotation.PostConstruct;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    @Service
    public class ImageRecognitionClient {

    private ImageAnnotatorClient visionClient;
    
    @Value("classpath:${gcp.vision.credentials.path}")
    private Resource credentialsResource;
    
    @PostConstruct
    public void initializeVisionClient() throws IOException {
        try {
            // Build the Vision client using the service account key resolved from the classpath.
            visionClient = ImageAnnotatorClient.create(
                    ImageAnnotatorSettings.newBuilder()
                            .setCredentialsProvider(() -> com.google.auth.oauth2.ServiceAccountCredentials.fromStream(credentialsResource.getInputStream()))
                            .build()
            );
        } catch (IOException e) {
            System.err.println("Failed to initialize Vision API client: " + e.getMessage());
            throw e;
        }
    }
    
    public ImageAnalysisResult analyze(byte[] imageBytes, List<Feature.Type> features) throws IOException {
        ByteString imgBytes = ByteString.copyFrom(imageBytes);
        Image image = Image.newBuilder().setContent(imgBytes).build();
        List<AnnotateImageRequest> requests = new ArrayList<>();
        List<Feature> featureList = features.stream().map(f -> Feature.newBuilder().setType(f).build()).toList();
        requests.add(AnnotateImageRequest.newBuilder().setImage(image).addAllFeatures(featureList).build());
    
        BatchAnnotateImagesResponse response = visionClient.batchAnnotateImages(requests);
        return processResponse(response);
    }
    
    public ImageAnalysisResult analyze(String imageUrl, List<Feature.Type> features) throws IOException {
        ImageSource imgSource = ImageSource.newBuilder().setImageUri(imageUrl).build();
        Image image = Image.newBuilder().setSource(imgSource).build();
        List<AnnotateImageRequest> requests = new ArrayList<>();
        List<Feature> featureList = features.stream().map(f -> Feature.newBuilder().setType(f).build()).toList();
        requests.add(AnnotateImageRequest.newBuilder().setImage(image).addAllFeatures(featureList).build());
    
        BatchAnnotateImagesResponse response = visionClient.batchAnnotateImages(requests);
        return processResponse(response);
    }
    
    private ImageAnalysisResult processResponse(BatchAnnotateImagesResponse response) {
        ImageAnalysisResult result = new ImageAnalysisResult();
        for (AnnotateImageResponse res : response.getResponsesList()) {
            if (res.hasError()) {
                System.err.println("Error: " + res.getError().getMessage());
                return result; // Return empty result in case of error
            }
    
            List<DetectedObject> detectedObjects = new ArrayList<>();
            for (LocalizedObjectAnnotation detection : res.getLocalizedObjectAnnotationsList()) {
                detectedObjects.add(new DetectedObject(detection.getName(), detection.getScore()));
            }
            result.setObjectDetections(detectedObjects);
    
            if (!res.getTextAnnotationsList().isEmpty()) {
                result.setExtractedText(res.getTextAnnotationsList().get(0).getDescription());
            }
    
            if (res.hasImagePropertiesAnnotation()) {
                ColorInfo dominantColor = res.getImagePropertiesAnnotation().getDominantColors().getColorsList().get(0);
                result.setDominantColor(String.format("rgb(%d, %d, %d)",
                        (int) (dominantColor.getColor().getRed() * 255),
                        (int) (dominantColor.getColor().getGreen() * 255),
                        (int) (dominantColor.getColor().getBlue() * 255)));
            }
    
            if (res.hasCropHintsAnnotation() && !res.getCropHintsAnnotation().getCropHintsList().isEmpty()) {
                result.setCropHint(res.getCropHintsAnnotation().getCropHintsList().get(0).getBoundingPoly().getVerticesList().toString());
            }
    
            if (res.hasSafeSearchAnnotation()) {
                SafeSearchAnnotation safeSearch = res.getSafeSearchAnnotation();
                result.setSafeSearchVerdict(String.format("Adult: %s, Spoof: %s, Medical: %s, Violence: %s, Racy: %s",
                        safeSearch.getAdult().name(), safeSearch.getSpoof().name(), safeSearch.getMedical().name(),
                        safeSearch.getViolence().name(), safeSearch.getRacy().name()));
            }
    
            if (!res.getLabelAnnotationsList().isEmpty()) {
                List<String> labels = res.getLabelAnnotationsList().stream().map(LabelAnnotation::getDescription).toList();
                result.setLabels(labels);
            }
        }
        return result;
    }

    }

    package com.example.imagechatbot;

    import java.util.List;

    public class ImageAnalysisResult {
    private List<DetectedObject> objectDetections;
    private String extractedText;
    private String dominantColor;
    private String cropHint;
    private String safeSearchVerdict;
    private List<String> labels;

    // Getters and setters
    
    public List<DetectedObject> getObjectDetections() { return objectDetections; }
    public void setObjectDetections(List<DetectedObject> objectDetections) { this.objectDetections = objectDetections; }
    public String getExtractedText() { return extractedText; }
    public void setExtractedText(String extractedText) { this.extractedText = extractedText; }
    public String getDominantColor() { return dominantColor; }
    public void setDominantColor(String dominantColor) { this.dominantColor = dominantColor; }
    public String getCropHint() { return cropHint; }
    public void setCropHint(String cropHint) { this.cropHint = cropHint; }
    public String getSafeSearchVerdict() { return safeSearchVerdict; }
    public void setSafeSearchVerdict(String safeSearchVerdict) { this.safeSearchVerdict = safeSearchVerdict; }
    public List<String> getLabels() { return labels; }
    public void setLabels(List<String> labels) { this.labels = labels; }

    }

    package com.example.imagechatbot;

    public class DetectedObject {
    private String name;
    private float confidence;

    public DetectedObject(String name, float confidence) {
        this.name = name;
        this.confidence = confidence;
    }
    
    // Getters
    public String getName() { return name; }
    public float getConfidence() { return confidence; }

    }

    Remember to configure the gcp.vision.credentials.path in your application.properties file to point to your Google Cloud service account key JSON file.
    Crafting the Humorous Chat Client:
    Now, let’s implement our HumorousResponseGenerator to add that much-needed comedic flair to the AI’s findings.
    package com.example.imagechatbot;

    import org.springframework.stereotype.Service;

    import java.util.List;

    @Service
    public class HumorousResponseGenerator {

    public String generateHumorousResponse(ImageAnalysisResult result) {
        StringBuilder sb = new StringBuilder();
    
        if (result.getObjectDetections() != null && !result.getObjectDetections().isEmpty()) {
            sb.append("Alright, buckle up, folks! The AI, after intense digital contemplation, has spotted:\n");
            for (DetectedObject obj : result.getObjectDetections()) {
                sb.append("- A '").append(obj.getName()).append("' (with a ").append(String.format("%.2f", obj.getConfidence() * 100)).append("% certainty). So, you know, maybe.\n");
            }
        } else {
            sb.append("The AI peered into the digital abyss and found... nada. Either the image is a profound statement on the void, or it's just blurry.");
        }
    
        if (result.getExtractedText() != null) {
            sb.append("\nIt also managed to decipher some ancient runes: '").append(result.getExtractedText()).append("'. The wisdom of the ages, right there.");
        }
    
        if (result.getDominantColor() != null) {
            sb.append("\nThe artistic highlight? The dominant color is apparently ").append(result.getDominantColor()).append(". Groundbreaking stuff.");
        }
    
        if (result.getSafeSearchVerdict() != null) {
            sb.append("\nGood news, everyone! According to the AI's highly sensitive sensors: ").append(result.getSafeSearchVerdict()).append(". We're all safe (for now).");
        }
    
        if (result.getLabels() != null && !result.getLabels().isEmpty()) {
            sb.append("\nAnd finally, the AI's attempt at summarizing the essence of the image: '").append(String.join(", ", result.getLabels())).append("'. Deep, I tell you, deep.");
        }
    
        return sb.toString();
    }

    }

    Wiring it All Together in the Controller:
    Finally, let’s connect our ImageChatController to use both the ImageRecognitionClient and the HumorousResponseGenerator.
    package com.example.imagechatbot;

    import com.google.cloud.vision.v1.Feature;
    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.multipart.MultipartFile;

    import java.io.IOException;
    import java.util.List;

    @Controller
    public class ImageChatController {

    private final ImageRecognitionClient imageRecognitionClient;
    private final HumorousResponseGenerator humorousResponseGenerator;
    
    public ImageChatController(ImageRecognitionClient imageRecognitionClient, HumorousResponseGenerator humorousResponseGenerator) {
        this.imageRecognitionClient = imageRecognitionClient;
        this.humorousResponseGenerator = humorousResponseGenerator;
    }
    
    @GetMapping("/")
    public String showUploadForm() {
        return "uploadForm";
    }
    
    @PostMapping("/analyzeImage")
    public String analyzeUploadedImage(@RequestParam("imageFile") MultipartFile imageFile, Model model) throws IOException {
        if (!imageFile.isEmpty()) {
            byte[] imageBytes = imageFile.getBytes();
            ImageAnalysisResult analysisResult = imageRecognitionClient.analyze(imageBytes, List.of(Feature.Type.OBJECT_LOCALIZATION, Feature.Type.TEXT_DETECTION, Feature.Type.IMAGE_PROPERTIES, Feature.Type.SAFE_SEARCH_DETECTION, Feature.Type.LABEL_DETECTION));
            String humorousResponse = humorousResponseGenerator.generateHumorousResponse(analysisResult);
            model.addAttribute("analysisResult", humorousResponse);
        } else {
            model.addAttribute("errorMessage", "Please upload an image.");
        }
        return "analysisResult";
    }
    
    @GetMapping("/analyzeImageUrlForm")
    public String showImageUrlForm() {
        return "imageUrlForm";
    }
    
    @PostMapping("/analyzeImageUrl")
    public String analyzeImageFromUrl(@RequestParam("imageUrl") String imageUrl, Model model) throws IOException {
        if (!imageUrl.isEmpty()) {
            ImageAnalysisResult analysisResult = imageRecognitionClient.analyze(imageUrl, List.of(Feature.Type.OBJECT_LOCALIZATION, Feature.Type.TEXT_DETECTION, Feature.Type.IMAGE_PROPERTIES, Feature.Type.SAFE_SEARCH_DETECTION, Feature.Type.LABEL_DETECTION));
            String humorousResponse = humorousResponseGenerator.generateHumorousResponse(analysisResult);
            model.addAttribute("analysisResult", humorousResponse);
        } else {
            model.addAttribute("errorMessage", "Please provide an image URL.");
        }
        return "analysisResult";
    }

    }

    Basic Thymeleaf Templates:
    Create the following Thymeleaf templates in your src/main/resources/templates directory:
    uploadForm.html:

    <html xmlns:th="http://www.thymeleaf.org">
    <head><title>Upload Image</title></head>
    <body>
        <h1>Upload an Image for Hilarious Analysis</h1>
        <form th:action="@{/analyzeImage}" method="post" enctype="multipart/form-data">
            <input type="file" name="imageFile" accept="image/*"/>
            <button type="submit">Analyze!</button>
        </form>
    </body>
    </html>

    imageUrlForm.html:

    <html xmlns:th="http://www.thymeleaf.org">
    <head><title>Analyze Image via URL</title></head>
    <body>
        <h1>Provide an Image URL for Witty Interpretation</h1>
        <form th:action="@{/analyzeImageUrl}" method="post">
            <label>Image URL: <input type="text" name="imageUrl"/></label>
            <button type="submit">Analyze!</button>
        </form>
    </body>
    </html>

    analysisResult.html:

    <html xmlns:th="http://www.thymeleaf.org">
    <head><title>Analysis Result</title></head>
    <body>
        <h1>Image Analysis (with Commentary)</h1>
        <p th:if="${analysisResult}" th:text="${analysisResult}"></p>
        <p th:if="${errorMessage}" th:text="${errorMessage}"></p>
        <p><a th:href="@{/}">Upload Another Image</a></p>
        <p><a th:href="@{/analyzeImageUrlForm}">Analyze Image from URL</a></p>
    </body>
    </html>

    Configuration:
    In your src/main/resources/application.properties, add the path to your Google Cloud service account key file:
    gcp.vision.credentials.path=path/to/your/serviceAccountKey.json

    Replace path/to/your/serviceAccountKey.json with the actual path to your credentials file. Because the @Value annotation in ImageRecognitionClient prefixes this property with classpath:, the file must be resolvable on the classpath (for example, under src/main/resources).
    Conclusion:
    While Spring AI’s direct image processing capabilities might evolve, this example vividly demonstrates how you can leverage the framework’s robust features to build an image recognition chatbot with a humorous twist. By cleanly separating the concerns of API interaction (within ImageRecognitionClient) and witty response generation (HumorousResponseGenerator), we’ve crafted a modular and (hopefully) entertaining application. Remember to replace the Google Cloud Vision API integration with your preferred cloud provider’s SDK if needed. Now, go forth and build a chatbot that not only sees but also makes you chuckle!

  • Retrieval Augmented Generation (RAG) with LLMs

    Retrieval Augmented Generation (RAG) is a technique that enhances the capabilities of Large Language Models (LLMs) by enabling them to access and incorporate information from external sources during the response generation process. This approach addresses some of the inherent limitations of LLMs, such as their inability to access up-to-date information or domain-specific knowledge.

    How RAG Works

    The RAG process involves the following key steps (a minimal code sketch follows the list):

    1. Retrieval:
      • The user provides a query or prompt.
      • The RAG system uses a retrieval mechanism (e.g., semantic search, vector search) to fetch relevant information or documents from an external knowledge base.
      • This knowledge base can consist of various sources, including documents, databases, web pages, and APIs.
    2. Augmentation:
      • The retrieved information is combined with the original user query.
      • This augmented prompt provides the LLM with additional context and relevant information.
    3. Generation:
      • The LLM uses the augmented prompt to generate a more informed and accurate response.
      • By grounding the response in external knowledge, RAG helps to reduce hallucinations and improve factual accuracy.
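
    To make the three steps concrete, here is a minimal sketch of the loop. Every helper in it (embed_query, search_knowledge_base, generate_answer) is a hypothetical placeholder standing in for your own embedding model, retrieval backend, and LLM client.

    Python

    def answer_with_rag(user_query, k=3):
        # 1. Retrieval: embed the query and fetch the k most relevant documents.
        # embed_query() and search_knowledge_base() are placeholders for your own
        # embedding model and vector/semantic search backend.
        query_vector = embed_query(user_query)
        documents = search_knowledge_base(query_vector, top_k=k)

        # 2. Augmentation: combine the retrieved documents with the original query.
        augmented_prompt = (
            "Answer the question using only the context below.\n\n"
            "Context:\n" + "\n---\n".join(doc["content"] for doc in documents) +
            f"\n\nQuestion: {user_query}\nAnswer:"
        )

        # 3. Generation: pass the augmented prompt to the LLM.
        # generate_answer() stands in for whichever LLM client you use.
        return generate_answer(augmented_prompt)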

    Benefits of RAG

    • Improved Accuracy and Factuality: RAG reduces the risk of LLM hallucinations by grounding responses in reliable external sources.
    • Access to Up-to-Date Information: RAG enables LLMs to provide responses based on the latest information, overcoming the limitations of their static training data.
    • Domain-Specific Knowledge: RAG allows LLMs to access and utilize domain-specific knowledge, making them more effective for specialized applications.
    • Increased Transparency and Explainability: RAG systems can provide references to the retrieved sources, allowing users to verify the information and understand the basis for the LLM’s response.
    • Reduced Need for Retraining: RAG eliminates the need to retrain LLMs every time new information becomes available.

    RAG vs. Fine-tuning

    RAG and fine-tuning are two techniques for adapting LLMs to specific tasks or domains.

    • RAG: Retrieves relevant information at query time to augment the LLM’s input.
    • Fine-tuning: Updates the LLM’s parameters by training it on a specific dataset.

    RAG is generally preferred when:

    • The knowledge base is frequently updated.
    • The application requires access to a wide range of information sources.
    • Transparency and explainability are important.
    • A cost-effective and fast way to introduce new data to the LLM is needed.

    Fine-tuning is more suitable when:

    • The LLM needs to learn a specific style or format.
    • The application requires improved performance on a narrow domain.
    • The knowledge is static and well-defined.

    Applications of RAG

    RAG can be applied to various applications, including:

    • Question Answering: Providing accurate and contextually relevant answers to user questions.
    • Chatbots: Enhancing responses with information from knowledge bases or documentation.
    • Content Generation: Generating more informed and engaging content for articles, blog posts, and marketing materials.
    • Summarization: Summarizing lengthy documents or articles by incorporating relevant information from external sources.
    • Search: Improving search results by providing more contextually relevant and comprehensive information.

    Challenges and Considerations

    • Retrieval Quality: The effectiveness of RAG depends on the quality of the retrieved information. Inaccurate or irrelevant information can negatively impact the LLM’s response.
    • Scalability: RAG systems need to be scalable to handle large knowledge bases and high query volumes.
    • Latency: The retrieval process can add latency to the response generation process.
    • Data Management: Keeping the external knowledge base up-to-date and accurate is crucial for maintaining the effectiveness of RAG.

    Conclusion

    RAG is a promising technique that enhances LLMs’ capabilities by enabling them to access and incorporate information from external sources. By grounding LLM responses in reliable knowledge, RAG improves accuracy, reduces hallucinations, and expands the range of applications for LLMs. As LLMs continue to evolve, RAG is likely to play an increasingly important role in building more effective, reliable, and trustworthy systems.