Building a Hilariously Insightful Image Recognition Chatbot with Spring AI

While Spring AI’s current spotlight shines on language models, the underlying principles of integration and modularity allow us to construct fascinating applications that extend beyond text. In this article, we’ll embark on a whimsical journey to build an image recognition chatbot powered by a cloud vision API and infused with a healthy dose of humor, courtesy of our very own witty “chat client.”
Core Concepts Revisited:

  • Image Recognition API: The workhorse of our chatbot, a cloud-based service (like Google Cloud Vision AI, Amazon Rekognition, or Azure Computer Vision) capable of analyzing images for object detection, classification, captioning, and more.
  • Spring Integration: We’ll leverage the Spring framework to manage components, handle API interactions, and serve our humorous chatbot.
  • Humorous Response Generation: A dedicated component that takes the raw analysis results and transforms them into witty, sarcastic, or otherwise amusing commentary.
Setting Up Our Spring Boot Project:
As before, let’s start with a new Spring Boot project. Include dependencies for web handling, file uploads (if needed), and the client library for your chosen cloud vision API. For this example, we’ll use the Google Cloud Vision API. Add the following to your pom.xml:

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-tomcat</artifactId>
    </dependency>
    <dependency>
        <groupId>org.apache.tomcat.embed</groupId>
        <artifactId>tomcat-embed-jasper</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-thymeleaf</artifactId>
    </dependency>
    <dependency>
        <groupId>com.google.cloud</groupId>
        <artifactId>google-cloud-vision</artifactId>
        <version>3.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

Integrating with the Google Cloud Vision API:
First, ensure you have a Google Cloud project set up with the Cloud Vision API enabled and have downloaded your service account key JSON file.
Now, let’s create the ImageRecognitionClient to interact with the Google Cloud Vision API:
package com.example.imagechatbot;

import com.google.api.gax.core.FixedCredentialsProvider;
import com.google.auth.oauth2.ServiceAccountCredentials;
import com.google.cloud.vision.v1.*;
import com.google.protobuf.ByteString;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

import javax.annotation.PostConstruct;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

@Service
public class ImageRecognitionClient {

private ImageAnnotatorClient visionClient;

@Value("classpath:${gcp.vision.credentials.path}")
private Resource credentialsResource;

@PostConstruct
public void initializeVisionClient() throws IOException {
    try (InputStream credentialsStream = credentialsResource.getInputStream()) {
        // Load the service account key from the configured classpath resource and
        // build the client with explicit credentials.
        ImageAnnotatorSettings settings = ImageAnnotatorSettings.newBuilder()
                .setCredentialsProvider(FixedCredentialsProvider.create(
                        ServiceAccountCredentials.fromStream(credentialsStream)))
                .build();
        visionClient = ImageAnnotatorClient.create(settings);
    } catch (IOException e) {
        System.err.println("Failed to initialize Vision API client: " + e.getMessage());
        throw e;
    }
}

public ImageAnalysisResult analyze(byte[] imageBytes, List<Feature.Type> features) throws IOException {
    ByteString imgBytes = ByteString.copyFrom(imageBytes);
    Image image = Image.newBuilder().setContent(imgBytes).build();
    List<AnnotateImageRequest> requests = new ArrayList<>();
    List<Feature> featureList = features.stream().map(f -> Feature.newBuilder().setType(f).build()).toList();
    requests.add(AnnotateImageRequest.newBuilder().setImage(image).addAllFeatures(featureList).build());

    BatchAnnotateImagesResponse response = visionClient.batchAnnotateImages(requests);
    return processResponse(response);
}

public ImageAnalysisResult analyze(String imageUrl, List<Feature.Type> features) throws IOException {
    ImageSource imgSource = ImageSource.newBuilder().setImageUri(imageUrl).build();
    Image image = Image.newBuilder().setSource(imgSource).build();
    List<AnnotateImageRequest> requests = new ArrayList<>();
    List<Feature> featureList = features.stream().map(f -> Feature.newBuilder().setType(f).build()).toList();
    requests.add(AnnotateImageRequest.newBuilder().setImage(image).addAllFeatures(featureList).build());

    BatchAnnotateImagesResponse response = visionClient.batchAnnotateImages(requests);
    return processResponse(response);
}

private ImageAnalysisResult processResponse(BatchAnnotateImagesResponse response) {
    ImageAnalysisResult result = new ImageAnalysisResult();
    for (AnnotateImageResponse res : response.getResponsesList()) {
        if (res.hasError()) {
            System.err.println("Error: " + res.getError().getMessage());
            return result; // Return empty result in case of error
        }

        List<DetectedObject> detectedObjects = new ArrayList<>();
        for (LocalizedObjectAnnotation detection : res.getLocalizedObjectAnnotationsList()) {
            detectedObjects.add(new DetectedObject(detection.getName(), detection.getScore()));
        }
        result.setObjectDetections(detectedObjects);

        if (!res.getTextAnnotationsList().isEmpty()) {
            result.setExtractedText(res.getTextAnnotationsList().get(0).getDescription());
        }

        if (res.hasImagePropertiesAnnotation() && !res.getImagePropertiesAnnotation().getDominantColors().getColorsList().isEmpty()) {
            ColorInfo dominantColor = res.getImagePropertiesAnnotation().getDominantColors().getColorsList().get(0);
            result.setDominantColor(String.format("rgb(%d, %d, %d)",
                    (int) (dominantColor.getColor().getRed() * 255),
                    (int) (dominantColor.getColor().getGreen() * 255),
                    (int) (dominantColor.getColor().getBlue() * 255)));
        }

        if (res.hasCropHintsAnnotation() && !res.getCropHintsAnnotation().getCropHintsList().isEmpty()) {
            result.setCropHint(res.getCropHintsAnnotation().getCropHintsList().get(0).getBoundingPoly().getVerticesList().toString());
        }

        if (res.hasSafeSearchAnnotation()) {
            SafeSearchAnnotation safeSearch = res.getSafeSearchAnnotation();
            result.setSafeSearchVerdict(String.format("Adult: %s, Spoof: %s, Medical: %s, Violence: %s, Racy: %s",
                    safeSearch.getAdult().name(), safeSearch.getSpoof().name(), safeSearch.getMedical().name(),
                    safeSearch.getViolence().name(), safeSearch.getRacy().name()));
        }

        if (!res.getLabelAnnotationsList().isEmpty()) {
            List<String> labels = res.getLabelAnnotationsList().stream().map(LabelAnnotation::getDescription).toList();
            result.setLabels(labels);
        }
    }
    return result;
}

}
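
If you want to sanity-check the client before building the web layer, a throwaway CommandLineRunner can print raw labels for a local file. This is a minimal optional sketch, not part of the original walkthrough; the class name, the test-image.jpg path, and the label-only feature list are assumptions you can adjust freely.

package com.example.imagechatbot;

import com.google.cloud.vision.v1.Feature;
import org.springframework.boot.CommandLineRunner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

@Configuration
public class VisionSmokeTest {

    // Runs once at startup: reads a local image (hypothetical path) and prints the raw labels.
    @Bean
    CommandLineRunner visionSmokeTestRunner(ImageRecognitionClient client) {
        return args -> {
            byte[] imageBytes = Files.readAllBytes(Path.of("test-image.jpg"));
            ImageAnalysisResult result = client.analyze(imageBytes, List.of(Feature.Type.LABEL_DETECTION));
            System.out.println("Labels: " + result.getLabels());
        };
    }
}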

package com.example.imagechatbot;

import java.util.List;

public class ImageAnalysisResult {
private List<DetectedObject> objectDetections;
private String extractedText;
private String dominantColor;
private String cropHint;
private String safeSearchVerdict;
private List<String> labels;

// Getters and setters

public List<DetectedObject> getObjectDetections() { return objectDetections; }
public void setObjectDetections(List<DetectedObject> objectDetections) { this.objectDetections = objectDetections; }
public String getExtractedText() { return extractedText; }
public void setExtractedText(String extractedText) { this.extractedText = extractedText; }
public String getDominantColor() { return dominantColor; }
public void setDominantColor(String dominantColor) { this.dominantColor = dominantColor; }
public String getCropHint() { return cropHint; }
public void setCropHint(String cropHint) { this.cropHint = cropHint; }
public String getSafeSearchVerdict() { return safeSearchVerdict; }
public void setSafeSearchVerdict(String safeSearchVerdict) { this.safeSearchVerdict = safeSearchVerdict; }
public List<String> getLabels() { return labels; }
public void setLabels(List<String> labels) { this.labels = labels; }

}

package com.example.imagechatbot;

public class DetectedObject {
private String name;
private float confidence;

public DetectedObject(String name, float confidence) {
    this.name = name;
    this.confidence = confidence;
}

// Getters
public String getName() { return name; }
public float getConfidence() { return confidence; }

}

Remember to configure gcp.vision.credentials.path in your application.properties file so it points to your Google Cloud service account key JSON file; since the @Value annotation uses the classpath: prefix, the path is resolved relative to the classpath (see the Configuration section below).
Crafting the Humorous Chat Client:
Now, let’s implement our HumorousResponseGenerator to add that much-needed comedic flair to the AI’s findings.
package com.example.imagechatbot;

import org.springframework.stereotype.Service;

import java.util.List;

@Service
public class HumorousResponseGenerator {

public String generateHumorousResponse(ImageAnalysisResult result) {
    StringBuilder sb = new StringBuilder();

    if (result.getObjectDetections() != null && !result.getObjectDetections().isEmpty()) {
        sb.append("Alright, buckle up, folks! The AI, after intense digital contemplation, has spotted:\n");
        for (DetectedObject obj : result.getObjectDetections()) {
            sb.append("- A '").append(obj.getName()).append("' (with a ").append(String.format("%.2f", obj.getConfidence() * 100)).append("% certainty). So, you know, maybe.\n");
        }
    } else {
        sb.append("The AI peered into the digital abyss and found... nada. Either the image is a profound statement on the void, or it's just blurry.");
    }

    if (result.getExtractedText() != null) {
        sb.append("\nIt also managed to decipher some ancient runes: '").append(result.getExtractedText()).append("'. The wisdom of the ages, right there.");
    }

    if (result.getDominantColor() != null) {
        sb.append("\nThe artistic highlight? The dominant color is apparently ").append(result.getDominantColor()).append(". Groundbreaking stuff.");
    }

    if (result.getSafeSearchVerdict() != null) {
        sb.append("\nGood news, everyone! According to the AI's highly sensitive sensors: ").append(result.getSafeSearchVerdict()).append(". We're all safe (for now).");
    }

    if (result.getLabels() != null && !result.getLabels().isEmpty()) {
        sb.append("\nAnd finally, the AI's attempt at summarizing the essence of the image: '").append(String.join(", ", result.getLabels())).append("'. Deep, I tell you, deep.");
    }

    return sb.toString();
}

}
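
Because the generator is a plain Spring bean with no external dependencies, you can preview its tone without touching the Vision API at all. The snippet below is purely illustrative; the HumorPreview class and the hard-coded values are invented for demonstration.

package com.example.imagechatbot;

import java.util.List;

public class HumorPreview {

    public static void main(String[] args) {
        // Fabricated analysis data, just to see what the commentary looks like.
        ImageAnalysisResult fake = new ImageAnalysisResult();
        fake.setObjectDetections(List.of(new DetectedObject("Cat", 0.92f)));
        fake.setLabels(List.of("pet", "whiskers", "mild disdain"));

        String commentary = new HumorousResponseGenerator().generateHumorousResponse(fake);
        System.out.println(commentary);
    }
}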

Wiring it All Together in the Controller:
Finally, let’s connect our ImageChatController to use both the ImageRecognitionClient and the HumorousResponseGenerator.
package com.example.imagechatbot;

import com.google.cloud.vision.v1.Feature;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.multipart.MultipartFile;

import java.io.IOException;
import java.util.List;

@Controller
public class ImageChatController {

private final ImageRecognitionClient imageRecognitionClient;
private final HumorousResponseGenerator humorousResponseGenerator;

public ImageChatController(ImageRecognitionClient imageRecognitionClient, HumorousResponseGenerator humorousResponseGenerator) {
    this.imageRecognitionClient = imageRecognitionClient;
    this.humorousResponseGenerator = humorousResponseGenerator;
}

@GetMapping("/")
public String showUploadForm() {
    return "uploadForm";
}

@PostMapping("/analyzeImage")
public String analyzeUploadedImage(@RequestParam("imageFile") MultipartFile imageFile, Model model) throws IOException {
    if (!imageFile.isEmpty()) {
        byte[] imageBytes = imageFile.getBytes();
        ImageAnalysisResult analysisResult = imageRecognitionClient.analyze(imageBytes, List.of(Feature.Type.OBJECT_LOCALIZATION, Feature.Type.TEXT_DETECTION, Feature.Type.IMAGE_PROPERTIES, Feature.Type.SAFE_SEARCH_DETECTION, Feature.Type.LABEL_DETECTION));
        String humorousResponse = humorousResponseGenerator.generateHumorousResponse(analysisResult);
        model.addAttribute("analysisResult", humorousResponse);
    } else {
        model.addAttribute("errorMessage", "Please upload an image.");
    }
    return "analysisResult";
}

@GetMapping("/analyzeImageUrlForm")
public String showImageUrlForm() {
    return "imageUrlForm";
}

@PostMapping("/analyzeImageUrl")
public String analyzeImageFromUrl(@RequestParam("imageUrl") String imageUrl, Model model) throws IOException {
    if (!imageUrl.isEmpty()) {
        ImageAnalysisResult analysisResult = imageRecognitionClient.analyze(imageUrl, List.of(Feature.Type.OBJECT_LOCALIZATION, Feature.Type.TEXT_DETECTION, Feature.Type.IMAGE_PROPERTIES, Feature.Type.SAFE_SEARCH_DETECTION, Feature.Type.LABEL_DETECTION));
        String humorousResponse = humorousResponseGenerator.generateHumorousResponse(analysisResult);
        model.addAttribute("analysisResult", humorousResponse);
    } else {
        model.addAttribute("errorMessage", "Please provide an image URL.");
    }
    return "analysisResult";
}

}

Basic Thymeleaf Templates:
Create the following Thymeleaf templates in your src/main/resources/templates directory:
uploadForm.html:

<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head><title>Upload Image</title></head>
<body>
<h1>Upload an Image for Hilarious Analysis</h1>
<form method="post" action="/analyzeImage" enctype="multipart/form-data">
    <input type="file" name="imageFile" accept="image/*"/>
    <button type="submit">Analyze!</button>
</form>
</body>
</html>

imageUrlForm.html:

<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head><title>Analyze Image via URL</title></head>
<body>
<h1>Provide an Image URL for Witty Interpretation</h1>
<form method="post" action="/analyzeImageUrl">
    <label>Image URL: <input type="text" name="imageUrl"/></label>
    <button type="submit">Analyze!</button>
</form>
</body>
</html>

analysisResult.html:

<!DOCTYPE html>
<html xmlns:th="http://www.thymeleaf.org">
<head><title>Analysis Result</title></head>
<body>
<h1>Image Analysis (with Commentary)</h1>
<pre th:text="${analysisResult}"></pre>
<p th:if="${errorMessage}" th:text="${errorMessage}"></p>
<a href="/">Upload Another Image</a>
<a href="/analyzeImageUrlForm">Analyze Image from URL</a>
</body>
</html>
Configuration:
In your src/main/resources/application.properties, add the classpath location of your Google Cloud service account key file:
gcp.vision.credentials.path=path/to/your/serviceAccountKey.json

Replace path/to/your/serviceAccountKey.json with the location of your key file relative to the classpath (for example, a file you place under src/main/resources), since ImageRecognitionClient resolves the property through the classpath: prefix in its @Value annotation.
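If you expect users to upload large photos, you may also want to raise Spring Boot’s default multipart limits in the same file. This is an optional tweak, and the 10MB values below are arbitrary:
spring.servlet.multipart.max-file-size=10MB
spring.servlet.multipart.max-request-size=10MB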
Conclusion:
While Spring AI’s direct image processing capabilities might evolve, this example vividly demonstrates how you can leverage the framework’s robust features to build an image recognition chatbot with a humorous twist. By cleanly separating the concerns of API interaction (within ImageRecognitionClient) and witty response generation (HumorousResponseGenerator), we’ve crafted a modular and (hopefully) entertaining application. Remember to replace the Google Cloud Vision API integration with your preferred cloud provider’s SDK if needed. Now, go forth and build a chatbot that not only sees but also makes you chuckle!