Generative AI – Visual Encyclopedia App

In this tutorial, we will build an app that sends a photo taken by the user to the Google Gemini API and gets back a short, easy-to-understand, encyclopedia-style explanation of what the picture shows. A text-to-speech feature then reads the explanation aloud. The app is intended as a learning aid for students.

You can download the source code (aia file) from the download section at the bottom of the page.

To call the Google Gemini API, we’ll use Google Apps Script (GAS).

An API key created in Google AI Studio can be used with Gemini 2.5 Pro for free within the following limits (figures as given at the time of writing; they may change):

Requests per minute (RPM): 5

Tokens per minute (input) (TPM): 250,000

Requests per day (RPD): 100
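
The limits above translate into a minimum spacing between requests. The sketch below works through that arithmetic; the names LIMITS, minIntervalMs, and canSendNow are hypothetical and not part of the tutorial's code, and the figures are simply the ones quoted above.

```javascript
// Hypothetical helpers, not part of the tutorial's Code.gs.
// The figures are the free-tier limits quoted above; treat them as assumptions.
const LIMITS = { requestsPerMinute: 5, requestsPerDay: 100 };

// Gap between requests that spreads them evenly across one minute.
function minIntervalMs(requestsPerMinute) {
  return Math.ceil(60000 / requestsPerMinute);
}

// Returns true when the daily quota is not used up and enough time
// has passed since the previous request (null means "no request yet").
function canSendNow(nowMs, lastRequestMs, requestsToday, limits) {
  if (requestsToday >= limits.requestsPerDay) return false;
  if (lastRequestMs === null) return true;
  return nowMs - lastRequestMs >= minIntervalMs(limits.requestsPerMinute);
}
```

With 5 requests per minute, minIntervalMs(5) gives 12000, so waiting roughly 12 seconds between photos should keep a single user close to the per-minute limit.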

What You Will Learn in This Tutorial

How to create an API with GAS

How to write a GAS script using Gemini

How to integrate with the Gemini API

Components Used

Button, Label, TextBox, Image, Web, Camera, TextToSpeech, ImageBase64

Blocks Used

Global Variable, if then else, Dictionaries

Setting up Google Apps Script

For details on configuring the Gemini API key, deploying the script as a web application, and using it, please refer to the Generative AI – Doodle Coach App tutorial.

Code.gs

/**
 * Function to handle POST requests from App Inventor
 * @param {Object} e - The event object sent from App Inventor
 * @return {ContentService.TextOutput} - The result from the Gemini API in JSON format
 */
function doPost(e) {
  // Error Handling
  try {
    // 1. Parse the JSON data sent from App Inventor
    const postData = JSON.parse(e.postData.contents);
    const base64Image = postData.image; // Assumes the key is "image" on the App Inventor side

    // 2. Read the API key from Script Properties
    const API_KEY = PropertiesService.getScriptProperties().getProperty('API_KEY');
    if (!API_KEY) {
      return createErrorResponse('API key is not set.');
    }

    // 3. Gemini API Endpoint URL
    const API_URL = `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro:generateContent?key=${API_KEY}`;
  
    // 4. Create the request body to send to the Gemini API
    const payload = {
      "contents": [{
        "parts": [{
          "text": `# Instructions
You are a scholar with encyclopedic knowledge of every field. Carefully analyze the provided image and create an explanatory text following the steps below.

# Execution Steps
1.  **Subject Identification:** Accurately identify the main subject (object, living thing, location, person, etc.) depicted in the image.
2.  **Information Extraction:** Extract the "Name," "Most Important Characteristics," and "Historical Context or Interesting Facts" about the identified subject from your knowledge base.
3.  **Article Composition:** Based on the extracted information, begin writing with the core message you want to convey, composing a natural text that captures the reader's interest.

# Output Constraints
* Character Count: Maximum 200 characters
* Paragraphs: Must be completed in a single paragraph.
* Format: Absolutely no bullet points are allowed.
* Style: The tone should be authoritative yet concise and simple, easily understandable even to those unfamiliar with the field.`
        }, {
          "inline_data": {
            "mime_type": "image/png", // Declares the image as PNG; Camera photos are often JPEG, so "image/jpeg" may be the better match
            "data": base64Image
          }
        }]
      }]
    };

    // 5. Settings for sending the request to the Gemini API
    const options = {
      'method': 'post',
      'contentType': 'application/json',
      'payload': JSON.stringify(payload),
      'muteHttpExceptions': true // Get error response as an object, not an exception
    };

    // 6. Send the request to the API
    const response = UrlFetchApp.fetch(API_URL, options);
    const responseCode = response.getResponseCode();
    const responseBody = response.getContentText();

    if (responseCode !== 200) {
      return createErrorResponse(`API request error: ${responseCode} ${responseBody}`);
    }
    
    // 7. Parse the response from the API
    const resultJson = JSON.parse(responseBody);
    
    // Check the response structure and extract the text part
    const candidate = resultJson.candidates && resultJson.candidates[0];
    const parts = candidate && candidate.content && candidate.content.parts;
    if (!parts || parts.length === 0 || typeof parts[0].text !== 'string') {
      // If the response from Gemini is not in the expected format
      return createErrorResponse(`Unexpected API response format: ${responseBody}`);
    }
    const generatedText = parts[0].text;

    // 8. Create the JSON to return to App Inventor
    const output = JSON.stringify({
      "status": "success",
      "result": generatedText
    });

    // 9. Return the result in JSON format
    return ContentService.createTextOutput(output).setMimeType(ContentService.MimeType.JSON);

  } catch (error) {
    // Catch errors that occur throughout the script
    return createErrorResponse(`Script error: ${error.toString()}`);
  }
}

/**
 * Helper function to generate an error response
 * @param {string} message - The error message
 * @return {ContentService.TextOutput} - A JSON object containing error information
 */
function createErrorResponse(message) {
  const errorOutput = JSON.stringify({
    "status": "error",
    "message": message
  });
  return ContentService.createTextOutput(errorOutput).setMimeType(ContentService.MimeType.JSON);
}
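
The script above always declares the image as PNG, while photos taken with a camera are commonly JPEG. One possible refinement is to guess the MIME type from the first Base64 characters, which encode the file's magic bytes. This is a minimal sketch: stripDataUriPrefix and guessImageMimeType are hypothetical helper names, and whether the app sends a bare Base64 string or a data URI is an assumption, so both cases are handled.

```javascript
// Hypothetical helpers, not part of the original Code.gs.

// Strips an optional "data:<mime>;base64," prefix, in case the client sends a data URI.
function stripDataUriPrefix(base64) {
  const commaIndex = base64.indexOf(',');
  return base64.startsWith('data:') && commaIndex !== -1
    ? base64.slice(commaIndex + 1)
    : base64;
}

// Guesses the MIME type from the leading Base64 characters.
// JPEG files start with bytes FF D8 FF ("/9j/" in Base64);
// PNG files start with bytes 89 50 4E 47 ("iVBORw0KGgo" in Base64).
function guessImageMimeType(base64, fallback) {
  const data = stripDataUriPrefix(base64);
  if (data.startsWith('/9j/')) return 'image/jpeg';
  if (data.startsWith('iVBORw0KGgo')) return 'image/png';
  return fallback; // Unknown format: keep whatever the caller chose
}
```

In doPost, the request body could then use guessImageMimeType(base64Image, 'image/png') for "mime_type" and stripDataUriPrefix(base64Image) for "data", keeping the current value as the fallback.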

App Inventor App

From the [Projects] menu, select [Start new project] and name it “VisualEncyclopedia”.
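
Whichever blocks the app ends up using, its exchange with the web app follows the JSON contract defined in Code.gs: the request carries the Base64 image under the "image" key, and the reply carries either "status": "success" with "result", or "status": "error" with "message". The sketch below expresses that contract in JavaScript for consistency with Code.gs; buildRequestBody and readReply are hypothetical names, and the app itself would do the equivalent with Web and dictionary blocks.

```javascript
// Hypothetical client-side helpers illustrating the JSON contract of Code.gs.

// Builds the POST body; Code.gs reads the Base64 string from the "image" key.
function buildRequestBody(base64Image) {
  return JSON.stringify({ image: base64Image });
}

// Turns the web app's reply into text suitable for display or text-to-speech.
function readReply(replyText) {
  let reply;
  try {
    reply = JSON.parse(replyText);
  } catch (err) {
    return { ok: false, text: 'The reply was not valid JSON.' };
  }
  if (reply && reply.status === 'success') {
    return { ok: true, text: reply.result };
  }
  return { ok: false, text: (reply && reply.message) || 'Unknown error.' };
}
```

Checking "status" before reading "result" keeps an error message from being spoken aloud as if it were an explanation.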

Designer

