Design and Fabrication of Talking Robot

Authors: Akshay Prakash J S, Sentahmizh Chittu D, Nandhini V, Abhishek N, Mr. G. Chandrasekar

Affiliation: Dhanalakshmi Srinivasan Engineering College (Autonomous), Perambalur, India

Abstract

This paper presents the design and fabrication of a mini talking robot. The project aims to build an artificial-intelligence voice-assistant robot that uses pre-recorded audio for speech, with the goal of keeping the manufacturing cost low while incorporating recent advances. The paper discusses the current work, including the challenge of grounding user utterances in the robot's environment, advances in speech technologies, and the potential of vocal interfaces in consumer robots.

Keywords

Robot

I. INTRODUCTION

A talking robot is a robot capable of producing speech or other vocalizations for communication. Such robots can be programmed to speak in various languages and can be controlled by computer programs or human operators, using natural language processing to understand and respond to human speech. They employ text-to-speech (TTS) technology to convert text into spoken words, enabling natural communication. Applications span customer service, language learning, education, and entertainment, and development is extending into healthcare and other industries that require human-like interaction. Examples include personal assistants such as Amazon's Alexa and Google Home, and advanced humanoid robots such as Hanson Robotics' Sophia.

II. LITERATURE SURVEY

III. EXISTING SYSTEM AND LIMITATIONS

3.1 Existing Model

Current AI voice assistant robots include Amazon's Echo (Alexa), Google Home (Google Assistant), and Apple's HomePod (Siri), designed for voice commands and tasks like playing music and setting reminders. They integrate with smart home devices such as thermostats and lights. Other manufacturers include Sonos, Harman Kardon, and JBL. These assistants are often integrated into smart speakers or other devices, though standalone robots like Jibo and Kuri also exist. The Sophia robot by Hanson Robotics is a humanoid AI robot capable of recognizing and responding to human emotions, used in customer service, education, and entertainment, with expressive features for natural interaction.

3.2 Limitation of Existing Models

Existing voice assistant robots have several limitations, the most significant for this work being their high cost. This project therefore focuses on minimizing the cost of building the robot.

IV. PROPOSED SYSTEM

The proposed project is a talking robot that produces speech through text-to-speech synthesis, pre-recorded audio, or a combination of the two. Talking robots find applications in customer service, language learning, and entertainment. Options for building a budget-friendly talking robot include using a Raspberry Pi as the central unit, connected to a microphone and speaker, together with open-source software such as Python and the Open Speech Platform. Alternatively, a microcontroller such as an Arduino with a pre-built speech recognition module can be used. The cost of materials varies by design, but both options are relatively inexpensive compared to commercial alternatives. The primary focus of this project is to minimize the robot's cost.
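The pre-recorded-audio option mentioned above can be sketched as a simple lookup from recognized keywords to audio clips. This is only an illustrative sketch: the keyword table, the clip filenames, and the `pick_clip` helper are assumptions, not the project's actual implementation.

```python
# Minimal sketch of the pre-recorded-audio approach: map keywords found
# in a recognized utterance to audio clips. The keyword table and clip
# filenames below are illustrative assumptions.

CLIPS = {
    "hello": "clips/greeting.wav",
    "name": "clips/introduce.wav",
    "bye": "clips/goodbye.wav",
}

FALLBACK = "clips/fallback.wav"

def pick_clip(utterance: str) -> str:
    """Return the audio file to play for a recognized utterance."""
    words = utterance.lower().split()
    for keyword, clip in CLIPS.items():
        if keyword in words:
            return clip
    return FALLBACK
```

On a Raspberry Pi, the chosen clip could then be played with a command-line player, e.g. `subprocess.call(["aplay", pick_clip(text)])`.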

V. HARDWARE COMPONENTS

5.1 Raspberry Pi 4

The Raspberry Pi is a series of small single-board computers developed in the UK. Initially intended for teaching computer science, it gained popularity for uses like robotics due to its low cost, modularity, and open design. It is commonly used by computer and electronics hobbyists.

Description of Fig 5.1.1 RASPBERRY PI 4: A photograph of the Raspberry Pi 4 single-board computer, showing its compact size and various ports and connectors.

5.2 Arduino Uno

The Arduino Uno is an open-source microcontroller board based on the ATmega328P microcontroller. It features digital and analog input/output pins for interfacing with various expansion boards and circuits. It is programmable via the Arduino IDE using a USB cable.

Description of Fig 5.2.1 ARDUINO UNO: A photograph of the Arduino Uno microcontroller board, highlighting its digital and analog pins, USB port, and ATmega328P chip.

5.3 L298N Motor Driver

The L298N Motor Driver Module is a high-power module for driving DC and stepper motors. It contains an L298 dual H-bridge motor driver IC and a 78M05 5V regulator, and can control up to 4 DC motors, or 2 DC motors with direction and speed control.
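The H-bridge direction control described above reduces to a small truth table per motor channel. The sketch below captures the standard L298N input logic; the `motor_pins` helper and the command names are illustrative assumptions (the actual GPIO pin numbers depend on the wiring).

```python
# Sketch of the H-bridge control logic for one motor channel of the
# L298N. The (IN1, IN2) levels follow the standard L298N truth table;
# which physical pins they map to depends on the wiring (an assumption).

def motor_pins(direction: str):
    """Return (IN1, IN2) logic levels for the requested direction."""
    table = {
        "forward": (1, 0),
        "reverse": (0, 1),
        "brake":   (1, 1),  # both high: fast stop
        "coast":   (0, 0),  # both low: free-running stop
    }
    return table[direction]
```

Speed control is then added separately by pulsing the channel's enable pin with PWM.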

Description of Fig 5.3.1 L298N Motor Driver: A photograph of the L298N Motor Driver module, a circuit board designed to control DC motors.

5.4 Geared DC Motor

A geared DC motor has a gear assembly attached to increase torque and reduce speed. It features a 3 mm threaded drill hole in the shaft for easy connection to wheels or other mechanical assemblies.

Description of Fig 5.4.1 GEARED MOTOR: A photograph of a geared DC motor, showing the motor unit attached to a gearbox with a yellow wheel.

5.5 Servo Motor

A servo motor is a rotary actuator allowing precise control of angular position, velocity, and acceleration. It includes a motor coupled with a sensor for position feedback, offering good torque, holding power, and fast updates.
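The angular-position control described above is done with PWM: hobby servos like the MG995 are driven by a roughly 50 Hz signal whose pulse width sets the angle. The mapping below uses typical 1.0 ms to 2.0 ms endpoints, which vary per servo; the `angle_to_pulse_ms` helper is an illustrative assumption.

```python
# Hobby servos such as the MG995 are positioned with a ~50 Hz PWM signal
# whose pulse width maps roughly linearly from about 1.0 ms (0 degrees)
# to 2.0 ms (180 degrees). Exact endpoints vary per servo; these are
# typical assumed values.

def angle_to_pulse_ms(angle: float, min_ms: float = 1.0, max_ms: float = 2.0) -> float:
    """Convert a target angle (0-180 degrees) to a pulse width in ms."""
    angle = max(0.0, min(180.0, angle))  # clamp to the servo's range
    return min_ms + (max_ms - min_ms) * angle / 180.0
```

For example, a 90-degree target corresponds to a 1.5 ms pulse with these endpoints.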

Description of Fig 5.5.1 SERVO MOTOR MG995: A photograph of a servo motor, model MG995, commonly used for precise angular control in robotics.

5.6 Ultrasonic Sensor

An ultrasonic sensor is an electronic device that emits ultrasonic sound waves and converts the reflected sound into an electrical signal to determine distance. Ultrasonic waves have frequencies above the range of human hearing. The sensor consists of a transmitter and a receiver.

Description of Fig 5.6.1 Ultrasonic Sensor: A photograph of the HC-SR04 ultrasonic sensor module, showing its two transducers (transmitter and receiver) and circuit board.

5.7 Speaker

Speakers are output devices used to generate sound, connecting to computers or sound systems. They convert electrical audio signals into sound waves.

Description of Fig 5.7.1 SPEAKER: Zebronics Pluto speaker: A photograph of a pair of Zebronics Pluto speakers.

5.8 Mic

A microphone captures audio by converting sound waves into electrical signals. The first electronic microphones used a liquid transmitter mechanism coupled to a diaphragm.

Description of Fig 5.8.1 Mic: A photograph of a microphone with a stand, designed for audio input.

5.9 Battery

This battery pack uses ICR 18650 2500mAh 20C Lithium-Ion Batteries and a BMS circuit, offering small size and weight. It includes charge protection for direct charging via a DC adapter.
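A rough runtime estimate for a pack like this is its capacity divided by the average load current. The sketch below is an assumption-laden back-of-the-envelope calculation, not a measured figure: the 0.8 derating factor standing in for converter losses and the BMS low-voltage cutoff is illustrative.

```python
# Rough runtime estimate for the 18650 pack: capacity (mAh) divided by
# the average load current (mA). Real runtime is lower because of
# converter losses and the BMS low-voltage cutoff; the 0.8 derating
# factor is an assumption, not a measured value.

def runtime_hours(capacity_mah: float, load_ma: float, derating: float = 0.8) -> float:
    """Estimate usable runtime in hours for a given average load."""
    return capacity_mah * derating / load_ma
```

For a 2500 mAh cell under a 1 A average load, this estimates about two hours of operation.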

Description of Fig 5.9.1 BATTERY: A photograph of a Lithium-Ion battery pack, labeled with a warning, indicating its specifications.

VI. WORKING

6.1 Circuit Diagram

Description of FIG 6.1a ROBOT MOTION CONTROLLER: A circuit diagram illustrating the connections between the Arduino Uno, L298N Motor Driver, motors, ultrasonic sensor, and power source for robot motion control.

Description of FIG 6.1b VOICE ASSISTANT: A block diagram showing the components of the voice assistant system, including Raspberry Pi, speaker, microphone, cloud services (Google Assistant, Dialogflow), and TTS engine, connected to the robot's systems.

6.2 Talking System

The talking system is the robot's primary system, powered by a Raspberry Pi 4 on a 12 V DC supply. The Raspberry Pi 4, speaker, and microphone are integrated into the robot's body. Users can ask questions, which the system answers by searching the internet: the captured audio is converted to text, a response is retrieved, and the response is converted back to audio output.

6.2.1 Key Points

6.3 Motion Control

The robot moves autonomously, using an ultrasonic sensor to detect and avoid obstacles and make navigation decisions on its own. This makes it a smart, autonomous robot. The project also covers the use of ultrasonic sensors and the serial monitor.
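The obstacle-avoidance decision described above can be sketched as a simple threshold rule on the ultrasonic reading. The 20 cm threshold, the `next_move` helper, and the command names are illustrative assumptions, not the project's exact firmware logic.

```python
# Sketch of the obstacle-avoidance decision used by the motion
# controller: below a threshold distance the robot turns, otherwise it
# drives forward. The 20 cm threshold and command names are assumptions.

OBSTACLE_CM = 20.0

def next_move(distance_cm: float) -> str:
    """Choose the next motion command from the ultrasonic reading."""
    if distance_cm < OBSTACLE_CM:
        return "turn_right"  # obstacle ahead: steer away
    return "forward"         # path clear: keep driving
```

In the actual robot this decision runs in a loop on the Arduino, with the result driving the L298N motor channels.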

VII. SAMPLE CODE

The following Python code demonstrates a basic AI personal assistant named 'G-One'. It includes functions for wishing the user, taking voice commands, searching Wikipedia, opening web pages (YouTube, Google, Gmail), and retrieving weather information.

import speech_recognition as sr
import pyttsx3
import datetime
import wikipedia
import webbrowser
import os
import time
import subprocess
from ecapture import ecapture as ec
import wolframalpha
import json
import requests

print('Loading your AI personal assistant - G One')
engine = pyttsx3.init('sapi5')  # 'sapi5' is the Windows TTS driver; use 'espeak' on Raspberry Pi OS
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)

def speak(text):
    engine.say(text)
    engine.runAndWait()

def wishMe():
    hour = datetime.datetime.now().hour
    if hour >= 0 and hour < 12:
        speak("Hello, Good Morning")
        print("Hello, Good Morning")
    elif hour >= 12 and hour < 18:
        speak("Hello, Good Afternoon")
        print("Hello, Good Afternoon")
    else:
        speak("Hello, Good Evening")
        print("Hello, Good Evening")

def takeCommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        audio = r.listen(source)

    try:
        statement = r.recognize_google(audio, language='en-in')
        print(f"user said: {statement}\n")

    except Exception as e:
        speak("Pardon me, please say that again")
        return "None"
    return statement

if __name__ == '__main__':
    speak("Loading your AI personal assistant G-One")
    wishMe()
    while True:
        speak("Tell me how can I help you now?")
        statement = takeCommand().lower()
        if statement == "None":
            continue

        if "good bye" in statement or "ok bye" in statement or "stop" in statement:
            speak('Your personal assistant G-One is shutting down, Good bye')
            print('Your personal assistant G-One is shutting down, Good bye')
            break

        if 'wikipedia' in statement:
            speak('Searching Wikipedia...')
            statement = statement.replace("wikipedia", "")
            results = wikipedia.summary(statement, sentences=3)
            speak("According to Wikipedia")
            print(results)
            speak(results)

        elif 'open youtube' in statement:
            webbrowser.open_new_tab("https://www.youtube.com")
            speak("youtube is open now")
            time.sleep(5)

        elif 'open google' in statement:
            webbrowser.open_new_tab("https://www.google.com")
            speak("Google chrome is open now")
            time.sleep(5)

        elif 'open gmail' in statement:
            webbrowser.open_new_tab("https://mail.google.com")
            speak("Google Mail open now")

        elif "weather" in statement:
            api_key = "8ef61edcf1c576d65d836254e11ea420"
            speak("What's the city name")
            city_name = takeCommand()
            base_url = "https://api.openweathermap.org/data/2.5/weather?"
            complete_url = base_url + "appid=" + api_key + "&q=" + city_name
            response = requests.get(complete_url)
            x = response.json()
            if x["cod"] != "404":
                y = x["main"]
                current_temperature = y["temp"]
                current_humidity = y["humidity"]
                z = x["weather"]
                weather_description = z[0]["description"]
                speak(f"Temperature in kelvin unit is {current_temperature} \n humidity in percentage is {current_humidity} \n description {weather_description}")
                print(f"Temperature in kelvin unit = {current_temperature} \n humidity (in percentage) = {current_humidity} \n description = {weather_description}")
            else:
                speak(" City Not Found ")

        elif 'time' in statement:
            strTime = datetime.datetime.now().strftime("%H:%M:%S")
            speak(f"the time is {strTime}")

        elif 'who are you' in statement or 'what can you do' in statement:
            speak('I am G-One version 1 point O, your personal assistant. I am programmed to do minor tasks like opening YouTube, Google Chrome, Gmail and Stack Overflow, predict time, take a photo, search Wikipedia, predict weather in different cities, get top headline news from the Times of India, and you can ask me computational or geographical questions too!')

        elif "who made you" in statement or "who created you" in statement or "who discovered you" in statement:
            speak("I was built by AKSHAY PRAKASH")
            print("I was built by AKSHAY PRAKASH")

        elif "open stackoverflow" in statement:
            webbrowser.open_new_tab("https://stackoverflow.com/login")
            speak("Here is stackoverflow")

        elif 'news' in statement:
            webbrowser.open_new_tab("https://timesofindia.indiatimes.com/home/headlines")
            speak('Here are some headlines from the Times of India, Happy reading')
            time.sleep(6)

        elif "camera" in statement or "take a photo" in statement:
            ec.capture(0, "robo camera", "img.jpg")

        elif 'search' in statement:
            statement = statement.replace("search", "")
            webbrowser.open_new_tab(statement)
            time.sleep(5)

        elif 'ask' in statement:
            speak('I can answer to computational and geographical questions and what question do you want to ask now')
            question = takeCommand()
            app_id = "R2K75H-7ELALHR35X"
            client = wolframalpha.Client(app_id)
            res = client.query(question)
            answer = next(res.results).text
            speak(answer)
            print(answer)

        elif "log off" in statement or "sign out" in statement:
            speak("Ok, your PC will log off in 10 seconds; make sure you exit from all applications")
            subprocess.call(["shutdown", "/l"])
            time.sleep(3)

VIII. FUTURE IMPROVEMENTS

IX. CONCLUSION

Everybody loves talking robots; it is like having a little pal to interact with. If the robot is humanoid, it is more fun than ever: among all robots, a humanoid suddenly becomes a 'he/she' instead of an 'it'. Having a humanoid talking robot feels wonderful, but talking robots are seemingly complicated to make.

To make a robot talk, there are two methods: speech synthesis and pre-recorded audio. Speech synthesis does not perform intelligibly enough on an Arduino, so the pre-recorded audio method was chosen.

Finally, our wish has been fulfilled in the form of our working robot and the identity it gives our project. We would like to thank everyone who believed in us and supported our project.

X. APPENDIX

9.1 Project Modeling Using Solid Works

Description of Fig 9.1.1 Starting Stage: A 3D rendering of the initial stage of the robot's design, showing a basic chassis with wheels and mounting points.

Description of Fig 9.1.2 Front View: A 3D rendering of the robot's front view, detailing its structure and components.

Description of Fig 9.1.3 Back View: A 3D rendering of the robot's back view, showing its construction and internal layout.

9.2 Project Images

Description of Fig 9.2.1 Front view: A photograph of the constructed robot's front view, showing its physical form.

Description of Fig 9.2.2 Side View: A photograph of the constructed robot's side view.

Description of Fig 9.2.3 Top View: A photograph of the constructed robot's top view.


