Design and Fabrication of Talking Robot
Authors: Akshay Prakash J S, Sentahmizh Chittu D, Nandhini V, Abhishek N, Mr. G. Chandrasekar
Affiliation: Dhanalakshmi Srinivasan Engineering College (Autonomous), Perambalur, India
Abstract
This paper presents the design and fabrication of a mini talking robot: an artificial-intelligence voice assistant that speaks using pre-recorded audio. The goal is a robot with a low manufacturing cost that still incorporates recent advances in speech technology. The paper discusses the current work, focusing on the challenge of grounding user utterances in the robot's environment, leveraging advancements in speech technologies, and the potential for vocal interfaces in consumer robots.
Keywords
Robot
I. INTRODUCTION
A talking robot is a type of robot capable of producing speech or other vocalizations for communication. These robots can be programmed to speak in various languages and are used in applications like customer service, language learning, and entertainment. They can be controlled by computer programs or human operators, utilizing natural language processing to understand and respond to human speech. Talking robots employ text-to-speech (TTS) technology to convert text into spoken words, enabling natural communication. Applications span customer service, education, and entertainment, with development extending to healthcare and other industries requiring human-like interaction. Examples include personal assistants like Amazon's Alexa and Google Home, and advanced robots like Hanson Robotics' Sophia.
II. LITERATURE SURVEY
- Talking with a robot in ENGLISH (March 2005) L Stephen Coles. Review: Future plans for a research project on natural language communication with an intelligent automaton.
- Building a talking baby robot, a contribution to the study of speech acquisition and evolution (November 2007) J. Serkhane, J.L. Schwartz, P. Bessière ICP, Grenoble Laplace-SHARP, Gravir, Grenoble. Review: Provides a natural computational modelling framework for cognitive robotics and speech robotics, based on embodiment, multimodality, development, and interaction.
- A Talking Robot and the Expressive Speech Communication with Human (December 2014) Hideyuki SAWADA. Review: Introduces a talking robot with mechanically constructed human-like vocal cords and vocal tract, with specified pitches and phonemes.
- Smart talking robot Xiaotu: participatory library service based on artificial intelligence (April 2015) Fei Yao, Chengyu Zhang and Wu Chen. Review: Discusses a project integrating mobile and social networking environments with computer technologies for a user-centred participatory service, featuring a modularized architecture for sharing.
- Speech Planning of an Anthropomorphic Talking Robot for Consonant Sounds Production (May 2002) Kazufumi Nishikawa, Akihiro Imai, Takayuki Ogawara, Hideaki Takanobu, Takemi Mochida, Atsuo Takanishi. Review: Describes the development of the WT-1R robot, improved for natural vowel and consonant sound production, proposing speech planning for complex consonant sounds.
- Multilingual WikiTalk: Wikipedia-based talking robots that switch languages (August 2013) Graham Wilcock and Kristiina Jokinen. Review: Presents a talking robot demonstrating Wikipedia-based spoken information access in English, and a new demo showing multilingual capabilities and language switching.
- How Computers (Should) Talk to Humans (April 2006) Robert Porzel. Review: Focuses on dialogue systems research in human-computer interaction, highlighting improvements in natural language generation and synthesis, and the design of computers as interlocutors.
- Artificial Intelligence-based Voice Assistant (October 2020) S Subhash, Prajwal N Srivatsa, S Siddesh, A Ullas, B Santhosh. Review: Discusses voice control as a growing feature, with AI-based voice assistants used in smartphones and laptops for recognizing human voice and responding via integrated voices.
- Voice Assistant – A Review (March 2021) Shabdali Suresh Shetty. Review: Covers a voice-activated personal assistant developed using Python, performing tasks like weather updates, music streaming, and Wikipedia browsing, with current systems limited by network connectivity.
- Alexa-Based Voice Assistant for Smart Home Applications (July 2021) Asuncion Santamaria, Guillermo del Campo, Edgar Saavedra and Clara Jimenez. Review: Highlights the role of the Internet of Things (IoT) in creating smart environments and the growth of IoT devices for smart buildings and cities.
III. EXISTING SYSTEM AND LIMITATIONS
3.1 Existing Model
Current AI voice assistant robots include Amazon's Echo (Alexa), Google Home (Google Assistant), and Apple's HomePod (Siri), designed for voice commands and tasks like playing music and setting reminders. They integrate with smart home devices such as thermostats and lights. Other manufacturers include Sonos, Harman Kardon, and JBL. These assistants are often integrated into smart speakers or other devices. Standalone AI voice assistant robots like Jibo and Kuri also exist. The Sophia robot by Hanson Robotics is a humanoid AI robot capable of recognizing and responding to human emotions, used in customer service, education, and entertainment, featuring expressive features for natural interaction.
3.2 Limitation of Existing Models
Existing voice assistant robots have several limitations:
- Limited understanding: Difficulty with certain accents, dialects, or specific language nuances.
- Limited knowledge: May not have access to all human knowledge, limiting answer accuracy.
- Privacy concerns: Issues regarding the security and privacy of collected personal data.
- Limited tasks: Inability to perform all human tasks like cooking or cleaning.
- Internet connectivity: Dependency on stable internet connection for functionality.
- Dependency: Potential decrease in human memory, focus, or simple task performance due to overreliance.
- Cost: Advanced voice assistant robots can be expensive and less accessible.
The project focuses on minimizing the building cost of the robot.
IV. PROPOSED SYSTEM
The proposed project is a talking robot capable of producing speech or spoken words through text-to-speech synthesis, pre-recorded speech, or a combination. Talking robots find applications in customer service, language learning, and entertainment. Options for building a budget-friendly talking robot include using a Raspberry Pi as the central unit, connected to a microphone and speaker, and utilizing open-source software like Python and the Open Speech Platform. Alternatively, a microcontroller like Arduino with a pre-built speech recognition module can be used. The cost of materials varies by design, but both options are relatively inexpensive compared to commercial alternatives. The primary focus of this project is to minimize robot expense.
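Because the pre-recorded-speech option is central to the proposed low-cost design, the playback side can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the keyword-to-file mapping and the `audio/` file paths are hypothetical, and playback assumes the `aplay` utility that ships with ALSA on Raspberry Pi OS (any available player could be substituted).

```python
import subprocess

# Hypothetical mapping from keywords in the user's utterance
# to pre-recorded audio clips stored on the SD card.
PHRASES = {
    "hello": "audio/hello.wav",
    "name": "audio/my_name.wav",
    "bye": "audio/goodbye.wav",
}

def clip_for(utterance):
    """Return the clip matching the first keyword found, else None."""
    for keyword, path in PHRASES.items():
        if keyword in utterance.lower():
            return path
    return None

def play(path):
    # On Raspberry Pi OS, aplay is available via ALSA.
    subprocess.run(["aplay", path], check=True)
```

A call such as `clip_for("Hello robot")` selects `audio/hello.wav`, which `play()` then sends to the speaker.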
V. HARDWARE COMPONENTS
5.1 Raspberry Pi 4
The Raspberry Pi is a series of small single-board computers developed in the UK. Initially intended for teaching computer science, it gained popularity for uses like robotics due to its low cost, modularity, and open design. It is commonly used by computer and electronics hobbyists.
Description of Fig 5.1.1 RASPBERRY PI 4: A photograph of the Raspberry Pi 4 single-board computer, showing its compact size and various ports and connectors.
- SoC: Broadcom BCM2711 quad-core Cortex-A72 (ARMv8-A) 64-bit @ 1.5 GHz
- Networking: 2.4 GHz and 5 GHz 802.11b/g/n/ac wireless LAN
- RAM: 1GB, 2GB, or 4GB LPDDR4 SDRAM
- Bluetooth: Bluetooth 5.0, Bluetooth Low Energy (BLE)
- GPIO: 40-pin GPIO header, populated
- Storage: microSD
- Dimensions: 88 mm × 58 mm × 19.5 mm, 46 g
5.2 Arduino Uno
The Arduino Uno is an open-source microcontroller board based on the ATmega328P microcontroller. It features digital and analog input/output pins for interfacing with various expansion boards and circuits. It is programmable via the Arduino IDE using a USB cable.
Description of Fig 5.2.1 ARDUINO UNO: A photograph of the Arduino Uno microcontroller board, highlighting its digital and analog pins, USB port, and ATmega328P chip.
- Microcontroller: Microchip ATmega328P
- Operating Voltage: 5 Volts
- Input Voltage: 7 to 20 Volts
- Digital I/O Pins: 14
5.3 L298N Motor Driver
The L298N Motor Driver Module is a high-power module for driving DC and Stepper Motors. It contains an L298 motor driver IC and a 78M05 5V regulator, capable of controlling up to 4 DC motors or 2 DC motors with directional and speed control using an H-Bridge.
Description of Fig 5.3.1 L298N Motor Driver: A photograph of the L298N Motor Driver module, a circuit board designed to control DC motors.
- Logical voltage: 5V
- Drive voltage: 5V-35V
- Logical current: 0–36 mA
- Drive current: 2A (max. single bridge)
- Max power: 25W
- Motor Supply Voltage (Maximum): 46V
- Motor Supply Current (Maximum): 2A
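The H-bridge control described above reduces to a small truth table per motor channel: the two logic inputs (IN1, IN2) select forward, reverse, brake, or coast. The sketch below is purely illustrative logic; on the actual robot these pin states would be driven through Arduino `digitalWrite` calls or RPi.GPIO rather than returned from a function.

```python
# Illustrative truth table for one H-bridge channel of the L298N.
# (IN1, IN2) logic levels select the motor behaviour.
DIRECTIONS = {
    "forward": (1, 0),
    "reverse": (0, 1),
    "brake":   (1, 1),  # both inputs high: fast motor brake
    "coast":   (0, 0),  # both inputs low: motor free-runs
}

def h_bridge_inputs(direction):
    """Return the (IN1, IN2) levels for a requested direction."""
    try:
        return DIRECTIONS[direction]
    except KeyError:
        raise ValueError(f"unknown direction: {direction}")
```

Speed control is layered on top by applying a PWM signal to the channel's enable pin while IN1/IN2 hold the direction.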
5.4 Geared DC Motor
A geared DC motor has a gear assembly attached to increase torque and reduce speed. It features a 3 mm threaded drill hole in the shaft for easy connection to wheels or other mechanical assemblies.
Description of Fig 5.4.1 GEARED MOTOR: A photograph of a geared DC motor, showing the motor unit attached to a gearbox with a yellow wheel.
- Torque: 2 kg-cm
- Operating Voltage: 12V DC
- Gearbox: Attached metal (spur) gearbox
- Shaft diameter: 6 mm with internal hole
- No-load current: 60 mA (Max)
- Load current: 300 mA (Max)
5.5 Servo Motor
A servo motor is a rotary actuator allowing precise control of angular position, velocity, and acceleration. It includes a motor coupled with a sensor for position feedback, offering good torque, holding power, and fast updates.
Description of Fig 5.5.1 SERVO MOTOR MG995: A photograph of a servo motor, model MG995, commonly used for precise angular control in robotics.
- Model: MG90S
- Operating voltage: 4.8V to 6V
- Recommended voltage: 5V
- Stall torque: 1.8 kg·cm (at 4.8 V)
- Max stall torque: 2.2 kg·cm (at 6 V)
- Gear type: Metal
- Rotation: 0°-180°
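Positioning a hobby servo such as this one comes down to mapping the target angle onto a pulse width. The helper below assumes the common 1–2 ms pulse convention at a 50 Hz refresh rate; the exact endpoints vary between units and should be calibrated on the real servo rather than taken from this sketch.

```python
def servo_pulse_us(angle, min_us=1000, max_us=2000):
    """Map a 0-180 degree angle to a pulse width in microseconds.

    Assumes the common 1-2 ms convention at 50 Hz; endpoints
    differ per servo and need calibration on real hardware.
    """
    if not 0 <= angle <= 180:
        raise ValueError("angle out of range")
    return min_us + (max_us - min_us) * angle / 180
```

For example, the 90° mid-position corresponds to a 1500 µs pulse under these assumptions.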
5.6 Ultrasonic Sensor
An ultrasonic sensor is an electronic device that emits ultrasonic sound waves and converts the reflected sound into an electrical signal to determine distance. Ultrasonic waves have frequencies above the range of human hearing. The sensor consists of a transmitter and a receiver.
Description of Fig 5.6.1 Ultrasonic Sensor: A photograph of the HC-SR04 ultrasonic sensor module, showing its two transducers (transmitter and receiver) and circuit board.
- Sensing range: 40 cm to 300 cm
- Response time: 50 milliseconds to 200 milliseconds
- Operating voltage: 20 VDC to 30 VDC
- Preciseness: ±5%
- Resolution: 1mm
- Sensor output voltage: 0 VDC – 10 VDC
- Target dimensions for max distance: 5 cm × 5 cm
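The distance measurement itself is simple arithmetic: the sensor reports the round-trip echo time, and halving the product of that time with the speed of sound gives the one-way distance. A minimal conversion helper, assuming sound travels at roughly 343 m/s in air at 20 °C:

```python
SPEED_OF_SOUND_CM_S = 34300  # ~343 m/s in air at 20 degrees C

def echo_to_distance_cm(echo_time_s):
    """Convert a round-trip echo time (seconds) to one-way distance (cm).

    Divide by 2 because the pulse travels to the obstacle and back.
    """
    return echo_time_s * SPEED_OF_SOUND_CM_S / 2
```

A 2 ms round trip therefore corresponds to roughly 34.3 cm between sensor and obstacle.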
5.7 Speaker
Speakers are output devices used to generate sound, connecting to computers or sound systems. They convert electrical audio signals into sound waves.
Description of Fig 5.7.1 SPEAKER: Zebronics Pluto speaker: A photograph of a pair of Zebronics Pluto speakers.
- Output Power: 5W (2.5W x 2)
- Driver Size: 52mm (2.04") x 2
- Impedance: 3Ω
- Frequency response: 120 Hz – 15 kHz
- S/N ratio: ≥ 60 dB
- Separation: ≥ 50 dB
- Line input: 3.5mm jack
5.8 Mic
A microphone captures audio by converting sound waves into electrical signals. The first electronic microphone used a liquid mechanism with a diaphragm.
Description of Fig 5.8.1 Mic: A photograph of a microphone with a stand, designed for audio input.
- Features: Plug and play, High quality recording, USB port to start using, Fast and convenient.
5.9 Battery
This battery pack uses ICR 18650 2500mAh 20C Lithium-Ion Batteries and a BMS circuit, offering small size and weight. It includes charge protection for direct charging via a DC adapter.
Description of Fig 5.9.1 BATTERY: A photograph of a Lithium-Ion battery pack, labeled with a warning, indicating its specifications.
- Battery specification: 18650/11.1V/2600mAh
- Termination voltage: 8.25V
- Charging temperature: 10-45°C
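From the pack specification above, the available energy and a rough runtime can be estimated directly; the 5 W figure used in the example is illustrative (the speaker's rated output), not a measured load for this robot.

```python
def pack_energy_wh(voltage_v, capacity_mah):
    """Energy stored in the pack, in watt-hours."""
    return voltage_v * capacity_mah / 1000

def runtime_hours(voltage_v, capacity_mah, load_w):
    """Idealised runtime estimate, ignoring converter losses."""
    return pack_energy_wh(voltage_v, capacity_mah) / load_w
```

For the 11.1 V / 2600 mAh pack this gives about 28.9 Wh, or roughly 5.8 hours against an assumed steady 5 W load.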
VI. WORKING
6.1 Circuit Diagram
Description of FIG 6.1a ROBOT MOTION CONTROLLER: A circuit diagram illustrating the connections between the Arduino Uno, L298N Motor Driver, motors, ultrasonic sensor, and power source for robot motion control.
Description of FIG 6.1b VOICE ASSISTANT: A block diagram showing the components of the voice assistant system, including Raspberry Pi, speaker, microphone, cloud services (Google Assistant, Dialogflow), and TTS engine, connected to the robot's systems.
6.2 Talking System
The talking system is the robot's primary subsystem, built around a Raspberry Pi 4 powered from a 12V DC supply. The Raspberry Pi 4, speaker, and microphone are integrated into the robot's body. A user's spoken question is captured by the microphone and converted to text, the query is answered by searching the internet, and the reply is converted back to audio output.
6.2.1 Key Points
- Directly connects with the internet.
- Raspberry Pi 4 with 4GB RAM ensures efficiency.
- Fast internet connectivity provides quick replies.
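The talking-system flow described in 6.2 can be expressed as one function per pass: speech in, text out, lookup, spoken reply. The sketch below injects the recognizer, the search step, and the speech output as callables so the pipeline logic can be shown (and tested) without a microphone or network; the function names are illustrative, not from the project's code.

```python
def answer_query(audio, recognize, lookup, speak):
    """One pass of the talking-system pipeline:
    speech -> text -> internet lookup -> spoken reply.

    recognize: speech-to-text (e.g. Google STT on the Pi)
    lookup:    text query -> answer text (e.g. Wikipedia search)
    speak:     text-to-speech output through the robot's speaker
    """
    text = recognize(audio)
    if not text:
        speak("Pardon me, please say that again")
        return None
    reply = lookup(text)
    speak(reply)
    return reply
```

On the robot, `recognize` would wrap the SpeechRecognition library, `lookup` the Wikipedia/web search, and `speak` the TTS engine; here any stand-ins with the same signatures demonstrate the flow.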
6.3 Motion Control
The robot moves autonomously, using an ultrasonic sensor to detect and avoid obstacles and making navigation decisions on its own. This feature makes it a smart, autonomous robot. The project also demonstrates reading the ultrasonic sensor and monitoring its output on the serial monitor.
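The obstacle-avoidance decision itself is a small piece of logic layered on the distance reading. The threshold below is an illustrative value to be tuned on the actual robot, and on the real hardware this function's result would drive the L298N rather than return a string.

```python
SAFE_DISTANCE_CM = 30  # illustrative threshold; tune on the robot

def next_move(distance_cm):
    """Decide the robot's next motion from an ultrasonic reading."""
    if distance_cm is None:      # no echo received: play it safe
        return "stop"
    if distance_cm < SAFE_DISTANCE_CM:
        return "turn"            # obstacle ahead: steer away
    return "forward"             # path is clear
```

Called in the main loop after each ping, this keeps the robot driving forward until something enters the safety margin.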
VII. SAMPLE CODE
The following Python code demonstrates a basic AI personal assistant named 'G-One'. It includes functions for wishing the user, taking voice commands, searching Wikipedia, opening web pages (YouTube, Google, Gmail), and retrieving weather information.
import speech_recognition as sr
import pyttsx3
import datetime
import wikipedia
import webbrowser
import os
import time
import subprocess
from ecapture import ecapture as ec
import wolframalpha
import json
import requests

print('Loading your AI personal assistant - G One')

engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)

def speak(text):
    engine.say(text)
    engine.runAndWait()

def wishMe():
    hour = datetime.datetime.now().hour
    if hour >= 0 and hour < 12:
        speak("Hello, Good Morning")
        print("Hello, Good Morning")
    elif hour >= 12 and hour < 18:
        speak("Hello, Good Afternoon")
        print("Hello, Good Afternoon")
    else:
        speak("Hello, Good Evening")
        print("Hello, Good Evening")

def takeCommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        audio = r.listen(source)
    try:
        statement = r.recognize_google(audio, language='en-in')
        print(f"user said: {statement}\n")
    except Exception:
        speak("Pardon me, please say that again")
        return "None"
    return statement

speak("Loading your AI personal assistant G-One")
wishMe()

if __name__ == '__main__':
    while True:
        speak("Tell me how can I help you now?")
        statement = takeCommand().lower()
        if statement == "none":
            continue

        if "good bye" in statement or "ok bye" in statement or "stop" in statement:
            speak('Your personal assistant G-One is shutting down, good bye')
            print('Your personal assistant G-One is shutting down, good bye')
            break

        if 'wikipedia' in statement:
            speak('Searching Wikipedia...')
            statement = statement.replace("wikipedia", "")
            results = wikipedia.summary(statement, sentences=3)
            speak("According to Wikipedia")
            print(results)
            speak(results)

        elif 'open youtube' in statement:
            webbrowser.open_new_tab("https://www.youtube.com")
            speak("YouTube is open now")
            time.sleep(5)

        elif 'open google' in statement:
            webbrowser.open_new_tab("https://www.google.com")
            speak("Google Chrome is open now")
            time.sleep(5)

        elif 'open gmail' in statement:
            webbrowser.open_new_tab("https://mail.google.com")
            speak("Google Mail is open now")

        elif "weather" in statement:
            api_key = "8ef61edcf1c576d65d836254e11ea420"
            speak("What's the city name?")
            city_name = takeCommand()
            base_url = "https://api.openweathermap.org/data/2.5/weather?"
            complete_url = base_url + "appid=" + api_key + "&q=" + city_name
            response = requests.get(complete_url)
            x = response.json()
            if x["cod"] != "404":
                y = x["main"]
                current_temperature = y["temp"]
                current_humidity = y["humidity"]
                z = x["weather"]
                weather_description = z[0]["description"]
                speak(f"Temperature in kelvin is {current_temperature}, humidity in percentage is {current_humidity}, description {weather_description}")
                print(f"Temperature in kelvin = {current_temperature}\nHumidity (in percentage) = {current_humidity}\nDescription = {weather_description}")
            else:
                speak("City not found")

        elif 'time' in statement:
            strTime = datetime.datetime.now().strftime("%H:%M:%S")
            speak(f"The time is {strTime}")

        elif 'who are you' in statement or 'what can you do' in statement:
            speak('I am G-One version 1 point O, your personal assistant. I am programmed for minor tasks like opening YouTube, Google Chrome, Gmail and Stack Overflow, predicting time, taking a photo, searching Wikipedia, predicting weather in different cities, getting top headline news from Times of India, and you can ask me computational or geographical questions too!')

        elif "who made you" in statement or "who created you" in statement or "who discovered you" in statement:
            speak("I was built by AKSHAY PRAKASH")
            print("I was built by AKSHAY PRAKASH")

        elif "open stackoverflow" in statement:
            webbrowser.open_new_tab("https://stackoverflow.com/login")
            speak("Here is Stack Overflow")

        elif 'news' in statement:
            webbrowser.open_new_tab("https://timesofindia.indiatimes.com/home/headlines")
            speak('Here are some headlines from the Times of India, happy reading')
            time.sleep(6)

        elif "camera" in statement or "take a photo" in statement:
            ec.capture(0, "robo camera", "img.jpg")

        elif 'search' in statement:
            statement = statement.replace("search", "")
            webbrowser.open_new_tab(statement)
            time.sleep(5)

        elif 'ask' in statement:
            speak('I can answer computational and geographical questions. What question do you want to ask now?')
            question = takeCommand()
            app_id = "R2K75H-7ELALHR35X"
            client = wolframalpha.Client(app_id)
            res = client.query(question)
            answer = next(res.results).text
            speak(answer)
            print(answer)

        elif "log off" in statement or "sign out" in statement:
            speak("Ok, your PC will log off in 10 seconds; make sure you exit from all applications")
            subprocess.call(["shutdown", "/l"])
            time.sleep(3)
VIII. FUTURE IMPROVEMENTS
- As a future extension, we plan to control the robot with a Bluetooth-controlled joystick.
- Replace the cardboard body with a 3D-printed enclosure.
- Improve the robot's movement.
- Add a display to the robot.
- Use a higher-quality speaker and microphone.
- Use PCBs for compactness, lower signal noise, and better integration, with private servers for faster real-time processing.
IX. CONCLUSION
Everybody loves talking robots; it is like having a little pal to interact with. If the robot is humanoid, it is more fun than ever: among all robots, a humanoid suddenly becomes a 'he/she' instead of an 'it'. Having a humanoid talking robot feels wonderful, but talking robots are seemingly complicated to make.
To make a robot talk, we considered two methods: speech synthesis and pre-recorded audio. Speech synthesis does not perform acceptably on an Arduino, so we chose the pre-recorded audio method.
Finally, our wish has been fulfilled in the form of our working robot. We would like to thank everyone who believed in and supported our project.
X. APPENDIX
10.1 Project Modeling Using SolidWorks
Description of Fig 10.1.1 Starting Stage: A 3D rendering of the initial stage of the robot's design, showing a basic chassis with wheels and mounting points.
Description of Fig 10.1.2 Front View: A 3D rendering of the robot's front view, detailing its structure and components.
Description of Fig 10.1.3 Back View: A 3D rendering of the robot's back view, showing its construction and internal layout.
10.2 Project Images
Description of Fig 10.2.1 Front View: A photograph of the constructed robot's front view, showing its physical form.
Description of Fig 10.2.2 Side View: A photograph of the constructed robot's side view.
Description of Fig 10.2.3 Top View: A photograph of the constructed robot's top view.
REFERENCES
- Talking with a robot in ENGLISH (March 2005) L Stephen Coles
- Building a talking baby robot, a contribution to the study of speech acquisition and evolution (November 2007) J. Serkhane, J.L. Schwartz, P. Bessière ICP, Grenoble Laplace-SHARP, Gravir, Grenoble
- A Talking Robot and the Expressive Speech Communication with Human (December 2014) Hideyuki SAWADA
- Smart talking robot Xiaotu: participatory library service based on artificial intelligence (April 2015) Fei Yao, Chengyu Zhang and Wu Chen
- Speech Planning of an Anthropomorphic Talking Robot for Consonant Sounds Production (May 2002) Kazufumi Nishikawa, Akihiro Imai, Takayuki Ogawara, Hideaki Takanobu, Takemi Mochida, Atsuo Takanishi
- Multilingual WikiTalk: Wikipedia-based talking robots that switch languages (August 2013) Graham Wilcock and Kristiina Jokinen
- How Computers (Should) Talk to Humans (April 2006) Robert Porzel
- Artificial Intelligence-based Voice Assistant (October 2020) S Subhash, Prajwal N Srivatsa, S Siddesh, A Ullas, B Santhosh
- Voice Assistant – A Review (March 2021) Shabdali Suresh Shetty
- Alexa-Based Voice Assistant for Smart Home Applications (July 2021) Asuncion Santamaria, Guillermo del Campo, Edgar Saavedra and Clara Jimenez