A smart robot should be able to detect when a person is in front of it and communicate with that person by voice. This involves speech-to-text, text-to-speech, and facial recognition functions.
In this project, we will program Yanshee using Python to:
- Recognise a person
- Complete tasks based on the voice instruction using Yanshee API
- Read temperature sensors using Yanshee API
Required Knowledge
Please ensure that you have gone through the resources below before starting this project:
Introduction to YanAPI
YanAPI is an API for the Python programming language. It lets you use Python to obtain robot status information and to design motions for and control the Yanshee robot. The API wraps the complicated AI functions so that the programmer can call each of them with a single line of code.
Task 1: Recognising Faces and Greeting
Step 1: Switch on Yanshee and connect to Yanshee using VNC viewer.
Step 2: Create your python file under and name it as <Your File Name>.py
Step 3: Initialise YanAPI by using the code below:
import YanAPI
import time
ip_addr = "127.0.0.1" # please change to your yanshee robot IP
YanAPI.yan_api_init(ip_addr)
The import statements tell Python that this program will use functions from the imported modules (YanAPI and time). The ip_addr variable holds the IP address of the robot that you are going to control. In most cases, we set it to 127.0.0.1, which refers to the current robot itself. If you wish to control more than one robot, you will need the IP addresses of the other robots. After that, call the YanAPI.yan_api_init function to initialise the connection to the robot.
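A mistyped address usually only shows up later as a failed connection, so it can help to check the string before passing it to YanAPI.yan_api_init. The sketch below uses only the Python standard library; is_valid_robot_ip is a hypothetical helper name chosen for illustration and is not part of YanAPI.

```python
import ipaddress

def is_valid_robot_ip(ip_addr):
    """Return True if ip_addr is a well-formed IPv4 address string."""
    if not isinstance(ip_addr, str):
        return False
    try:
        # Raises a ValueError subclass when the text is not a valid IPv4 address
        ipaddress.IPv4Address(ip_addr)
        return True
    except ValueError:
        return False

print(is_valid_robot_ip("127.0.0.1"))
print(is_valid_robot_ip("192.168.1"))
```

The first call accepts the loopback address used in this project, while the second rejects an address with a missing part. This only checks the format; it does not confirm that a robot is reachable at that address.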
Step 4: Recognise the face by adding the code below
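fr_res = YanAPI.sync_do_face_recognition("recognition")
# Assumption: the recognised name is returned under data -> recognition -> name
name_val = fr_res["data"]["recognition"]["name"]
if name_val != "" and name_val != "none":
    print("\nDetected: ")
    print(name_val)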
YanAPI makes facial recognition very easy: you only need to call one function. If the face has already been added to the database, sync_do_face_recognition returns the result of the recognition.
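The check on the recognised name can be wrapped in a small helper so that a missing field does not crash the program. This is a minimal sketch: it assumes the name sits under data, then recognition, then name in the returned dictionary (an assumption about the response layout, so print the result on your robot to confirm), and extract_name is a hypothetical helper, not a YanAPI function.

```python
def extract_name(fr_res):
    """Return the recognised name, or None if no known face was found."""
    # Assumed layout: {"data": {"recognition": {"name": "..."}}}
    data = fr_res.get("data") or {}
    recognition = data.get("recognition") or {}
    name = (recognition.get("name") or "").strip()
    # An empty string and "none" both mean that nobody was recognised
    if name == "" or name.lower() == "none":
        return None
    return name

# Hypothetical sample responses, for illustration only
known = {"code": 0, "data": {"recognition": {"name": "Alice"}}}
unknown = {"code": 0, "data": {"recognition": {"name": "none"}}}
print(extract_name(known))
print(extract_name(unknown))
```

Returning None for an unknown face lets the calling code decide with a single test whether to greet the person.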
Step 5: Speak the greeting by calling the TTS API
tts_res = YanAPI.start_voice_tts("Hi "+name_val+", what can I do for you?",False)
Yeah! It's done! Without writing hundreds of lines of code, you can use YanAPI to do facial recognition and text-to-speech with just these few lines.
Task 2: Two-way communication by Listening and Speaking
In this task, we program Yanshee to listen for a voice command to add a new face to the database. Yanshee then asks for the person's name, takes a photo of the person in front of it, and adds the face to the database using the information it has collected.
Step 1: Create a function to capture image and add to the facial recognition database
def input_face_sample(name):
    tts_res = YanAPI.start_voice_tts("Taking photo, cheese", False)
    # take a photo
    res = YanAPI.take_vision_photo()
    print(res)
    if res["code"] == 0:
        # retrieve the photo image
        path = "/tmp/"
        YanAPI.get_vision_photo(res["data"]["name"], path)
        photo = path + res["data"]["name"]
        photo_name = res["data"]["name"]
        # upload to the facial recognition database
        YanAPI.upload_vision_photo_sample(photo)
        # link the image with the name
        YanAPI.set_vision_tag([photo_name], name)
    else:
        print(res["msg"])
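The function above checks the code field only once, after taking the photo. The same check can be reused for every call with a small helper. The sketch below assumes, as the code above implies, that a response is a dictionary whose code field is 0 on success and whose msg field describes any failure; ensure_ok is a hypothetical name, not part of YanAPI.

```python
def ensure_ok(res, step):
    """Return res unchanged if its code is 0, otherwise raise an error."""
    if res.get("code") != 0:
        message = str(res.get("msg", "unknown error"))
        raise RuntimeError(step + " failed: " + message)
    return res

# Hypothetical sample responses, for illustration only
good = {"code": 0, "data": {"name": "img_001.jpg"}}
bad = {"code": 1, "msg": "camera busy"}

print(ensure_ok(good, "take photo")["data"]["name"])
try:
    ensure_ok(bad, "take photo")
except RuntimeError as err:
    print(err)
```

Raising an error stops the remaining steps from running on a failed result, which is easier to debug than a photo that was silently never uploaded.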
Step 2: Listen to the voice and translate it into Text.
listen_res = YanAPI.sync_do_voice_asr()
Step 3: Check whether the user said the "new face" command
if len(listen_res["data"]) > 0:  # the user said something
    question = listen_res["data"]['intent']['answer']['question']['question']
    if question.lower().strip() == "new face":  # check whether the user said "new face"
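The nested lookup in Step 3 may raise a KeyError if any level of the result is missing, for example when nothing intelligible was heard. A defensive version can be sketched as below. It follows the same path used in Step 3 (data, intent, answer, question, question); heard_text is a hypothetical helper name and the sample dictionary is made up for illustration.

```python
def heard_text(listen_res):
    """Return the recognised sentence in lower case, or "" if nothing was heard."""
    node = listen_res.get("data") or {}
    # Walk data -> intent -> answer -> question -> question, as in Step 3
    for key in ("intent", "answer", "question", "question"):
        if not isinstance(node, dict) or key not in node:
            return ""
        node = node[key]
    return node.lower().strip() if isinstance(node, str) else ""

# Hypothetical sample result, for illustration only
reply = {"data": {"intent": {"answer": {"question": {"question": " New Face "}}}}}
print(heard_text(reply) == "new face")
print(heard_text({"data": {}}) == "")
```

With this helper, the command test becomes a single comparison of heard_text(listen_res) with "new face".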
Step 4: Respond to the user and ask for the name of the new face.
        # this code runs inside the "new face" check from Step 3
        tts_res = YanAPI.start_voice_tts("Sure, what is the name of the new face?", False)
        time.sleep(2)  # wait 2 seconds so that the speech is not interrupted by the code below
        listen_res = YanAPI.sync_do_voice_asr()  # listen again to get the name
        name = listen_res["data"]['intent']['answer']['question']['question']
        tts_res = YanAPI.start_voice_tts("Thank you. " + name + ", please face my camera.", False)
        time.sleep(5)
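The fixed pauses in Step 4 have to be long enough for the sentence to finish, so a longer sentence needs a longer pause. One option is to estimate the pause from the number of words. This is only a rough sketch: speech_pause is a hypothetical helper, and the speaking rate of 2.5 words per second and the 0.5 second margin are guesses that should be tuned on your own robot.

```python
def speech_pause(text, words_per_second=2.5, margin=0.5):
    """Estimate how many seconds to wait while the robot speaks text."""
    words = len(text.split())
    # Assumed speaking rate; adjust words_per_second after testing on the robot
    return round(words / words_per_second + margin, 1)

print(speech_pause("Sure, what is the name of the new face?"))
```

The returned value can be passed to time.sleep in place of the fixed number of seconds.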
Step 5: Once the name is captured, count the faces in front of the camera to make sure that only one person is in view, then call the function created in Step 1.
        # still inside the "new face" check from Step 3
        res = YanAPI.sync_do_face_recognition('quantity')
        face_quantity = res['data']['quantity']
        if face_quantity == 1:
            input_face_sample(name)  # call the function created in Step 1
        else:
            tts_res = YanAPI.start_voice_tts("Sorry, only 1 face is allowed. Adding new face cancelled", False)
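The decision in Step 5 can also be written as a function that receives the robot actions as arguments, which makes it possible to try the logic on a computer without a robot. This is a sketch, not part of YanAPI: add_face_if_single and its parameter names are hypothetical, and the stand-in functions below only record what would have happened.

```python
def add_face_if_single(name, count_faces, add_sample, speak):
    """Add a face sample only when exactly one face is visible."""
    quantity = count_faces()
    if quantity == 1:
        add_sample(name)
        return True
    speak("Sorry, only 1 face is allowed. Adding new face cancelled")
    return False

# Stand-in functions so the logic can be tried without a robot
added, spoken = [], []
print(add_face_if_single("Alice", lambda: 1, added.append, spoken.append))
print(add_face_if_single("Bob", lambda: 2, added.append, spoken.append))
print(added)
```

On the robot, the three arguments would be a function that returns YanAPI.sync_do_face_recognition('quantity')['data']['quantity'], the input_face_sample function from Step 1, and a function that calls YanAPI.start_voice_tts(text, False).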