I thought I'd share an update. I have finished the first iteration of the video/audio AI part of this solution. The 'customer' can now interact with the live audio/video AI so it can recognise a book via the ISBN in its barcode:
Video and explanation in my LinkedIn post: https://www.linkedin.com/feed/update/urn:li:activity:7383263424652259328/
In more detail, it works by:
- Using the browser's native ability to scan barcodes against the live webcam stream. This stream is higher quality and has a higher frame rate than the frames my server receives to send to the AI
- Barcodes detected in the browser are immediately sent to my server
- The ISBN in the barcode is used to pre-emptively retrieve the book's details server-side, reducing latency in the conversation
- The Gemini Live API is given a 'read_barcode' tool, which makes the conversational agent sound like it is reading the barcode itself. In reality the tool just returns the result of the pre-emptive search
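The pre-emptive lookup described above could be sketched server-side roughly like this. It is a minimal sketch, not the actual implementation: `BookDetails`, `BookLookup` and `BarcodePrefetcher` are hypothetical names, the book lookup is injected rather than tied to any real service, and the wiring to the Gemini Live API session is left out. The only fixed fact used is the ISBN-13 check digit rule (weights alternating 1 and 3, sum divisible by 10).

```typescript
// Hypothetical shape of the book details returned by a lookup service.
interface BookDetails {
  isbn: string;
  title: string;
}

// Any async function that resolves an ISBN to book details (null if unknown).
type BookLookup = (isbn: string) => Promise<BookDetails | null>;

// True when `code` is a 13-digit ISBN (978/979 prefix) with a valid check digit.
function isValidIsbn13(code: string): boolean {
  if (!/^97[89]\d{10}$/.test(code)) return false;
  let sum = 0;
  for (let i = 0; i < 13; i++) {
    const digit = code.charCodeAt(i) - 48; // "0" is char code 48
    sum += digit * (i % 2 === 0 ? 1 : 3); // weights alternate 1, 3, 1, 3, ...
  }
  return sum % 10 === 0;
}

class BarcodePrefetcher {
  private lookup: BookLookup;
  private latest: Promise<BookDetails | null> | null = null;
  private cache = new Map<string, Promise<BookDetails | null>>();

  constructor(lookup: BookLookup) {
    this.lookup = lookup;
  }

  // Called whenever the browser reports a detected barcode.
  // Starts the lookup straight away, before the AI asks for it.
  onBarcodeDetected(rawValue: string): void {
    if (!isValidIsbn13(rawValue)) return; // ignore non-book barcodes
    let pending = this.cache.get(rawValue);
    if (!pending) {
      // A failed lookup is treated as "not found" so the tool call never rejects.
      pending = this.lookup(rawValue).catch(() => null);
      this.cache.set(rawValue, pending);
    }
    this.latest = pending;
  }

  // Handler for the 'read_barcode' tool: returns the most recent
  // pre-fetched result instead of reading anything from the video.
  async readBarcode(): Promise<{ found: boolean; book?: BookDetails }> {
    const book = this.latest ? await this.latest : null;
    return book ? { found: true, book } : { found: false };
  }
}
```

Because the same ISBN is detected many times per second, the cache keeps one in-flight lookup per ISBN, and `readBarcode` only has to await a promise that has usually already resolved.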
What's next:
Next I want to focus on having a Digital Bot in Genesys Cloud trigger the audio/video experience for the customer, and then receive the scanned books so it can use them in the chatbot's conversation.
------------------------------
Lucas Woodward
Winner of Orchestrator of the Year, Developer (2025)
LinkedIn - https://www.linkedin.com/in/lucas-woodward-the-dev
Newsletter - https://makingchatbots.com
------------------------------
Original Message:
Sent: 10-07-2025 18:08
From: Lucas Woodward
Subject: Building an experimental AI-based CX journey
Last night I sketched out an experimental CX journey I'd like to create, in a LinkedIn post.
It is:
- Start a conversation with a text-based chatbot about selling my books on my mobile phone (aka cell phone)
- Have it transfer to an AI-powered voice + video bot to discuss which books on my bookshelf to sell, and the potential price of each
- Once decided, have this information fed back into my existing chatbot session, to allow me to continue there
The sketch I shared:
My main reason for doing this is to experiment with whether this is technically feasible using Web Messenger and Google's Gemini Live API - two great technologies I'd love to see working together.
It'd be great to get thoughts from the community on how they'd approach this. I am thinking:
- Start with the official Web Messenger experience, rather than the Web Messenger Guest SDK
  - This is for simplicity's sake, although it may require unacceptable workarounds
- Use the Web Messenger's Database Plugin to allow the Inbound Message flow/chatbot to 'notify' the website when to start the Gemini Live API. I'll just use a Data Action for this actually.
- The Live API will have 2 tools
  - Search for ISBN by book title - it can use this when being shown a book
  - Agree sale of book - this will cause the book's ISBN to be sent to the Web Messenger chatbot via the sendMessage command
- When the customer/Live API has finished, the conversation in Web Messenger already has the books (see sub-point above), and I can send a message to the conversation to prompt it to continue the Web Messenger based conversation.
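As a rough illustration of the two tools in the plan above, they could be declared and dispatched something like this. Treat it as a sketch under assumptions: the tool names (`search_isbn_by_title`, `agree_sale`), the `ToolCall` and `ToolDeps` types, and the message format are all made up for illustration, and the declaration objects follow the general name/description/parameters shape of Gemini function declarations rather than being copied from any SDK.

```typescript
// Function declarations in the general shape used for Gemini tools.
// Names, descriptions and parameters are illustrative assumptions.
const toolDeclarations = [
  {
    name: "search_isbn_by_title",
    description: "Find the ISBN of a book from the title shown on its cover.",
    parameters: {
      type: "OBJECT",
      properties: { title: { type: "STRING", description: "Book title" } },
      required: ["title"],
    },
  },
  {
    name: "agree_sale",
    description: "Record that the customer has agreed to sell this book.",
    parameters: {
      type: "OBJECT",
      properties: { isbn: { type: "STRING", description: "ISBN of the book" } },
      required: ["isbn"],
    },
  },
];

// Hypothetical minimal view of a tool call coming back from the model.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Dependencies are injected so the dispatcher is not tied to a real service.
interface ToolDeps {
  searchIsbnByTitle: (title: string) => Promise<string | null>;
  sendToMessenger: (message: string) => void;
}

// Routes a tool call to its handler and returns the response for the model.
async function handleToolCall(
  call: ToolCall,
  deps: ToolDeps
): Promise<Record<string, unknown>> {
  switch (call.name) {
    case "search_isbn_by_title": {
      const isbn = await deps.searchIsbnByTitle(String(call.args["title"] ?? ""));
      return isbn ? { isbn } : { error: "No matching book found" };
    }
    case "agree_sale": {
      const isbn = String(call.args["isbn"] ?? "");
      // Feed the agreed book back into the Web Messenger conversation.
      deps.sendToMessenger(`ISBN: ${isbn}`);
      return { status: "sale recorded", isbn };
    }
    default:
      return { error: `Unknown tool: ${call.name}` };
  }
}
```

On the website, `sendToMessenger` would presumably wrap the Messenger `MessagingService.sendMessage` command, passing the text as the `message` option. That wiring is an assumption here and is worth checking against the Messenger SDK documentation.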
I'm sure I will make many amendments (and mistakes) along the way, but does anyone see any glaring issues with my thinking?
#MobileMessenger
#PlatformSDK