Technical Updates Archives

Eleven Labs (EL) has just introduced a faster TTS model, Flash v2.5, geared specifically towards conversational agents.
See their announcement here –
https://www.youtube.com/watch?v=0YmHnkTVkFA

According to EL, the new model is faster, at the cost of lower quality.

We reviewed the new model and can confirm that, in the SitePal environment, overall latency is reduced by 15% to 20% on average for English. For non-English languages the improvement is much greater, with average latency reduced by 50% to 60%.

Because the new model is multilingual, it can be used for all supported languages. This is why latency improves more for non-English input: until now, non-English input was processed using the slower multilingual model.

We were not able to audibly detect loss of quality. We concluded that the difference in quality is not meaningful in an online conversation scenario.

We have therefore modified the API to use the new Flash v2.5 model as its default model for EL, which means it will be used if you do not specify a different model when calling the API.

This update is now implemented. There is nothing you need to do if you use the default engine.

To compare the models for yourself, see https://elevenlabs.io/app/speech-synthesis/text-to-speech (select the model at top right).

If you prefer not to use the new default model, specify the model name in the xdata1 parameter when calling sayText or sayAI. For this and other options for fine-tuning EL audio generation, see the SitePal API reference: look up the parameters of the sayText and sayAI functions and find ‘xdata1’.

New model:
model_id=eleven_flash_v2_5

Previously used model – for English:
model_id=eleven_turbo_v2

Previously used model – for non-English:
model_id=eleven_multilingual_v2
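
For readers who want to keep the previous behavior, the following is a minimal sketch of how the xdata1 value might be assembled. It assumes the value takes the `model_id=<name>` form shown in the lines above; the helper names `previousModelFor` and `buildXdata1` are hypothetical and are not part of the SitePal API. Please confirm the exact xdata1 format in the SitePal API reference before relying on it.

```typescript
// Model IDs quoted in this announcement.
const NEW_DEFAULT_MODEL = "eleven_flash_v2_5";
const PREVIOUS_ENGLISH_MODEL = "eleven_turbo_v2";
const PREVIOUS_NON_ENGLISH_MODEL = "eleven_multilingual_v2";

// Hypothetical helper (not a SitePal function): returns the model that
// was used before this update, depending on the input language.
function previousModelFor(isEnglish: boolean): string {
  return isEnglish ? PREVIOUS_ENGLISH_MODEL : PREVIOUS_NON_ENGLISH_MODEL;
}

// Hypothetical helper (not a SitePal function): formats a model name as
// "model_id=<name>". Assumption: xdata1 accepts the value in this form.
function buildXdata1(modelId: string): string {
  return `model_id=${modelId}`;
}

// Example: the xdata1 value that would request the previous English model.
const xdata1ForEnglish = buildXdata1(previousModelFor(true));
```

The resulting string would then be supplied as xdata1 when calling sayText or sayAI. Where xdata1 sits among the arguments of those functions is documented in the SitePal API reference and is deliberately not reproduced here.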

Eleven Labs (EL) TTS can be integrated with SitePal by adding your EL API key to your SitePal ‘Connect’ page. It is one of several third-party TTS providers available for use with SitePal Avatars, complementing the built-in TTS voices. Using third-party TTS requires the Platinum Plan.

Hello SitePal Fans,

In this update we will cover –

TTS latency optimization
API access to raw audio data
JavaScript frameworks support
Eleven Labs audio: lipsync fix

TTS Latency Optimization – we’ve recently come to realize that TTS response times have deteriorated due to increasing levels of usage. Both additional load and accumulated data have had a detrimental effect over time.
We’ve just implemented an update to address the problem. With this update response times are now faster than they have ever been. In most use cases you should find that latency has been reduced by over 50%.

JavaScript Framework Examples – We’ve added new framework examples for ReactJS, NextJS, Angular and Vue. This should prove helpful as customers have repeatedly run into issues that were not covered by our initial technical example. We’ve identified a set of five examples that together cover most envisioned scenarios.

The following examples are now available for each framework:

Technical Example: Dynamic TTS, page navigation, receiving callbacks
Responsive Example: Responsive embed in framework
Conversation Example: Multiple avatars, the use of Portals
AI Text Example: Using the sayAI API with text input
AI Audio Example: Using the sayAI API with audio input

All the examples come with full source code, and are available on the SitePal support page.

Access Audio Raw Data – Note: this feature is available to Platinum and Integrator Plan customers.

We’ve added a new API function ‘getAudioObject’ that enables access to the raw audio data as it is being played. To understand why this feature might be useful, let us share the scenario that brought this about.

In an AI Agent implementation our customer preferred to have an active mic at all times to allow users to interrupt the avatar. They used echo cancellation to eliminate avatar speech from the input & required access to the realtime audio data to make this work.

We’re sure customers will find other equally cool and unexpected ways to take advantage of this feature.

Eleven Labs Lip Sync Fix – due to special characteristics of Eleven Labs audio, it was not being processed correctly, which made lipsync sub-par for Eleven Labs audio. This problem was identified and fixed on Aug 9th.

Also, if you’ve missed it please see our recent announcement on this blog regarding Chatbase Pre-Integration. We’re very excited about it.

Of our future plans & projects, the following are worth mentioning at this time –

Flutter Support – we’re looking to add Flutter support and Flutter code examples to our support materials. This should be available in the next few weeks.

Photoface 3D API – the intention is to let customers provide their own end users with a Photoface-like tool for creating & using their own avatars. This feature will be available to Integrator Plan customers only. This project is in the planning phase. Please contact us if you have an interest in this capability.

That’s it for the moment. We hope that you have found this information useful. Please contact us with any comments or questions. We’d love to receive your feedback.

Warm Regards,

The SitePal Team

www.sitepal.com

“What does your site say?”