Integrating Audio/Video calls into your application — Twilio, Agora, Zoom, LiveKit
February 17, 2023
Experience-based comparison of leading audio and video call conferencing providers based on integration complexity, time, and pricing.
Here I would like to share our experience integrating audio/video calls into LiveBoard, an all-in-one online tutoring platform. LiveBoard is a good example of a real-time communication application with audio/video integration, since it has mobile and web applications and requires 1-on-1 and group video calls with audio/video recording.
I will describe our journey and share every problem we faced with these integrations. I hope this article helps other founders avoid the mistakes we made.
Understanding main requirements and concepts
Before comparing different providers, let’s understand the main requirements and learn some key concepts.
- API/SDK availability and integration complexity: first of all, we need to know whether there is an SDK for web, iOS, and Android, and how easy it is to integrate.
- UI flexibility is another key requirement, since some providers give you UI components that are not designed to be customized and cannot be embedded smoothly into your application UI.
- Recording: audio/video calls should be recorded on the server and be accessible via the API. Some providers claim to support recording, but it turns out you need to record on the client side, and if the browser is closed or crashes, the recording is lost.
- Pricing is the most critical requirement in this industry since the price calculation could be very surprising for newbies.
It is also important to mention that our goal was a deep integration in which users never leave LiveBoard. Many applications choose the easy path by generating Zoom links and opening Zoom in another tab, or by placing the Zoom app inside an iframe, in both cases losing control over the user experience.
Learning the pricing calculation puzzle
This part can be shocking for newcomers to this field. You can see that Google Meet has unlimited free group calls, and that Zoom provides up to 40 minutes of free video conferencing or fully unlimited plans at $14/month, but API integration pricing works entirely differently. The final cost has several components. Let’s explore the common ones:
- you pay for each minute for every participant in the call
- most providers also charge extra for conference calls, again per participant
- there is another per-minute cost for recording
- there is also a cost for storing recordings on the server
This means that, for example, the cost of a 10-person group audio call has the following structure:
(number of minutes) x (number of people) x (1 min call cost + 1 min conference cost) + (number of minutes) x (1 min recording cost)
And this is only the audio call calculation. Video call pricing is more complicated, since it also depends on video resolution, quality, and the number of visible video frames.
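The audio formula above can be sketched as a small Python function. This is only an illustration: the function name, parameter names, and the rates in the usage lines are hypothetical, and storage cost is left out because the formula does not include it.

```python
def audio_call_cost(minutes, participants, call_rate,
                    conference_rate=0.0, recording_rate=0.0):
    """Estimate the cost (in dollars) of one group audio call.

    All rates are dollars per minute. call_rate and conference_rate are
    charged per participant; recording_rate is charged once per call.
    """
    per_participant_part = minutes * participants * (call_rate + conference_rate)
    recording_part = minutes * recording_rate
    return per_participant_part + recording_part


# Hypothetical rates, chosen only to show how the terms add up.
cost = audio_call_cost(minutes=30, participants=4,
                       call_rate=0.01, conference_rate=0.005,
                       recording_rate=0.002)
print(f"${cost:.2f}")  # prints $1.86
```

Note that the per-participant part grows with both call length and group size, which is why group calls get expensive quickly.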
Now that we have defined the main requirements and know how the pricing calculation generally works, let me walk you through LiveBoard’s journey with real-life examples.
Twilio — was our first choice
When we were just starting, we researched the market and found Twilio to be one of the most mature and well-documented providers. It also has the necessary mobile SDKs and supports server-side recording. Twilio has a wide range of products/APIs, and for our needs we selected the Programmable Voice API.
It took around two months to integrate Twilio on all platforms (iOS, Android, web). We implemented multi-user audio conferences, without video calls. The API documentation was pretty good, we had no problems with the integration, and call quality was acceptable, although there were some delays when joining a conference.
This was before Covid. We had our first early adopters using the audio call feature, and everything seemed to be OK. But when Covid hit and schools in Italy moved to remote teaching in a single day, we got a huge usage spike, and the problems with our audio integration started. We immediately realized that the pricing was not acceptable.
Based on the pricing formula above, let’s do a quick calculation for the following use case: a school teacher in Italy uses LiveBoard for group teaching with 10 students, 3 times a day, in 60-minute sessions.
We need to look at Twilio’s Programmable Voice pricing:
The audio call cost is $0.0040 per user per minute.
The extra conference API charge is $0.0018 per user per minute.
The audio conference cost for a 10-person group is therefore $0.058 per minute.
This makes the cost of one 60-minute session $3.48.
This means that for a single school teacher with three lessons a day, or 60 lessons per month, we were paying more than $200 per month for voice calls alone, before taxes, recording, and other server costs.
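The arithmetic behind these numbers can be reproduced with a few lines of Python. The rates and usage figures are the ones quoted above; the variable names are just for illustration.

```python
CALL_RATE = 0.0040        # $ per user per minute (quoted above)
CONFERENCE_RATE = 0.0018  # $ per user per minute, extra conference charge
PARTICIPANTS = 10         # group size used in the article's example
SESSION_MINUTES = 60
LESSONS_PER_MONTH = 60    # three lessons a day, as stated above

per_minute = PARTICIPANTS * (CALL_RATE + CONFERENCE_RATE)
per_session = per_minute * SESSION_MINUTES
per_month = per_session * LESSONS_PER_MONTH

print(f"per minute:  ${per_minute:.3f}")   # $0.058
print(f"per session: ${per_session:.2f}")  # $3.48
print(f"per month:   ${per_month:.2f}")    # $208.80
```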
It was evident that Twilio was a deal-breaker for us, and we started looking for alternatives. Of course, Twilio has some volume-based discounts, but in this case even a 30–40% discount couldn’t fit into our business model.
Agora — next step for LiveBoard’s voice API
After thoroughly researching the market (considering all the top alternatives), we settled on Agora, which again had all the necessary SDKs and good documentation, and whose audio conference pricing was more affordable than Twilio’s.
With our already extensive expertise in Audio/video integration, it took about one month to move all applications from Twilio to Agora. The documentation wasn’t perfect, but good enough to implement.
It has a similar price calculation mechanism per user per min and provides 10,000 monthly free minutes as credit.
Let’s do the same use case calculation for Agora:
The audio call cost is $0.99 per user per 1,000 minutes, or $0.00099 per minute.
There was no extra conference API charge.
The audio conference cost for a 10-person group was $0.0099 per minute.
This makes the cost of one 60-minute session $0.594.
This was almost six times cheaper than Twilio ($0.594 versus $3.48 per session). For a while it was an acceptable solution for us, though again not a great match for our business model, since for typical users the voice call was the highest cost in our pricing.
The main problem with Agora was its high video API pricing. So, when we decided to add video calls to LiveBoard, we started researching solutions again.
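As a quick check, here is a minimal Python sketch comparing the per-session cost of the two providers using the rates quoted in this article. The helper name `session_cost` is made up for illustration, and Agora's free monthly minutes are ignored.

```python
def session_cost(per_user_minute_rate, participants=10, minutes=60):
    """Cost in dollars of one session at a flat per-user, per-minute rate."""
    return per_user_minute_rate * participants * minutes


twilio = session_cost(0.0040 + 0.0018)  # call rate + conference charge
agora = session_cost(0.99 / 1000)       # $0.99 per 1,000 minutes

print(f"Twilio: ${twilio:.3f}")           # Twilio: $3.480
print(f"Agora:  ${agora:.3f}")            # Agora:  $0.594
print(f"ratio:  {twilio / agora:.1f}x")   # ratio:  5.9x
```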
Zoom — the next obvious choice
This was already the post-Covid era, when Zoom was super popular, and of course we took another look at the Zoom API. You might expect Zoom to be very affordable, since it has generous free plans and an unlimited professional plan for just $14 per month. By now you won’t be surprised: its fully integrated API with per-minute pricing was even more expensive than all the previous solutions.
It may seem strange, but a host who pays $14 per month to use Zoom directly could easily exceed $1000 per month for the same group sizes and minutes through a full LiveBoard integration.
That is why we decided on a Zoom semi-integration: we integrated Zoom into LiveBoard using widgets, but still required the host to connect their own Zoom account to start using video calls.
The only difference from full integration was the requirement to connect your account. We still used the Zoom mobile and web SDKs for integration instead of redirecting to the external Zoom application. But it was a buggy solution, and we faced multiple limitations and UI problems during integration. During the six months we were actively using Zoom, our problems were never addressed by the Zoom development team.
Publishing to Zoom Marketplace. To be able to manage users’ Zoom accounts, you need to publish your application in their marketplace. While you can quickly create a draft application for testing the integration (with visible limitations), it can take up to a month to publish the app for production use, as it needs to pass a range of reviews covering content, security, etc. Each review can take up to a week; after every rejection, you must make changes and wait again. In our case, the approval process took longer than the actual integration.
LiveKit — Self-Hosted Solution
After several months of continuous problems with Zoom, we decided to look for other solutions. There were a lot of services like Twilio and Agora out there, but we already knew their cons, the main one always being the cost. Eventually, we decided to try out self-hosted solutions. LiveKit was gaining popularity, and after comparing it with several other options, we decided to try it in our internal environment.
The integration took nearly two weeks to set up, and after successful internal testing, it was deployed to production. The price calculation for self-hosted solutions is more complex than with services like Twilio. It consists of two main parts:
- Servers where the solution is hosted
- Network bandwidth used for the actual meetings
In our case, servers are hosted on AWS, which offers a vast range of machine types. A LiveKit instance hosted on a single AWS EC2 instance that costs ~$60 per month can handle up to 200 simultaneous users. The network cost is ~$40 per month under high load. That is about $100 per month for 200 users, so the final price of video conferencing is about $0.5 per customer, which is far cheaper than any service we tried. It’s worth mentioning that this setup has hidden costs that are not so easily calculated.
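One way to arrive at the per-customer figure is to divide the monthly infrastructure cost by the instance capacity. The Python sketch below uses the estimates quoted above and assumes the instance is fully used by 200 customers. That reading of the number is an assumption, and the hidden costs (such as engineering time) are not included.

```python
SERVER_COST = 60.0   # $ per month, single EC2 instance (estimate quoted above)
NETWORK_COST = 40.0  # $ per month under high load (estimate quoted above)
CAPACITY = 200       # simultaneous users one instance can handle

monthly_total = SERVER_COST + NETWORK_COST
cost_per_user = monthly_total / CAPACITY  # assumes the instance is fully used

# prints: $100 per month / 200 users = $0.50 per user
print(f"${monthly_total:.0f} per month / {CAPACITY} users = ${cost_per_user:.2f} per user")
```

Unlike the per-minute pricing of hosted providers, this cost stays roughly fixed regardless of how many minutes each user spends in calls, as long as the load fits on the instance.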
Also, it is worth mentioning that LiveKit offers a solution to enable real-time analytics and telemetry of your WebRTC servers to help you analyze usage and end-to-end user experience in your application.
When using self-hosted solutions, your development team is responsible for resolving any problem. There is built-in scaling functionality, but your development team has to set it up manually. Keep this in mind before going down this path.
There is no winning solution for all cases. Depending on your needs, you need to choose the right vendor. This would be my quick summary of different vendors’ best use cases:
- Use Twilio when you use a wide range of their solutions. Twilio has an excellent infrastructure with multiple APIs in one place. I would use it for toll-free numbers and SMS sending, but never for audio/video conferencing.
- Agora is one of the best choices for audio/video call integration. Its audio conference pricing is also among the most affordable on the market.
- Zoom is an excellent choice when you use it as is, without trying to embed it into your application. I would always use Zoom if sharing Zoom links in the application is enough and no API integration is necessary.
- LiveKit or a similar self-hosted solution is your only choice when you need long group calls with multiple participants.