Zoom Developer Forum

How to capture Zoom audio and video with a meeting bot

Updated at:
May 1, 2026
Written By:
Aydin Schwartz

Question

People commonly ask on the Zoom Developer Forum:

How can I build a bot that automatically joins Zoom meetings, captures raw audio and video streams, and processes this data for tasks such as transcription or analysis? What are the necessary steps to implement this using the Zoom SDK, and how do concepts like meeting sessions relate to the meeting ID or URL? Additionally, can the bot send audio back into the meeting, and what are the considerations for handling multiple concurrent meetings?

Answer

A bot that joins Zoom meetings and captures audio and video streams can be built with the Zoom Meeting SDK, specifically its Raw Data feature. The steps below are based on community answers:

  1. Provision Infrastructure:
  • Set up a server environment on a cloud provider such as AWS, GCP, or DigitalOcean to host your bot instances.
  2. Join Meetings:
  • Use the Zoom Meeting SDK (available for Windows, macOS, and Linux) to programmatically start a client instance and join the desired meeting. You will need the meeting ID or URL of the meeting to join.
  3. Access Raw Media Streams:
  • Use the Raw Data APIs provided by the SDK to extract audio and video streams:
    • Video arrives as raw I420 frames.
    • Audio arrives as raw 16-bit little-endian PCM (PCM 16LE).
  • These streams are uncompressed, so you must handle encoding and processing yourself after extraction.
  4. Scaling for Concurrency:
  • To join multiple meetings simultaneously, run multiple instances of your bot, spread across different servers as needed. For Linux-based deployments, consider running each instance in its own Docker container to manage resources effectively.
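To make the raw formats in step 3 more concrete, here is a minimal Python sketch of what handling them could look like once the bytes have left the SDK callback. It does not call the Zoom SDK at all. The function names are hypothetical, the frames are assumed to be tightly packed (no row padding), and the width, height, sample rate, and channel count are assumed to come from whatever the SDK reports alongside each buffer.

```python
import wave
from typing import NamedTuple


class I420Planes(NamedTuple):
    y: bytes  # full-resolution luma plane
    u: bytes  # quarter-resolution chroma plane
    v: bytes  # quarter-resolution chroma plane


def i420_frame_size(width: int, height: int) -> int:
    """Bytes in one tightly packed I420 frame (Y plane plus two subsampled chroma planes)."""
    chroma = ((width + 1) // 2) * ((height + 1) // 2)
    return width * height + 2 * chroma


def split_i420(frame: bytes, width: int, height: int) -> I420Planes:
    """Split a tightly packed I420 buffer into its Y, U, and V planes."""
    expected = i420_frame_size(width, height)
    if len(frame) != expected:
        raise ValueError(f"expected {expected} bytes for {width}x{height}, got {len(frame)}")
    y_size = width * height
    c_size = ((width + 1) // 2) * ((height + 1) // 2)
    return I420Planes(
        y=frame[:y_size],
        u=frame[y_size:y_size + c_size],
        v=frame[y_size + c_size:],
    )


def write_pcm16le_wav(path: str, pcm: bytes, sample_rate: int, channels: int = 1) -> None:
    """Wrap raw PCM 16LE samples in a WAV container so standard tools can read them.

    Assumes a little-endian host, where the wave module writes the bytes unchanged.
    """
    if len(pcm) % (2 * channels) != 0:
        raise ValueError("PCM buffer does not contain a whole number of 16-bit frames")
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
```

A 640x360 frame, for example, occupies 640 × 360 × 1.5 = 345,600 bytes, which is why uncompressed video is usually passed to an encoder rather than stored as-is. The WAV helper is just one convenient way to hand captured audio to a transcription tool.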

Third-Party Alternatives:

  • If building and maintaining the bot infrastructure seems daunting, consider a third-party service such as Recall.ai, which provides a notetaker API that captures raw audio and video without extensive setup. Check out this blog to learn how to build a Zoom notetaker using Recall.ai.

If you decide to build with the Zoom SDK instead, you'll need to familiarize yourself with the Zoom SDK recording permissions.

By following these steps, you can successfully create a bot that captures audio and video from Zoom meetings for various applications, including transcription and analysis.
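On how the meeting ID relates to the join URL: a typical join link embeds the numeric meeting ID in its path and, optionally, a passcode token in the `pwd` query parameter. The sketch below accepts either form and extracts the values a bot would pass to its join call. The `/j/<id>?pwd=<token>` layout is an assumption based on common Zoom links, other link styles may exist, and `parse_join_target` and `JoinTarget` are hypothetical names, not part of any Zoom SDK.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import parse_qs, urlparse


class JoinTarget(NamedTuple):
    meeting_number: str        # digits only, kept as a string
    passcode: Optional[str]    # value of the pwd parameter, if present


def parse_join_target(value: str) -> JoinTarget:
    """Accept a bare meeting ID ("123 4567 8901") or a join URL (assumed /j/<id>?pwd=...)."""
    value = value.strip()
    if "://" not in value:
        # Bare ID: drop the spaces or dashes people add for readability.
        digits = re.sub(r"[\s-]", "", value)
        if not re.fullmatch(r"[0-9]+", digits):
            raise ValueError(f"not a meeting ID: {value!r}")
        return JoinTarget(digits, None)

    parsed = urlparse(value)
    match = re.search(r"/j/([0-9]+)", parsed.path)
    if match is None:
        raise ValueError(f"no meeting ID found in URL path: {parsed.path!r}")
    passcode = parse_qs(parsed.query).get("pwd", [None])[0]
    return JoinTarget(match.group(1), passcode)
```

For example, `parse_join_target("https://us02web.zoom.us/j/12345678901?pwd=AbCdEf")` returns `JoinTarget(meeting_number='12345678901', passcode='AbCdEf')`, and a bare ID such as `"123 4567 8901"` returns the same meeting number with no passcode.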

