Zoom Developer Forum

How to capture Zoom audio and video with a meeting bot

Updated at:
October 14, 2025
Written By:
Aydin Schwartz

Question

People commonly ask on the Zoom Developer Forum:

How can I build a bot that automatically joins Zoom meetings, captures raw audio and video streams, and processes this data for tasks such as transcription or analysis? What are the necessary steps to implement this using the Zoom SDK, and how do concepts like meeting sessions relate to the meeting ID or URL? Additionally, can the bot send audio back into the meeting, and what are the considerations for handling multiple concurrent meetings?

Answer

You can build a bot that joins Zoom meetings and captures audio and video streams with the Zoom Meeting SDK, specifically its Raw Data functionality. The steps below are based on community answers:

  1. Provision Infrastructure:
  • Set up a server environment using cloud services such as AWS, GCP, or DigitalOcean to host your bot instances.
  2. Join Meetings:
  • Use the Zoom Meeting SDK (available for Windows, macOS, and Linux) to programmatically launch an instance of the Zoom client and join the desired meeting. You will need to provide the meeting ID or URL for this purpose.
  3. Access Raw Media Streams:
  • Utilize the Raw Data APIs provided by the SDK to extract audio and video streams:
    • Video will be received as I420 raw frames.
    • Audio will be in PCM 16LE raw format.
  • You will need to handle the encoding and processing of these streams yourself after extraction.
  4. Scaling for Concurrency:
  • If you need to join multiple meetings simultaneously, you can run multiple instances of your bot across different servers. For Linux-based deployments, consider using separate Docker containers for each instance to manage resources effectively.
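
As a rough illustration of the one-container-per-meeting approach, the Python sketch below turns a list of join URLs into `docker run` commands. Several things here are assumptions and do not come from the Zoom SDK: the image name `example/zoom-bot:latest`, the `MEETING_ID` and `MEETING_PASSCODE` environment variables, and the helper names are hypothetical, and the URL parsing assumes the common `https://<host>/j/<meeting id>?pwd=<passcode>` layout. It only builds the commands; starting the containers is left to you.

```python
import re
from typing import List, Optional, Tuple
from urllib.parse import parse_qs, urlparse

# Hypothetical image name: replace with an image that bundles your own bot.
BOT_IMAGE = "example/zoom-bot:latest"


def parse_join_url(url: str) -> Tuple[str, Optional[str]]:
    """Extract the meeting ID and optional passcode from a join URL.

    Assumes the common https://<host>/j/<meeting id>?pwd=<passcode> layout.
    """
    parsed = urlparse(url)
    match = re.search(r"/j/(\d+)", parsed.path)
    if match is None:
        raise ValueError(f"no meeting ID found in {url!r}")
    passcode = parse_qs(parsed.query).get("pwd", [None])[0]
    return match.group(1), passcode


def docker_command(url: str) -> List[str]:
    """Build a `docker run` command that starts one isolated bot container."""
    meeting_id, passcode = parse_join_url(url)
    cmd = [
        "docker", "run", "--rm", "--detach",
        "--name", f"zoom-bot-{meeting_id}",
        # Hypothetical variables that your own bot entrypoint would read.
        "--env", f"MEETING_ID={meeting_id}",
    ]
    if passcode is not None:
        cmd += ["--env", f"MEETING_PASSCODE={passcode}"]
    return cmd + [BOT_IMAGE]


if __name__ == "__main__":
    meetings = [
        "https://example.zoom.us/j/1234567890?pwd=abc123",
        "https://example.zoom.us/j/987654321",
    ]
    for url in meetings:
        print(" ".join(docker_command(url)))
```

Naming each container after its meeting ID makes it easier to see which bot is in which meeting and to stop a single bot without touching the others.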

Third-Party Alternatives:

  • If building and maintaining the bot infrastructure seems daunting, consider using third-party services like Recall.ai, which provides meeting bots that can capture raw audio and video without the need for extensive setup.

By following these steps, you can successfully create a bot that captures audio and video from Zoom meetings for various applications, including transcription and analysis.
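
To make the raw formats from step 3 more concrete, here is a minimal Python sketch of what handling them yourself can look like once the bytes have left the SDK. It does not call the Zoom SDK; the function names are hypothetical. It assumes each video frame arrives as one tightly packed I420 buffer with no row padding, and that the audio is signed 16-bit little-endian PCM. The sample rate and channel count are not stated above, so check them in your own setup.

```python
import array
import sys


def i420_frame_size(width: int, height: int) -> int:
    """Bytes in one I420 frame: a full-size Y plane plus quarter-size U and V planes."""
    chroma_w, chroma_h = (width + 1) // 2, (height + 1) // 2
    return width * height + 2 * chroma_w * chroma_h


def split_i420(frame: bytes, width: int, height: int):
    """Return the (Y, U, V) planes of a tightly packed I420 frame."""
    if len(frame) != i420_frame_size(width, height):
        raise ValueError("buffer size does not match the frame dimensions")
    y_size = width * height
    c_size = ((width + 1) // 2) * ((height + 1) // 2)
    return (
        frame[:y_size],
        frame[y_size:y_size + c_size],
        frame[y_size + c_size:],
    )


def decode_pcm16le(chunk: bytes) -> array.array:
    """Decode signed 16-bit little-endian PCM bytes into integer samples."""
    samples = array.array("h")  # "h" is a signed 16-bit integer on common platforms
    samples.frombytes(chunk)
    if sys.byteorder == "big":
        samples.byteswap()  # the data is little-endian, so swap on big-endian hosts
    return samples


if __name__ == "__main__":
    # A blank 640x360 frame stands in for a frame received from the bot.
    y, u, v = split_i420(bytes(i420_frame_size(640, 360)), 640, 360)
    print(len(y), len(u), len(v))  # 230400 57600 57600

    # Three samples: silence, the maximum value, and the minimum value.
    print(list(decode_pcm16le(b"\x00\x00\xff\x7f\x00\x80")))  # [0, 32767, -32768]
```

From here the planes and samples can be passed to whatever encoder or transcription pipeline you choose.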

