
The Pain of Getting RTSP Streams to Cloud AI Pipelines — and How We Approached It

📅 March 22, 2026 ⏱️ 9 min read 🏷️ AI, RTSP, Video Analytics, Security

If you've ever tried to pipe live IP camera streams into a cloud AI pipeline — object detection, people counting, license plate recognition, anomaly detection — you know the options aren't great. The cameras live on a private LAN. Your AI servers live in the cloud. And the gap between them is surprisingly painful to bridge securely, reliably, and without an unreasonable engineering investment.

This is a problem we ran into constantly when talking to security operators, integrators, and AI vendors. So we built TheRelay specifically to solve it. Here's an honest breakdown of why the problem is hard, what it actually costs, and how we approached it differently.

Why Bridging Cameras to AI Is Harder Than It Looks

IP cameras serve RTSP on your local network — typically on port 554. That stream is accessible from any device on the same LAN, making it easy for on-premise NVRs and VMS software to consume it. But the moment you want a cloud service to process that video — whether it's your own model or a third-party AI vendor — you hit a wall.

Cloud servers cannot reach private LAN devices. Your router's NAT blocks unsolicited inbound connections. The camera has no idea the internet exists. And the AI server, sitting in AWS or GCP, has no path to 192.168.1.100:554.

This leaves you with a short list of options, all of which have significant problems:

  • Port forward port 554 — opens your camera to the entire internet
  • Set up a VPN — works, but requires VPN client software on every AI server, and doesn't scale to external vendors
  • Self-host a relay server — pulls the stream locally, re-publishes to the cloud; works, but engineering cost is high
  • Run AI on-premise — avoids the problem but limits your AI options and hardware choices

And if you want to evaluate multiple AI vendors before committing? Now you're multiplying the complexity. Each vendor needs access to your streams. Each integration is a separate tunnel, credential share, or firewall rule.

The Obvious (Bad) Option: Port Forwarding RTSP

Port forwarding is the path of least resistance. You log into your router, forward port 554 to your camera, and hand the public IP to your AI vendor. It works — until it doesn't.

The security reality: Port forwarding RTSP means your camera's authentication interface is reachable by anyone on the internet, 24/7. Hikvision's CVE-2021-36260 (CVSS 9.8) allowed unauthenticated remote code execution via a crafted HTTP request. Dahua's CVE-2021-33044 and CVE-2021-33045 allowed complete authentication bypass. Thousands of cameras with these unpatched vulnerabilities are indexed on Shodan right now.

Beyond the security risk, port forwarding has practical problems for AI integrations:

  • Dynamic IPs break everything. Most ISPs change your public IP periodically. Any integration that hardcodes your IP will break without warning.
  • Multiple cameras require multiple ports. Camera 1 gets 554, camera 2 gets 555, camera 3 gets 556. Managing this across 10, 20, or 50 cameras is a maintenance nightmare.
  • No access control. Once the port is open, anyone who discovers it can attempt to connect. There's no token revocation, no per-consumer access scope, no audit trail.
  • Credential exposure. To connect to an RTSP stream, you typically need a username and password. Sharing camera credentials with AI vendors means those credentials are now outside your control forever. Rotating them requires updating every integration.
  • CGNAT makes it impossible for many users. ISPs increasingly use Carrier-Grade NAT — multiple customers sharing a single public IP. Port forwarding simply doesn't work in this scenario, full stop.

The Approach We Took at TheRelay

We built a lightweight agent that runs on your local network — on any macOS, Windows, or Linux machine that has access to your cameras. The agent reads RTSP streams locally (camera credentials never leave the LAN), and pushes them outbound to TheRelay Cloud over an encrypted SRT connection.

Because the connection is outbound, there is zero firewall or router configuration. No ports opened. No router access needed. No CGNAT issues. The agent connects to the cloud the same way your browser opens HTTPS — from the inside out.

Your AI pipeline then consumes the stream from a secure cloud endpoint. You get RTSP, SRT, HLS, RTMP, and WebRTC output — so it works with whatever protocol your AI framework speaks. Each endpoint is access-controlled by a token you generate and can revoke at any time.

Why We Made Certain Design Choices

🔒
SRT with AES encryption for ingest

RTSP over the open internet is fragile — it was designed for LANs and doesn't handle packet loss, jitter, or NAT traversal gracefully. SRT (Secure Reliable Transport) has built-in retransmission, congestion control, and AES-128/256 encryption. For the LAN-to-cloud leg, it's a dramatically more reliable transport than raw RTSP, with encryption that RTSP simply doesn't have.
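The LAN-to-cloud leg described above can be sketched with plain ffmpeg, which supports SRT output via libsrt. The host, port, stream ID, and passphrase below are hypothetical placeholders; the ffmpeg/libsrt options (`passphrase`, `pbkeylen`) are standard.

```python
# Sketch: build an ffmpeg command that reads a LAN RTSP stream and pushes it
# to a cloud ingest point over SRT with AES encryption. Hostname, port, and
# stream ID are illustrative, not a real endpoint.

def build_srt_push_cmd(rtsp_url: str, srt_host: str, srt_port: int,
                       stream_id: str, passphrase: str) -> list[str]:
    # pbkeylen=32 selects AES-256; 16 would select AES-128.
    srt_url = (f"srt://{srt_host}:{srt_port}"
               f"?streamid={stream_id}&passphrase={passphrase}&pbkeylen=32")
    return [
        "ffmpeg",
        "-rtsp_transport", "tcp",  # TCP interleaving is more robust than UDP on lossy links
        "-i", rtsp_url,            # camera credentials stay inside this local URL
        "-c", "copy",              # no re-encode: relay the camera's own H.264/H.265
        "-f", "mpegts",            # SRT carries an MPEG-TS container
        srt_url,
    ]

cmd = build_srt_push_cmd(
    "rtsp://user:pass@192.168.1.100:554/Streaming/Channels/101",
    "ingest.example.net", 9000, "cam-01", "correct-horse-battery",
)
```

Note the `-c copy`: because SRT handles loss recovery itself, the agent can forward the camera's compressed stream untouched instead of transcoding it.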

🏠
Camera credentials stay on the LAN

The agent authenticates to the camera locally using the RTSP URL you provide (including username and password). Those credentials are never transmitted to the cloud — TheRelay Cloud only receives the encrypted video stream. Even if you revoke access or cancel your account, your camera passwords were never exposed outside your network.

🎫
Token-scoped access per consumer

Every AI vendor, developer, or team that needs stream access gets their own access token. You can scope a token to a single camera, a group, or an entire site. Tokens can be revoked instantly — no password rotation, no camera reconfiguration, no coordination with the consumer. When a vendor's evaluation is over, one click ends their access.
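The token model above boils down to a simple idea: access is a revocable mapping from token to camera scope, separate from camera credentials. A minimal in-memory sketch (field names and scope shapes are illustrative, not TheRelay's actual API):

```python
# Minimal sketch of per-consumer, scope-limited, instantly revocable tokens.
import secrets

class TokenRegistry:
    def __init__(self):
        self._tokens = {}  # token -> set of camera IDs it may access

    def issue(self, camera_ids: set[str]) -> str:
        token = secrets.token_urlsafe(32)  # unguessable, URL-safe
        self._tokens[token] = set(camera_ids)
        return token

    def allows(self, token: str, camera_id: str) -> bool:
        # Revoked or unknown tokens fail closed.
        return camera_id in self._tokens.get(token, set())

    def revoke(self, token: str) -> None:
        # One call ends this consumer's access; no camera passwords change.
        self._tokens.pop(token, None)

registry = TokenRegistry()
vendor_a = registry.issue({"cam-01"})            # scoped to a single camera
vendor_b = registry.issue({"cam-01", "cam-02"})  # scoped to a group
registry.revoke(vendor_a)                        # evaluation over: instant cut-off
```

The key property is that revocation touches only the registry — the camera, the agent, and every other consumer are unaffected.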

🔌
Multi-protocol output

Cloud AI pipelines don't all speak the same protocol. Some use RTSP directly (OpenCV, GStreamer, FFmpeg-based pipelines). Some consume HLS. Broadcast tools prefer SRT. Browser-based dashboards need WebRTC. Rather than forcing a single output format, we serve all of them from the same cloud endpoint — one stream in, five protocol outputs out.

📡
ONVIF auto-discovery

Manually finding the RTSP URL for every camera in a large deployment is tedious and error-prone. The agent can send ONVIF WS-Discovery probes over LAN multicast and automatically enumerate all ONVIF-compliant cameras — importing their names, RTSP paths, and stream profiles without requiring manual lookup.
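Under the hood, WS-Discovery is a multicast SOAP exchange: the agent sends a Probe to the fixed group 239.255.255.250 on UDP port 3702, and each ONVIF device answers with a ProbeMatch carrying its service address. A sketch of building the standard probe message (sending and parsing replies omitted):

```python
# Sketch: construct a WS-Discovery Probe for ONVIF cameras. The namespaces
# and multicast address are fixed by the WS-Discovery/ONVIF specs.
import uuid

WS_DISCOVERY_ADDR = ("239.255.255.250", 3702)  # fixed multicast group + port

def build_probe() -> bytes:
    msg_id = uuid.uuid4()  # each probe needs a unique MessageID
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<e:Envelope xmlns:e="http://www.w3.org/2003/05/soap-envelope"
            xmlns:w="http://schemas.xmlsoap.org/ws/2004/08/addressing"
            xmlns:d="http://schemas.xmlsoap.org/ws/2005/04/discovery"
            xmlns:dn="http://www.onvif.org/ver10/network/wsdl">
  <e:Header>
    <w:MessageID>uuid:{msg_id}</w:MessageID>
    <w:To>urn:schemas-xmlsoap-org:ws:2005:04:discovery</w:To>
    <w:Action>http://schemas.xmlsoap.org/ws/2005/04/discovery/Probe</w:Action>
  </e:Header>
  <e:Body>
    <d:Probe><d:Types>dn:NetworkVideoTransmitter</d:Types></d:Probe>
  </e:Body>
</e:Envelope>""".encode()

probe = build_probe()
```

In a real discovery pass you would `sendto` this over a UDP socket to `WS_DISCOVERY_ADDR` and collect replies for a few seconds; each reply's `XAddrs` field gives the device service URL from which stream profiles and RTSP paths can be queried.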

Testing Multiple AI Vendors on Real Streams

One of the more unexpected use cases that came out of our token model: running parallel vendor evaluations on real production streams.

The typical enterprise AI evaluation process looks like this: sign an NDA, spend two weeks getting the vendor a test stream, evaluate for a month, off-board them, repeat with the next vendor. The off-boarding alone — rotating camera passwords, closing firewall rules, confirming they've lost access — takes hours per vendor and involves your network team.

With TheRelay, the evaluation model is completely different:

  • Spin up a token for Vendor A. Give them the endpoint URL and token. They connect and start evaluating — typically the same day.
  • Spin up a separate token for Vendor B. Same camera, different token. Both vendors are now consuming your actual live stream in parallel.
  • Run your evaluation — detection quality, latency, false positive rate — using real footage from your actual cameras, not curated test clips.
  • When Vendor A doesn't meet your requirements: revoke the token. They lose access immediately. No password changes. No camera reconfiguration. No coordination.
  • Vendor B keeps running. Add Vendor C whenever you're ready.

The key point: vendor evaluations based on real streams produce fundamentally different results than evaluations on demo footage. Your cameras have specific lighting conditions, camera angles, compression settings, and scene types. An AI model that performs well on curated test video may fail on your actual footage — and you only find that out after committing to a contract. Real-stream testing, done safely with token-scoped access, changes that calculus entirely.
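Scoring those parallel evaluations is straightforward once both vendors consume the same stream: label a sample of events from your own footage and compare each vendor's detections against it. The numbers below are made up for illustration.

```python
# Sketch: compare two vendors' detections against ground truth labeled from
# your own stream. Data is illustrative, not real evaluation results.

def false_positive_rate(detections: list[bool], ground_truth: list[bool]) -> float:
    # Fraction of true-negative events where the vendor fired anyway.
    fp = sum(1 for d, t in zip(detections, ground_truth) if d and not t)
    negatives = sum(1 for t in ground_truth if not t)
    return fp / negatives if negatives else 0.0

truth    = [True, False, False, True, False, False, False, True]
vendor_a = [True, True,  False, True, False, True,  False, True]   # fires too eagerly
vendor_b = [True, False, False, False, False, False, False, True]  # misses one event

fpr_a = false_positive_rate(vendor_a, truth)  # 2 of 5 negatives -> 0.4
fpr_b = false_positive_rate(vendor_b, truth)  # 0 of 5 negatives -> 0.0
```

The same harness extends naturally to recall and latency; the point is that both columns come from identical live footage, so the comparison is apples to apples.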

The Token Lifecycle for Vendor Evaluations

| Stage | Old way (port forward) | With TheRelay tokens |
| --- | --- | --- |
| Grant access | Open firewall rule, share camera password | Generate token, share endpoint URL |
| Credential exposure | Camera password sent to vendor | No camera credentials shared |
| Multiple vendors simultaneously | Possible, chaotic to manage | One token per vendor, fully isolated |
| Revoke access | Close firewall rule + rotate camera password | Revoke token — instant, zero coordination |
| Audit who has access | No visibility | Token list in dashboard |
| Time to onboard vendor | Hours to days | Minutes |

How to Set It Up in Practice

Getting your RTSP cameras to a cloud AI endpoint with TheRelay takes about 10 minutes the first time.

  1. Create a TheRelay account. Sign up at app.therelay.net. The first stream is free.
  2. Install the agent on your LAN. Download the agent for macOS, Windows, or Linux and install it on any machine that can reach your cameras. It runs as a background service and connects outbound to the cloud — no inbound ports needed.
  3. Add your cameras. Paste your RTSP URLs into the dashboard, or let the agent run ONVIF auto-discovery on your LAN. Hikvision example:
    rtsp://username:password@192.168.1.100:554/Streaming/Channels/101
    Credentials stay local. The agent reads the stream and pushes it encrypted to the cloud.
  4. Create access tokens. In the dashboard, create a token scoped to the camera(s) you want to share. For vendor evaluations, create one token per vendor so you can revoke them independently.
  5. Share the cloud endpoint with your AI pipeline or vendor. Your AI servers can now consume the stream using standard protocols:
    # RTSP — works with OpenCV, GStreamer, FFmpeg, most ML frameworks
    rtsp://stream.therelay.net/camera/{id}?token={token}
    
    # HLS — works in any browser or media player
    https://stream.therelay.net/hls/{id}/index.m3u8?token={token}
    
    # SRT — for broadcast tools and low-latency pipelines
    srt://stream.therelay.net:9000?streamid={id}&token={token}
    
    # WebRTC — sub-second latency in the browser
    https://stream.therelay.net/embed/webrtc/{id}?token={token}
  6. Revoke when done. When an evaluation is complete or a vendor's access should end, revoke their token in the dashboard. Access is terminated immediately. No other action required.
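The consumer URLs in step 5 all follow the same shape (host, stream ID, token as a query parameter), so a pipeline evaluating several vendors can assemble them programmatically. A small helper, mirroring the patterns shown above — treat the host and paths as illustrative, and URL-encode the token in case it contains reserved characters:

```python
# Helper that assembles the tokenized consumer URLs from step 5.
from urllib.parse import quote

HOST = "stream.therelay.net"

def endpoint(protocol: str, stream_id: str, token: str) -> str:
    t = quote(token, safe="")  # tokens go in a query string, so percent-encode
    if protocol == "rtsp":
        return f"rtsp://{HOST}/camera/{stream_id}?token={t}"
    if protocol == "hls":
        return f"https://{HOST}/hls/{stream_id}/index.m3u8?token={t}"
    if protocol == "srt":
        return f"srt://{HOST}:9000?streamid={stream_id}&token={t}"
    if protocol == "webrtc":
        return f"https://{HOST}/embed/webrtc/{stream_id}?token={t}"
    raise ValueError(f"unknown protocol: {protocol}")

url = endpoint("rtsp", "cam-01", "tok_abc123")
```

An OpenCV-based pipeline, for instance, could pass `endpoint("rtsp", ...)` straight to `cv2.VideoCapture`, with one token per vendor process.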

Start Testing AI Vendors on Real Streams

One stream free. No port forwarding. Camera credentials stay on your LAN. Revoke vendor access in one click.

Get Started Free →

Frequently Asked Questions

Can cloud AI models consume RTSP streams directly from IP cameras?

Not without network access to the camera. IP cameras sit on private LANs that cloud servers cannot reach. Without port forwarding, a VPN, or a relay service, there is no path from a cloud AI server to 192.168.x.x:554. Port forwarding works but exposes your cameras to the internet with no protection. A cloud relay is the secure alternative — it tunnels the stream outbound from your LAN, and the AI server consumes a cloud endpoint instead.

How do I share camera streams with an AI vendor without exposing my camera passwords?

Use a relay with token-based access control. With TheRelay, you generate a scoped token in the dashboard and share the cloud endpoint URL + token with the vendor. Your camera credentials never leave your LAN — the vendor only receives the encrypted video stream via their token. When the evaluation ends, revoke the token and they lose access instantly. No password rotation, no camera reconfiguration.

Can I run multiple AI vendors on the same camera stream simultaneously?

Yes. TheRelay lets you create multiple tokens for the same stream. Each vendor gets their own token with no visibility into other consumers or your local setup. All vendors receive the same live stream in parallel — you don't need multiple cameras, duplicate streams, or separate relay instances. Revoke tokens independently as evaluations complete.

Why is SRT better than RTSP for the LAN-to-cloud leg?

RTSP was designed for local networks. Over the open internet, it struggles with packet loss, jitter, and NAT traversal — and it has no built-in encryption. SRT (Secure Reliable Transport) was built for unreliable networks: it has ARQ retransmission, congestion control, and AES-128/256 encryption baked in. For the LAN-to-cloud transport leg, SRT produces more stable, more secure streams than raw RTSP over a tunnel.

Does this work behind CGNAT?

Yes. CGNAT (Carrier-Grade NAT) means your ISP assigns a shared public IP, making port forwarding impossible — your router doesn't actually own the public IP. Because TheRelay uses outbound-only connections (same as HTTPS from a browser), the agent works perfectly behind CGNAT, double NAT, or any other network topology. No public IP needed.