Live Streaming Video Chat App using cv2 module of Python

Sampreethi Bokka
4 min read · Jun 12, 2021

We will use the cv2 module of the OpenCV library to capture the video.

Some basics to keep in mind before starting:

  1. Data travels over the network as bytes. We are using Python for the live stream, so every Python object has to be converted into a stream of bytes before it is sent to the receiver.
  2. Python's pickle library converts Python objects into a stream of bytes. A video is nothing but a continuous sequence of pictures, and to a computer a picture is just numbers.
  3. We will use the cv2 module to connect to the webcam and capture pictures; pictures captured continuously make up a video.
  4. Each captured picture is a Python object, so it has to be converted into a stream of bytes before it goes over the network.
  5. A captured picture is stored in memory as an array.
  6. The dumps() function in pickle converts the picture, which is of type numpy.ndarray, into bytes.
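The conversion described in the list above can be tried without a webcam. This is a minimal sketch, assuming a small nested list as a stand-in for the numpy.ndarray that cv2 returns; the name `frame` and the pixel values are made up for illustration.

```python
import pickle

# A tiny 2x2 "picture" of BGR pixels. A nested list stands in for the
# numpy.ndarray that cv2 would return (assumption, to keep the sketch
# free of third-party dependencies).
frame = [[(0, 0, 255), (0, 255, 0)],
         [(255, 0, 0), (255, 255, 255)]]

payload = pickle.dumps(frame)      # Python object -> bytes
restored = pickle.loads(payload)   # bytes -> Python object

print(type(payload).__name__)      # the payload is a bytes object
print(restored == frame)           # the round trip preserves the data
```

Keep in mind that pickle.loads() can execute code embedded in the data it reads, so it should only be used on bytes that come from a peer you trust.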

To start, I created two Python files: a server and a receiver.

  1. Both files import the cv2, socket, pickle and struct libraries.

To create a live streaming video chat app with the cv2 module, we use socket programming to connect the two ends of the network. There are two key nodes here:

  • The Sender(Server)
  • The Receiver(Client)

What is socket programming?

Socket programming is a way of connecting two nodes on a network so that they can communicate with each other. One socket (node) listens on a particular port at an IP address, while the other socket reaches out to it to form a connection. The server forms the listener socket, while the client reaches out to the server. A socket is created with no name: a remote process has no way to refer to it until an address is bound to it.
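The listener and connector roles can be seen on a single machine. The following is a minimal sketch, not part of the app itself: the helper name `serve_once`, the loopback address 127.0.0.1 and the b"hello" payload are assumptions chosen for illustration.

```python
import socket
import threading

def serve_once(listener, reply):
    """Accept one connection, send `reply`, then close the connection."""
    conn, _addr = listener.accept()
    with conn:
        conn.sendall(reply)

# Listener side: bind() gives the socket an address, listen() lets it accept.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))    # port 0 asks the OS for any free port
listener.listen(1)
host, port = listener.getsockname()

worker = threading.Thread(target=serve_once, args=(listener, b"hello"))
worker.start()

# Client side: reach out to the address the listener is bound to.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((host, port))
received = b""
while True:
    chunk = client.recv(1024)
    if not chunk:                  # empty bytes means the peer closed
        break
    received += chunk

client.close()
worker.join()
listener.close()
print(received)
```

Until bind() is called the listener has no address, which is why the client can only connect after the bind and listen steps have run.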

We are also going to use OpenCV, a library designed to solve computer vision problems; cv2 is its Python binding. In Python, all OpenCV array structures are converted to and from NumPy arrays.

server.py

import socket
import cv2
import pickle
import struct

# Socket Create
server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host_name = socket.gethostname()
host_ip = socket.gethostbyname(host_name)
print('HOST IP:', host_ip)
port = 1234
socket_address = (host_ip, port)

# Socket Bind
server_socket.bind(socket_address)

# Socket Listen
server_socket.listen(1)
print("LISTENING AT:", socket_address)

# Socket Accept
while True:
    client_socket, addr = server_socket.accept()
    print('GOT CONNECTION FROM:', addr)
    if client_socket:
        vid = cv2.VideoCapture(0)

        while vid.isOpened():
            ret, image = vid.read()
            if not ret:
                # No picture was captured (camera busy or unplugged)
                break
            # Prefix each pickled picture with its length as an 8-byte header
            img_serialize = pickle.dumps(image)
            message = struct.pack("Q", len(img_serialize)) + img_serialize
            try:
                client_socket.sendall(message)
            except ConnectionError:
                # The client went away
                break

            cv2.imshow('VIDEO FROM SERVER', image)
            key = cv2.waitKey(10)
            if key == 13:  # Enter key stops the stream
                break

        # Release the camera and the connection before accepting a new client
        vid.release()
        client_socket.close()
        cv2.destroyAllWindows()

client.py

import socket, cv2, pickle, struct

# create socket
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# server ip address here
host_ip = '192.168.38.23'
port = 1234
client_socket.connect((host_ip, port))
data = b""
metadata_size = struct.calcsize("Q")
connected = True
while connected:
    # Read the 8-byte header that carries the size of the next picture
    while len(data) < metadata_size:
        packet = client_socket.recv(4*1024)
        if not packet:
            connected = False  # server closed the connection
            break
        data += packet
    if not connected:
        break
    packed_msg_size = data[:metadata_size]
    data = data[metadata_size:]
    msg_size = struct.unpack("Q", packed_msg_size)[0]

    # Keep reading until the whole picture has arrived
    while len(data) < msg_size:
        packet = client_socket.recv(4*1024)
        if not packet:
            connected = False
            break
        data += packet
    if not connected:
        break
    frame_data = data[:msg_size]
    data = data[msg_size:]
    frame = pickle.loads(frame_data)
    cv2.imshow("RECEIVING VIDEO", frame)
    key = cv2.waitKey(10)
    if key == 13:  # Enter key stops the viewer
        break

client_socket.close()
cv2.destroyAllWindows()

Run server.py first and then client.py. The pictures are transferred fast enough that they appear as a video stream.
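Both scripts prefix every pickled picture with its length, packed with struct.pack("Q", ...), so that the receiver knows where one picture ends and the next begins. That framing can be isolated into a few helpers. This is a sketch under stated assumptions: the names `send_msg`, `recv_exact` and `recv_msg` are hypothetical, socket.socketpair() replaces the real network so the example runs on one machine, and the header uses "!Q" (network byte order) rather than the native "Q" from the scripts, which is a swap made here so that two machines with different byte orders would agree on the header.

```python
import pickle
import socket
import struct

# Assumption: "!Q" (8 bytes, network byte order) instead of the native "Q".
HEADER = struct.Struct("!Q")

def send_msg(sock, obj):
    """Pickle `obj` and send it with an 8-byte length prefix."""
    body = pickle.dumps(obj)
    sock.sendall(HEADER.pack(len(body)) + body)

def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return bytes(buf)

def recv_msg(sock):
    """Read one length-prefixed message and unpickle it."""
    (size,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return pickle.loads(recv_exact(sock, size))

# Two connected sockets in one process stand in for server and client.
a, b = socket.socketpair()
send_msg(a, {"frame": 1, "pixels": [0, 127, 255]})
send_msg(a, "second message")
first = recv_msg(b)
second = recv_msg(b)
a.close()
b.close()
print(first, second)
```

Because recv_exact() never reads past the end of the current message, the receiver needs no leftover buffer between pictures, and a closed connection surfaces as an exception instead of an endless loop.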

Now let's turn this into a video chat app with two clients, Client A and Client B.

You can try this setup with two laptops on the same Wi-Fi network.

Each client plays two roles, which gives four in total:

  1. Client A sending to Client B
  2. Client B receiving from Client A
  3. Client B sending to Client A
  4. Client A receiving from Client B

In this case we need multi-threading. Client A and Client B each start two threads that run simultaneously: one thread sends the video and the other receives it.
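The idea of one sending thread and one receiving thread per client can be sketched without cameras. The helper names `send_frames`, `receive_frames` and `run_peer` are hypothetical, short byte strings stand in for pictures, and, unlike the clients in this article (which open one connection per direction), the sketch assumes a single socket.socketpair() connection used in both directions.

```python
import socket
import threading

def send_frames(sock, frames):
    """Send every frame, then signal that nothing more will be sent."""
    for frame in frames:
        sock.sendall(frame)
    sock.shutdown(socket.SHUT_WR)

def receive_frames(sock, out):
    """Collect incoming bytes until the peer stops sending."""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        out.append(chunk)

def run_peer(sock, outgoing, incoming):
    """Start one sender thread and one receiver thread for a peer."""
    t_send = threading.Thread(target=send_frames, args=(sock, outgoing))
    t_recv = threading.Thread(target=receive_frames, args=(sock, incoming))
    t_send.start()
    t_recv.start()
    return [t_send, t_recv]

sock_a, sock_b = socket.socketpair()
got_a, got_b = [], []

# Each peer sends and receives at the same time.
threads = run_peer(sock_a, [b"A1", b"A2"], got_a)
threads += run_peer(sock_b, [b"B1", b"B2"], got_b)
for t in threads:
    t.join()

sock_a.close()
sock_b.close()
print(b"".join(got_a), b"".join(got_b))
```

Neither peer has to finish sending before it starts receiving, which is the property a two-way video chat relies on.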

Client A

import socket, cv2, pickle, struct, threading

# Start Client B first, then start Client A within 15 seconds.

def sender():
    # Socket Create: the listening socket is separate from the one that connects
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    host_name = socket.gethostname()
    host_ip = socket.gethostbyname(host_name)
    print('HOST IP:', host_ip)
    port = 1234
    socket_address = (host_ip, port)
    # Socket Bind
    server_socket.bind(socket_address)
    # Socket Listen
    server_socket.listen(5)
    print("LISTENING AT:", socket_address)
    # Socket Accept
    while True:
        client_socket, addr = server_socket.accept()
        print('GOT CONNECTION FROM:', addr)
        if client_socket:
            vid = cv2.VideoCapture(0)

            while vid.isOpened():
                ret, image = vid.read()
                if not ret:
                    break  # no picture was captured
                img_serialize = pickle.dumps(image)
                message = struct.pack("Q", len(img_serialize)) + img_serialize
                try:
                    client_socket.sendall(message)
                except ConnectionError:
                    break  # the other client went away

                cv2.imshow('VIDEO FROM SERVER', image)
                key = cv2.waitKey(10)
                if key == 13:  # Enter key stops sending
                    break

            vid.release()
            client_socket.close()

def connect_server():
    # A second socket, used only for receiving Client B's video
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Client B's IP address here
    host_ip = '192.168.38.23'
    port = 1234
    s.connect((host_ip, port))
    data = b""
    metadata_size = struct.calcsize("Q")
    while True:
        # Read the 8-byte header that carries the size of the next picture
        while len(data) < metadata_size:
            packet = s.recv(4*1024)
            if not packet:
                s.close()
                return  # Client B closed the connection
            data += packet
        packed_msg_size = data[:metadata_size]
        data = data[metadata_size:]
        msg_size = struct.unpack("Q", packed_msg_size)[0]

        # Keep reading until the whole picture has arrived
        while len(data) < msg_size:
            packet = s.recv(4*1024)
            if not packet:
                s.close()
                return
            data += packet
        frame_data = data[:msg_size]
        data = data[msg_size:]
        frame = pickle.loads(frame_data)
        cv2.imshow("RECEIVING VIDEO", frame)
        key = cv2.waitKey(10)
        if key == 13:  # Enter key stops receiving
            break
    s.close()


x1 = threading.Thread(target=sender)

x2 = threading.Thread(target=connect_server)

# start both threads
x1.start()
x2.start()

Client B

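import socket, cv2, pickle, struct, time
import threading

def connect_server():
    # Give Client A time to start listening
    time.sleep(15)
    # create socket: used only for receiving Client A's video
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Client A's IP address here
    host_ip = '192.168.38.23'
    port = 1234
    s.connect((host_ip, port))
    data = b""
    metadata_size = struct.calcsize("Q")
    while True:
        # Read the 8-byte header that carries the size of the next picture
        while len(data) < metadata_size:
            packet = s.recv(4*1024)
            if not packet:
                s.close()
                return  # Client A closed the connection
            data += packet
        packed_msg_size = data[:metadata_size]
        data = data[metadata_size:]
        msg_size = struct.unpack("Q", packed_msg_size)[0]

        # Keep reading until the whole picture has arrived
        while len(data) < msg_size:
            packet = s.recv(4*1024)
            if not packet:
                s.close()
                return
            data += packet
        frame_data = data[:msg_size]
        data = data[msg_size:]
        frame = pickle.loads(frame_data)
        cv2.imshow("RECEIVING VIDEO", frame)
        key = cv2.waitKey(10)
        if key == 13:  # Enter key stops receiving
            break
    s.close()

def sender():
    # The listening socket is separate from the one that connects
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    host_name = socket.gethostname()
    host_ip = socket.gethostbyname(host_name)
    print('HOST IP:', host_ip)
    port = 1234
    socket_address = (host_ip, port)
    # Socket Bind
    server_socket.bind(socket_address)
    # Socket Listen
    server_socket.listen(5)
    print("LISTENING AT:", socket_address)
    while True:
        client_socket, addr = server_socket.accept()
        print('GOT CONNECTION FROM:', addr)
        if client_socket:
            # Camera index 1; use 0 if the machine has a single webcam
            vid = cv2.VideoCapture(1)

            while vid.isOpened():
                ret, image = vid.read()
                if not ret:
                    break  # no picture was captured
                img_serialize = pickle.dumps(image)
                message = struct.pack("Q", len(img_serialize)) + img_serialize
                try:
                    client_socket.sendall(message)
                except ConnectionError:
                    break  # the other client went away

                cv2.imshow('VIDEO FROM SERVER', image)
                key = cv2.waitKey(10)
                if key == 13:  # Enter key stops sending
                    break

            vid.release()
            client_socket.close()


x1 = threading.Thread(target=connect_server)
x2 = threading.Thread(target=sender)

# start both threads
x1.start()
x2.start()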

Thanks for reading this article!

Written by Sampreethi Bokka

Intern at LinuxWorld Informatics Pvt Ltd and student at Vellore Institute of Technology.
