Surveillance System Wiki
Overview
This surveillance system is designed to process video feeds using a distributed Edge-to-Cloud architecture. Its primary goal is to detect dangerous animals in video footage and alert residents and authorities. The system integrates edge devices (for real-time processing) with cloud servers (for intensive tasks such as object recognition), ensuring quick and accurate responses.
Table of Contents
- Overview
- System Components
- Docker Compose Tutorial
- Deploying on K3S using Enoslib on Grid'5000
- Introduction
- Step 0: Prerequisites
- Step 1: Initialize Logging
- Step 2: Cluster Resource Configuration
- Step 3: Kubernetes Setup
- Step 4: Label Nodes
- Step 5: Docker Registry Secret
- Step 6: Send Files to Cluster
- Step 7: Helm Installation and Prometheus Setup
- Step 8: Chaos Mesh Installation
- Step 9: Patch Prometheus Service
- Step 10: Fetch Webpage Hosts
- Step 11: Token Creation and Cleanup
- Conclusion
System Components
Camera
- Function: Captures video frames, resizes them for optimized processing, and serializes and transmits them to the Motion Detection service.
- Key Features:
- Handles video sequences based on detected animal types.
- Configurable appearance frequency for specific events.
- Tracks FPS rate and integrates with TracerProvider for distributed tracing.
- Connection: Maintains a persistent TCP connection for frame transmission.
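The exact capture and wire format live in the Camera service's source; the snippet below is only a minimal sketch of the idea (resize, serialize, send over a persistent TCP socket). The OpenCV/pickle serialization and the length-prefix framing are assumptions for illustration, not the service's confirmed implementation.

# Hypothetical sketch: capture, resize, serialize, and send one frame over TCP.
# The real camera.py may use a different serialization and framing scheme.
import cv2
import pickle
import socket
import struct

MD_HOST, MD_PORT = "motion_detector_1", 9998  # matches MDHOST/MDPORT described later

def stream_frames(video_path: str):
    sock = socket.create_connection((MD_HOST, MD_PORT))  # persistent TCP connection
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.resize(frame, (640, 360))             # downscale for cheaper processing
        payload = pickle.dumps(frame)                      # serialize the numpy array
        sock.sendall(struct.pack(">L", len(payload)) + payload)  # length-prefixed message
    cap.release()
    sock.close()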
Motion Detection
- Function: Processes frames from the Camera to detect significant motion and forwards relevant frames to Object Recognition.
- Key Features:
- Motion detection using grayscale and Gaussian blur.
- Real-time performance monitoring of CPU usage, frame processing time, and FPS.
- Integrates with TracerProvider for tracing frame reception, processing, and transmission.
- Connection: Listens for incoming frames from the Camera and sends processed frames to Object Recognition.
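The detection logic itself is in the Motion Detection service's source; as a rough illustration of the grayscale + Gaussian blur frame-differencing approach mentioned above, a sketch might look like the following (threshold and kernel values are placeholders, not the service's actual parameters).

# Illustrative sketch of motion detection via grayscale conversion, Gaussian blur,
# and frame differencing; thresholds and kernel sizes are placeholder values.
import cv2

def has_motion(prev_frame, frame, min_area=500):
    def prep(f):
        gray = cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)
        return cv2.GaussianBlur(gray, (21, 21), 0)
    delta = cv2.absdiff(prep(prev_frame), prep(frame))           # pixel-wise difference
    thresh = cv2.threshold(delta, 25, 255, cv2.THRESH_BINARY)[1]
    thresh = cv2.dilate(thresh, None, iterations=2)
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Report motion only if at least one changed region is large enough to matter
    return any(cv2.contourArea(c) >= min_area for c in contours)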
Object Recognizer
- Function: Uses YOLO to detect objects in frames and tracks performance metrics like processing time and response time.
- Key Features:
- YOLO v3 for object detection.
- Distributed tracing for end-to-end monitoring.
- Supports multiple clients with concurrency.
- Connection: Listens for incoming frames from the Motion Detection service and processes them asynchronously.
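The service's model loading and inference code is in the repository; purely as an illustration of YOLOv3 detection, here is a sketch using OpenCV's DNN module. The model file names, input size, and confidence threshold are assumptions, and the real service may use a different framework or post-processing.

# Rough sketch of YOLOv3 inference with OpenCV's DNN module; file names and
# thresholds are hypothetical.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")  # hypothetical file names
output_layers = net.getUnconnectedOutLayersNames()

def detect_objects(frame, conf_threshold=0.5):
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    detections = []
    for output in net.forward(output_layers):
        for det in output:            # det = [cx, cy, w, h, objectness, class scores...]
            scores = det[5:]
            class_id = int(np.argmax(scores))
            if scores[class_id] > conf_threshold:
                detections.append((class_id, float(scores[class_id])))
    return detections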
Other Components
OpenTelemetry Collector
- Function: Collects, processes, and exports telemetry data (traces, metrics, logs).
- Metrics Sent:
- Collects data from Camera, Motion Detection, Object Recognition.
- Sends telemetry data to Prometheus or other backends for visualization.
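For context, a minimal sketch of how one of the services could wire its TracerProvider to this collector over OTLP/gRPC is shown below; the endpoint and service name are assumptions based on the docker-compose example later in this page, not the services' confirmed setup.

# Minimal TracerProvider setup exporting spans to the collector over OTLP/gRPC.
# The endpoint and service name are assumptions, not the actual service code.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider(resource=Resource.create({"service.name": "camera"}))
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="otel-collector:4317", insecure=True))
)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("send_frame"):
    pass  # frame capture / transmission happens here in the real service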
cAdvisor
- Function: Monitors container resource usage (CPU, memory, network).
- Metrics Sent:
- CPU, memory, network I/O, disk I/O, container lifespan metrics.
Node Exporter
- Function: Exports hardware and system performance metrics from Linux systems.
- Metrics Sent:
- CPU load, memory usage, disk utilization, network stats, and system uptime.
Prometheus
- Function: Scrapes and stores metrics from various exporters and services.
- Metrics Collected:
- Application, service, and system-level metrics (e.g., FPS, processing time, system performance).
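Once scraped, these metrics can be queried through Prometheus's HTTP API. The small sketch below illustrates this; the metric name camera_fps is hypothetical, and the URL assumes the local docker-compose setup described in the next section.

# Query Prometheus's HTTP API for a metric; "camera_fps" is a hypothetical metric name.
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # matches the docker-compose port mapping below

def query(promql: str):
    url = f"{PROMETHEUS_URL}/api/v1/query?" + urllib.parse.urlencode({"query": promql})
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["data"]["result"]

for series in query("camera_fps"):
    print(series["metric"], series["value"])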
Docker Compose Tutorial
Step 1: Build Docker Images for Services
- Clone the repository:
  First, clone the repository to your local machine:
  git clone https://gitlab.inria.fr/sikaddou/surveillance-system-edge-to-cloud-video-processing.git
  cd surveillance-system-edge-to-cloud-video-processing
- Build the Docker images locally:
  You don't need to navigate to each service's directory individually. From the root of the cloned repository, run the following commands to build the Docker images for each service:
  docker build -t camera:latest ./services/camera
  docker build -t motion_detector:latest ./services/motion_detector
  docker build -t object_recognizer:latest ./services/object_recognizer
These commands will automatically find the Dockerfile in each service's directory (./services/camera, ./services/motion_detector, and ./services/object_recognizer). The latest tag is used to mark the most recent image.
Step 2: Create or Update Service Environment Variables in docker-compose.yml
Object Recognizer:
- No environment variables specified.
Motion Detector:
- INDEX: A unique identifier for each motion detector instance (e.g., 1, 2, 3).
- OR_HOST: The hostname of the object recognizer service (set to object_recognizer).
- OR_PORT: The port on which the object recognizer service is listening (set to 9999).
Camera:
- CAMERA: Set to true, indicating that the container is a camera service.
- ANIMAL_NAME: The name of the animal being observed by the camera (e.g., tiger, bear, wolf).
- APPEARANCE_RATE: The frequency at which the animal appears in the footage (in frames per minute).
- MDHOST: The hostname of the corresponding motion detector (e.g., motion_detector_1, motion_detector_2, motion_detector_3).
- MDPORT: The port on which the corresponding motion detector is listening (set to 9998).
- INDEX: A unique identifier for each camera (e.g., 1, 2, 3).
Here's an example for reference. Make sure the docker-compose.yml file is set up correctly:
version: '3'
services:
  object_recognizer:
    image: object_recognizer:latest
    ports:
      - "9999:9999"
      - "5000:5000"
  motion_detector_1:
    image: motion_detector:latest
    ports:
      - "9998:9998"
    depends_on:
      - object_recognizer
      - jaeger
    environment:
      - INDEX=1
      - OR_HOST=object_recognizer
      - OR_PORT=9999
    command: python src/motion_detection.py
  motion_detector_2:
    image: motion_detector:latest
    ports:
      - "9997:9998"
    depends_on:
      - object_recognizer
      - jaeger
    environment:
      - INDEX=2
      - OR_HOST=object_recognizer
      - OR_PORT=9999
    command: python src/motion_detection.py
  motion_detector_3:
    image: motion_detector:latest
    ports:
      - "9996:9998"
    depends_on:
      - object_recognizer
      - jaeger
    environment:
      - INDEX=3
      - OR_HOST=object_recognizer
      - OR_PORT=9999
    command: python src/motion_detection.py
  camera_1:
    image: camera:latest
    depends_on:
      - motion_detector_1
      - otel-collector
    environment:
      - CAMERA=true
      - ANIMAL_NAME=tiger
      - APPEARANCE_RATE=600
      - MDHOST=motion_detector_1
      - MDPORT=9998
      - INDEX=1
    command: /bin/sh -c "python src/camera.py"
  camera_2:
    image: camera:latest
    depends_on:
      - motion_detector_2
      - otel-collector
    environment:
      - CAMERA=true
      - ANIMAL_NAME=bear
      - APPEARANCE_RATE=500
      - MDHOST=motion_detector_2
      - MDPORT=9998
      - INDEX=2
    command: /bin/sh -c "python src/camera.py"
  camera_3:
    image: camera:latest
    depends_on:
      - motion_detector_3
      - otel-collector
    environment:
      - CAMERA=true
      - ANIMAL_NAME=wolf
      - APPEARANCE_RATE=700
      - MDHOST=motion_detector_3
      - MDPORT=9998
      - INDEX=3
    command: /bin/sh -c "python src/camera.py"
  jaeger:
    image: jaegertracing/all-in-one
    ports:
      - "16686:16686"
      - "6831:6831/udp"
      - "14268"
      - "14250"
  zipkin-all-in-one:
    image: openzipkin/zipkin:latest
    environment:
      - JAVA_OPTS=-Xms1024m -Xmx1024m -XX:+ExitOnOutOfMemoryError
    restart: always
    ports:
      - "9411:9411"
  otel-collector:
    image: otel/opentelemetry-collector-contrib
    restart: always
    command: ["--config=/etc/otel-collector-config.yaml", "${OTELCOL_ARGS}"]
    volumes:
      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
    ports:
      - "1888:1888"     # pprof extension
      - "8888:8888"     # Prometheus metrics exposed by the collector
      - "8889:8889"     # Prometheus exporter metrics
      - "13133:13133"   # health_check extension
      - "4317:4317"     # OTLP gRPC receiver
      - "55679:55679"   # zpages extension
    depends_on:
      - jaeger
      - zipkin-all-in-one
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    hostname: cadvisor
    platform: linux/aarch64
    volumes:
      - "/:/rootfs:ro"
      - "/var/run:/var/run:ro"
      - "/sys:/sys:ro"
      - "/var/lib/docker/:/var/lib/docker:ro"
      - "/dev/disk/:/dev/disk:ro"
    ports:
      - "8080:8080"
  prometheus:
    image: prom/prometheus:latest
    restart: always
    volumes:
      - ./prometheus.yaml:/etc/prometheus/prometheus.yml
      - ./rules.yml:/etc/prometheus/rules.yml
    ports:
      - "9090:9090"
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    expose:
      - 9100
Step 3: Deploy with Docker Compose
- Start the services with Docker Compose: After updating the docker-compose.yml, run the following command to start the services locally:
  docker-compose -f ./deploy/docker-compose/docker-compose.yml up -d --build --force-recreate
- Stop and remove existing containers: If you want to stop and remove any previously running containers and volumes, use the following command before running the above command:
docker-compose -f ./deploy/docker-compose/docker-compose.yml down --volumes --remove-orphans
Step 4: Verify the Deployment
- Check the status of running containers: You can check if the containers are running properly with the following command:
docker ps
- Access the services:
- Access Jaeger UI: http://localhost:16686
- Access Zipkin UI: http://localhost:9411
- Access Prometheus UI: http://localhost:9090
- Access cAdvisor UI: http://localhost:8080
- Access the Object Recognizer interface: http://localhost:5000
Deploying on K3S using Enoslib on Grid'5000
This tutorial walks you through the steps required to deploy a K3s Kubernetes cluster using Enoslib on the Grid'5000 (G5k) infrastructure. The focus is on deploying the surveillance system, which includes components such as the Camera, Motion Detection, and Object Recognizer. The steps cover cluster configuration, deployment of Kubernetes resources, Helm setup, Prometheus and Chaos Mesh installation, and additional tasks for monitoring and resilience testing of the system.
Step 0: Prerequisites
- Grid'5000 account and access to the infrastructure.
- enoslib installed on your local machine.
- kubectl, helm, and ssh tools installed on your local machine.
- K3s installed on your machines.
Step 1: Initialize Logging
First, initialize the logging for the deployment process.
import enoslib as en
from datetime import datetime
# Initialize logging
_ = en.init_logging()
# Generate a timestamp for the job name
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
job_name = f"usecase_{timestamp}"
Step 2: Cluster Resource Configuration
Define the configuration for the cluster resources on Grid'5000. This will specify the number of machines for each role (master, agent, etc.).
cluster = "parasilo"
conf = (
    en.VMonG5kConf
    .from_settings(job_name=job_name, walltime="04:00:00")
    .add_machine(roles=["master"], cluster=cluster, number=1, flavour="large")
    .add_machine(roles=["agent", "object_recognizer"], cluster=cluster, number=1, flavour_desc={"core": 8, "mem": 8192})
    .add_machine(roles=["agent", "motion_detector"], cluster=cluster, number=3, flavour_desc={"core": 4, "mem": 4096})
    .add_machine(roles=["agent", "camera"], cluster=cluster, number=3, flavour_desc={"core": 1, "mem": 1024})
    .finalize()
)
# Initialize the provider
provider = en.VMonG5k(conf)
roles, networks = provider.init()
en.wait_for(roles)
Step 3: Kubernetes Setup
Deploy the K3s cluster and ensure that it’s set up correctly.
# Initialize K3s with master and agent nodes
k3s = en.K3s(master=roles["master"], agent=roles["agent"])
k3s.deploy()
# Create a tunnel to the head node
print("Create a tunnel from your local machine to the head node:")
print(f"ssh -NL 8001:{roles['master'][0].address}:8001 {G5K_username}@access.grid5000.fr")
Step 4: Label Nodes
Label your nodes with specific attributes based on their roles (cloud, edge, camera).
import subprocess

def label_nodes(node_names, labels):
    label_str = ",".join(f"{key}={value}" for key, value in labels.items())
    for node in node_names:
        command = f"kubectl label nodes {node} {label_str} --overwrite"
        try:
            result = en.run_command(command, roles=roles['master'])
            print(f"Successfully labeled {node} with {label_str}")
        except subprocess.CalledProcessError as e:
            print(f"Failed to label {node}. Error: {e.stderr.decode('utf-8')}")

# Label nodes
label_nodes([str(node.alias) for node in roles["object_recognizer"]], {"pos": "cloud"})
label_nodes([str(node.alias) for node in roles["motion_detector"]], {"pos": "edge"})
label_nodes([str(node.alias) for node in roles["camera"]], {"pos": "camera"})
Step 5: Docker Registry Secret
Create a Docker registry secret to authenticate with Docker Hub (replace the placeholder credentials below with your own).
docker_username = "username_placeholder"
docker_password = "password_placeholder"
docker_email = "email_placeholder"
# Create the Docker registry secret
en.run_command(f"kubectl create secret docker-registry dockerhub-secret \
--docker-username={docker_username} \
--docker-password={docker_password} \
--docker-email={docker_email}", roles=roles['master'])
Step 6: Send Files to Cluster
Transfer necessary configuration files to the cluster using scp.
import os

def send_file(file_path, file_name):
    scp_command = f"scp {file_path} root@{roles['master'][0].address}:{file_name}"
    os.system(f"ssh-keygen -f /home/{G5K_username}/.ssh/known_hosts -R {roles['master'][0].address}")
    os.system(scp_command)
    return scp_command

# Send configuration files to the cluster
local_directory = "deploy/G5k/K3S/config"
for root, dirs, files in os.walk(local_directory):
    for file_name in files:
        send_file(os.path.join(root, file_name), file_name)
# Apply Kubernetes manifests
en.run_command("kubectl apply -f deployement.yaml", roles=roles['master'])
en.run_command("kubectl apply -f configmap.yaml", roles=roles['master'])
Step 7: Helm Installation and Prometheus Setup
Install Helm, add Prometheus charts, and deploy Prometheus for monitoring.
with en.actions(roles=roles["master"]) as a:
    a.raw("curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3")
    a.raw("chmod 700 get_helm.sh")
    a.raw("./get_helm.sh")
    # Install Prometheus
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml repo add prometheus-community https://prometheus-community.github.io/helm-charts")
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml repo update")
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml install prometheus prometheus-community/prometheus -n default --set grafana.enabled=True -f costum-values.yaml")
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml upgrade prometheus prometheus-community/prometheus --namespace default --set prometheus-node-exporter.service.port=9500")
Step 8: Chaos Mesh Installation
Install Chaos Mesh using Helm for fault injection and resilience testing.
with en.actions(roles=roles["master"]) as a:
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml repo add chaos-mesh https://charts.chaos-mesh.org")
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml repo update")
    a.raw("helm --kubeconfig /etc/rancher/k3s/k3s.yaml install chaos-mesh chaos-mesh/chaos-mesh --set chaosDaemon.runtime=containerd --set chaosDaemon.containerdSocket=/run/containerd/containerd.sock -n default")
    a.raw("kubectl apply -f rbac.yaml")
    a.raw("kubectl create token account-default-admin-eechh")
    a.raw('kubectl patch svc chaos-dashboard -n default -p \'{"spec": {"ports": [{"name": "http", "protocol": "TCP", "port": 2333, "targetPort": 2333, "nodePort": 30312}]}}\'')
Step 9: Patch Prometheus Service
Expose Prometheus service on a NodePort for web access.
with en.actions(roles=roles["master"]) as a:
    a.raw('kubectl patch svc prometheus-server -n default -p \'{"spec": {"type": "NodePort", "ports": [{"name": "http", "port": 80, "targetPort": 9090, "nodePort": 30090}]}}\'')
Step 10: Fetch Webpage Hosts
Retrieve the host IP of Prometheus and Chaos Dashboard services.
def get_host(service):
    import re
    results = en.run_command(f"kubectl get pods -n kube-system -l app={service}", roles=roles['master'])
    pattern = r"host='([^']*)'"
    match = re.search(pattern, str(results))
    if match:
        host = match.group(1)
        print("Extracted host:", host)
    else:
        print("Host not found")
        return None
    results = en.run_command(f"kubectl describe node {host}", roles=roles['master'])
    pattern = r"InternalIP:\s*([\d.]+)"
    match = re.search(pattern, str(results))
    if match:
        return match.group(1)
    else:
        print("InternalIP not found")
        return None
# Fetch the Prometheus and Chaos Dashboard IPs
url = f"http://{get_host('prometheus')}:30090"
print(f"Prometheus web page host: {url}")
url = f"http://{get_host('chaos-dashboard')}:30312"
print(f"Chaos-mesh web page host: {url}")
Step 11: Token Creation and Cleanup
Create the necessary tokens and perform cleanup after the job is complete.
results = en.run_command("kubectl create token account-default-admin-eechh", roles=roles["master"])
for res in results:
    print(res.payload['stdout'])
# Cleanup the resources
provider.destroy()
Conclusion
By following the steps above, you have successfully deployed the surveillance system on a Kubernetes cluster using K3s on Grid'5000, set up monitoring with Prometheus and fault injection with Chaos Mesh, and labeled the nodes for efficient management.