Viseron - Self-hosted NVR with object detection

Viseron is a self-hosted, local-only NVR implemented in Python. The goal is ease of use while also leveraging hardware acceleration for minimal system load.

Notable features

  • Records video when objects are detected
  • Lookback: buffers frames so recordings include footage from before the event actually happened
  • Multiplatform: should support any x86-64 machine running Linux, as well as the RPi 3.
    Builds are tested and verified on the following platforms:
    • Ubuntu 18.04 with Nvidia GPU
    • Ubuntu 18.04 running on an Intel NUC
    • RaspberryPi 3B+
  • Supports several object detectors:
    • Yolo Darknet using OpenCV
    • Tensorflow via Google Coral EdgeTPU
  • Motion detection
  • Native support for RTSP and MJPEG
  • Supports hardware acceleration on different platforms
    • CUDA for systems with a supported GPU
    • OpenCL
    • OpenMax and MMAL on the RaspberryPi 3B+
  • Zones to limit detection to a particular area to reduce false positives
  • Masks to limit where motion detection occurs
  • Stop/start cameras on-demand over MQTT
  • Home Assistant integration via MQTT

Getting started

Choose the appropriate Docker container for your machine. Builds are published to Docker Hub.

On a RaspberryPi 3B+

Example Docker command

docker run --rm \
--privileged \
-v <recordings path>:/recordings \
-v <config path>:/config \
-v /etc/localtime:/etc/localtime:ro \
-v /dev/bus/usb:/dev/bus/usb \
-v /opt/vc/lib:/opt/vc/lib \
--name viseron \
--device /dev/vchiq:/dev/vchiq --device /dev/vcsm:/dev/vcsm \
roflcoopter/viseron-rpi:latest

Example docker-compose

version: "2.4"
services:
  viseron:
    image: roflcoopter/viseron-rpi:latest
    container_name: viseron
    volumes:
      - <recordings path>:/recordings
      - <config path>:/config
      - /etc/localtime:/etc/localtime:ro
      - /dev/bus/usb:/dev/bus/usb
      - /opt/vc/lib:/opt/vc/lib
    devices:
      - /dev/vchiq:/dev/vchiq
      - /dev/vcsm:/dev/vcsm
    privileged: true

Note: Viseron is quite RAM-intensive, mostly because of the object detection but also because of the lookback feature.
I do not recommend using an RPi unless you have a Google Coral EdgeTPU; the CPU is not fast enough and you might run out of memory. To make use of hardware-accelerated decoding/encoding you might have to increase the allocated GPU memory.
To do this, edit /boot/config.txt, set gpu_mem=256 and then reboot.
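That is, /boot/config.txt should contain a line like:

# Allocate more memory to the GPU for hardware decoding/encoding
gpu_mem=256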

On a generic Linux machine

Example Docker command

docker run --rm \
-v <recordings path>:/recordings \
-v <config path>:/config \
-v /etc/localtime:/etc/localtime:ro \
--name viseron \
roflcoopter/viseron:latest

Example docker-compose

version: "2.4"

services:
  viseron:
    image: roflcoopter/viseron:latest
    container_name: viseron
    volumes:
      - <recordings path>:/recordings
      - <config path>:/config
      - /etc/localtime:/etc/localtime:ro

On a Linux machine with an Intel CPU that supports VAAPI (an Intel NUC, for example)

Example Docker command

docker run --rm \
-v <recordings path>:/recordings \
-v <config path>:/config \
-v /etc/localtime:/etc/localtime:ro \
--name viseron \
--device /dev/dri \
roflcoopter/viseron-vaapi:latest

Example docker-compose

version: "2.4"

services:
  viseron:
    image: roflcoopter/viseron-vaapi:latest
    container_name: viseron
    volumes:
      - <recordings path>:/recordings
      - <config path>:/config
      - /etc/localtime:/etc/localtime:ro
    devices:
      - /dev/dri

On a Linux machine with an Nvidia GPU

Example Docker command

docker run --rm \
-v <recordings path>:/recordings \
-v <config path>:/config \
-v /etc/localtime:/etc/localtime:ro \
--name viseron \
--runtime=nvidia \
roflcoopter/viseron-cuda:latest

Example docker-compose

version: "2.4"

services:
  viseron:
    image: roflcoopter/viseron-cuda:latest
    container_name: viseron
    volumes:
      - <recordings path>:/recordings
      - <config path>:/config
      - /etc/localtime:/etc/localtime:ro
    runtime: nvidia

VAAPI support is built into every container. To utilize it you need to add --device /dev/dri to your docker command.
EdgeTPU support is also included in all containers. To use it, add -v /dev/bus/usb:/dev/bus/usb --privileged to your docker command.
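As a sketch, here is a docker-compose for the generic image with both of those additions applied (the paths are placeholders you need to adapt):

version: "2.4"

services:
  viseron:
    image: roflcoopter/viseron:latest
    container_name: viseron
    privileged: true
    volumes:
      - <recordings path>:/recordings
      - <config path>:/config
      - /etc/localtime:/etc/localtime:ro
      # EdgeTPU passthrough
      - /dev/bus/usb:/dev/bus/usb
    devices:
      # VAAPI render device
      - /dev/dri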

The config.yaml has to be mounted to the folder /config.
If no config is present, a default minimal one will be created.
Here you need to fill in at least your cameras, and then you should be good to go.

Configuration Options

Cameras

Config example
cameras:
  - name: Front door
    mqtt_name: viseron_front_door
    host: 192.168.30.2
    port: 554
    username: user
    password: pass
    path: /Streaming/Channels/101/
    width: 1920
    height: 1080
    fps: 6 
    motion_detection:
      interval: 1
      trigger_detector: false
    object_detection:
      interval: 1
      labels:
        - label: person
          confidence: 0.9
        - label: pottedplant
          confidence: 0.9

These options are used to build the FFMPEG command that decodes the camera stream.
The command is built like this:
"ffmpeg" + global_args + input_args + hwaccel_args + codec + "-rtsp_transport tcp -i " + (stream url) + filter_args + output_args

| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| name | str | required | any string | Friendly name of the camera |
| mqtt_name | str | name given above | any string | Name used in MQTT topics |
| stream_format | str | rtsp | rtsp, mjpeg | FFMPEG stream format |
| host | str | required | any string | IP or hostname of camera |
| port | int | required | any integer | Port for the camera stream |
| username | str | optional | any string | Username for the camera stream |
| password | str | optional | any string | Password for the camera stream |
| path | str | optional | any string | Path to the camera stream, eg /Streaming/Channels/101/ |
| width | int | detected from stream | any integer | Width of the stream. Will use OpenCV to get this information if not given |
| height | int | detected from stream | any integer | Height of the stream. Will use OpenCV to get this information if not given |
| fps | int | detected from stream | any integer | FPS of the stream. Will use OpenCV to get this information if not given |
| global_args | list | optional | a valid list of FFMPEG arguments | See source code for default arguments |
| input_args | list | optional | a valid list of FFMPEG arguments | See source code for default arguments |
| hwaccel_args | list | optional | a valid list of FFMPEG arguments | FFMPEG decoder hardware acceleration arguments |
| codec | str | optional | any supported decoder codec | FFMPEG video decoder codec, eg h264_cuvid |
| rtsp_transport | str | tcp | tcp, udp, udp_multicast, http | Sets the RTSP transport protocol. Change this if your camera doesn't support TCP |
| filter_args | list | optional | a valid list of FFMPEG arguments | See source code for default arguments |
| motion_detection | dictionary | optional | see Camera motion detection config | Overrides the global motion_detection config |
| object_detection | dictionary | optional | see Camera object detection config below | Overrides the global object_detection config |
| zones | list | optional | see Zones config below | Allows you to specify zones to further filter detections |
| publish_image | bool | false | true/false | If enabled, Viseron will publish an image to MQTT with drawn zones, objects, motion and masks. Note: this uses some extra CPU and should probably only be used for debugging |
| logging | dictionary | optional | see Logging | Overrides the global log settings for this camera. This affects all logs named `lib.nvr.<camera name>.*` and `lib.*.<camera name>` |

A default ffmpeg decoder command is generated, which varies a bit depending on the Docker container you use:

For Nvidia GPU support in the roflcoopter/viseron-cuda image
ffmpeg -hide_banner -loglevel panic -avoid_negative_ts make_zero -fflags nobuffer -flags low_delay -strict experimental -fflags +genpts -stimeout 5000000 -use_wallclock_as_timestamps 1 -vsync 0 -c:v h264_cuvid -rtsp_transport tcp -i rtsp://<username>:<password>@<host>:<port><path> -f rawvideo -pix_fmt nv12 pipe:1
For VAAPI support in the roflcoopter/viseron-vaapi image
ffmpeg -hide_banner -loglevel panic -avoid_negative_ts make_zero -fflags nobuffer -flags low_delay -strict experimental -fflags +genpts -stimeout 5000000 -use_wallclock_as_timestamps 1 -vsync 0 -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -rtsp_transport tcp -i rtsp://<username>:<password>@<host>:<port><path> -f rawvideo -pix_fmt nv12 pipe:1
For RPi3 in the roflcoopter/viseron-rpi image
ffmpeg -hide_banner -loglevel panic -avoid_negative_ts make_zero -fflags nobuffer -flags low_delay -strict experimental -fflags +genpts -stimeout 5000000 -use_wallclock_as_timestamps 1 -vsync 0 -c:v h264_mmal -rtsp_transport tcp -i rtsp://<username>:<password>@<host>:<port><path> -f rawvideo -pix_fmt nv12 pipe:1

This means that you do not have to set hwaccel_args unless you have a specific need to change the default command (say you need to change h264_cuvid to hevc_cuvid), as shown in the example below.
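For example, if you need that hevc_cuvid change, a minimal camera config overriding only the decoder codec could look like this (camera details are placeholders):

cameras:
  - name: Front door
    host: 192.168.30.2
    port: 554
    path: /Streaming/Channels/101/
    # Override only the decoder codec; the rest of the command keeps its defaults
    codec: hevc_cuvid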


Camera motion detection

| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| interval | float | 1.0 | any float | Run motion detection at this interval in seconds on the most recent frame. For optimal performance, this should divide evenly into, or be a multiple of, the object detection interval, so a shared frame is only preprocessed once |
| trigger_detector | bool | true | true/false | If true, the object detector will only run while motion is detected |
| timeout | bool | true | true/false | If true, recording will continue until no motion is detected |
| max_timeout | int | 30 | any integer | Value in seconds for how long motion is allowed to keep the recorder going when no objects are detected. This is to prevent never-ending recordings. Only applicable if timeout: true |
| width | int | 300 | any integer | Frames will be resized to this width to save computing power |
| height | int | 300 | any integer | Frames will be resized to this height to save computing power |
| area | float | 0.1 | any float | How big the detected area must be in order to trigger motion |
| frames | int | 3 | any integer | Number of consecutive frames with motion before triggering, used to reduce false positives |
| mask | list | optional | see Mask config | Allows you to specify masks in the shape of polygons. Use this to ignore motion in certain areas of the image |
| logging | dictionary | optional | see Logging | Overrides the camera/global log settings for the motion detector. This affects all logs named `lib.motion.<camera name>` and `lib.nvr.<camera name>.motion` |

Mask

Config example
cameras:
  - name: name
    host: ip
    port: port
    path: /Streaming/Channels/101/
    motion_detection:
      area: 0.07
      mask:
        - points:
            - x: 0
              y: 0
            - x: 250
              y: 0
            - x: 250
              y: 250
            - x: 0
              y: 250
        - points:
            - x: 500
              y: 500
            - x: 1000
              y: 500
            - x: 1000
              y: 750
            - x: 300
              y: 750
| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| points | list | required | a list of points | Used to draw a polygon of the mask |

Masks are used to exclude certain areas of the image from triggering motion.
Say you have a camera pointed at some trees. When the wind blows, motion will probably be detected.
Draw a mask over the trees and they will no longer trigger motion.


Camera object detection

| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| interval | float | optional | any float | Run object detection at this interval in seconds on the most recent frame. Overrides the global config |
| labels | list | optional | a list of labels | A list of labels. Overrides the global config |
| logging | dictionary | optional | see Logging | Overrides the camera/global log settings for the object detector. This affects all logs named `lib.nvr.<camera name>.object` |

Zones

Config example
cameras:
  - name: name
    host: ip
    port: port
    path: /Streaming/Channels/101/
    zones:
      - name: zone1
        points:
          - x: 0
            y: 500
          - x: 1920
            y: 500
          - x: 1920
            y: 1080
          - x: 0
            y: 1080
        labels:
          - label: person
            confidence: 0.9
      - name: zone2
        points:
          - x: 0
            y: 0
          - x: 500
            y: 0
          - x: 500
            y: 500
          - x: 0
            y: 500
        labels:
          - label: cat
            confidence: 0.5
| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| name | str | required | any string | Zone name, used in the MQTT topic. Should be unique |
| points | list | required | a list of points | Used to draw a polygon of the zone |
| labels | list | optional | a list of labels | A list of labels to track in the zone. Overrides the global config |

Points

Points are used to form a polygon.

| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| x | int | required | any int | X-coordinate of point |
| y | int | required | any int | Y-coordinate of point |
To easily generate points you can use a tool like image-map.net.
Just upload an image from your camera and start drawing your zone.
Then click Show me the code! and adapt it to the config format.
Coordinates coords="522,11,729,275,333,603,171,97" should be turned into this:
points:
  - x: 522
    y: 11
  - x: 729
    y: 275
  - x: 333
    y: 603
  - x: 171
    y: 97

Object detection

Config example
object_detection:
  type: darknet
  interval: 6
  labels:
    - label: person
      confidence: 0.9
      height_min: 0.1481
      height_max: 0.7
      width_min: 0.0598
      width_max: 0.36
    - label: truck
      confidence: 0.8
| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| type | str | RPi: edgetpu, Other: darknet | darknet, edgetpu | What detection method to use. Defaults to edgetpu on RPi; if no EdgeTPU is present it will run TensorFlow on the CPU |
| model_path | str | RPi: /detectors/models/edgetpu/model.tflite, Other: /detectors/models/darknet/yolo.weights | any valid path | Path to the object detection model |
| model_config | str | /detectors/models/darknet/yolo.cfg | any valid path | Path to the object detection config. Only needed for darknet |
| label_path | str | RPi: /detectors/models/edgetpu/labels.txt, Other: /detectors/models/darknet/coco.names | any valid path | Path to the file containing labels for the model |
| model_width | int | detected from model | any integer | Frames will be resized to this width to fit the model and save computing power. Changing this is not recommended |
| model_height | int | detected from model | any integer | Frames will be resized to this height to fit the model and save computing power. Changing this is not recommended |
| interval | float | 1.0 | any float | Run object detection at this interval in seconds on the most recent frame |
| confidence | float | 0.8 | float between 0 and 1 | Lowest confidence allowed for detected objects |
| suppression | float | 0.4 | float between 0 and 1 | Non-maxima suppression, used to remove overlapping detections |
| labels | list | optional | a list of labels | Global labels which apply to all cameras unless overridden |
| logging | dictionary | optional | see Logging | Overrides the global log settings for the object detector. This affects all logs named `lib.detector` and `lib.nvr.<camera name>.object` |
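As a minimal sketch, explicitly selecting the EdgeTPU detector with the default paths from the table above might look like this:

object_detection:
  type: edgetpu
  # These are the documented defaults on RPi; stating them explicitly is optional
  model_path: /detectors/models/edgetpu/model.tflite
  label_path: /detectors/models/edgetpu/labels.txt
  interval: 1
  labels:
    - label: person
      confidence: 0.9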

Labels

| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| label | str | person | any string | Can be any label present in the detection model |
| confidence | float | 0.8 | float between 0 and 1 | Lowest confidence allowed for detected objects. The lower the value, the more sensitive the detector will be and the more false positives you risk |
| height_min | float | 0 | float between 0 and 1 | Minimum height allowed for detected objects, relative to stream height |
| height_max | float | 1 | float between 0 and 1 | Maximum height allowed for detected objects, relative to stream height |
| width_min | float | 0 | float between 0 and 1 | Minimum width allowed for detected objects, relative to stream width |
| width_max | float | 1 | float between 0 and 1 | Maximum width allowed for detected objects, relative to stream width |
| triggers_recording | bool | true | true/false | If set to true, objects matching this filter will start the recorder and signal over MQTT. If set to false, only the MQTT signal is sent |

Labels are used to tell Viseron which objects to look for and keep recordings of.
The available labels depend on which detection model you are using.
For the built-in models you can check the label_path file to see which labels are available, using the commands below.

Darknet: `docker exec -it viseron cat /detectors/models/darknet/coco.names`
EdgeTPU: `docker exec -it viseron cat /detectors/models/edgetpu/labels.txt`

The max/min width/height options are used to filter out unreasonably large or small objects, which reduces false positives.


Motion detection

Config example
motion_detection:
  interval: 1
  trigger_detector: true
  timeout: true
  max_timeout: 30
  width: 300
  height: 300
  area: 0.1
  frames: 3
| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| interval | float | 1.0 | any float | Run motion detection at this interval in seconds on the most recent frame. For optimal performance, this should divide evenly into, or be a multiple of, the object detection interval, so a shared frame is only preprocessed once |
| trigger_detector | bool | true | true/false | If true, the object detector will only run while motion is detected |
| timeout | bool | true | true/false | If true, recording will continue until no motion is detected |
| max_timeout | int | 30 | any integer | Value in seconds for how long motion is allowed to keep the recorder going when no objects are detected. This is to prevent never-ending recordings. Only applicable if timeout: true |
| width | int | 300 | any integer | Frames will be resized to this width to save computing power |
| height | int | 300 | any integer | Frames will be resized to this height to save computing power |
| area | float | 0.1 | any float | How big the detected area must be in order to trigger motion |
| frames | int | 3 | any integer | Number of consecutive frames with motion before triggering, used to reduce false positives |
| logging | dictionary | optional | see Logging | Overrides the global log settings for the motion detector. This affects all logs named `lib.motion.<camera name>` and `lib.nvr.<camera name>.motion` |
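To illustrate the interval advice above: with the settings below, every frame the object detector scans was already preprocessed for the motion detector, so that work happens only once for those frames (the values are illustrative):

motion_detection:
  # Scan for motion every second
  interval: 1
  trigger_detector: true
object_detection:
  # Scan for objects every other second; these scans line up with motion scans
  interval: 2
  labels:
    - label: person
      confidence: 0.9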

TODO: Future releases will make the motion detection easier to fine-tune. Right now it's a guessing game.


Recorder

Config example
recorder:
  lookback: 10
  timeout: 10
  retain: 7
  folder: /recordings
| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| lookback | int | 10 | any integer | Number of seconds to record before a detected object |
| timeout | int | 10 | any integer | Number of seconds to record after all events are over |
| retain | int | 7 | any integer | Number of days to keep recordings before deleting them |
| folder | path | /recordings | path to existing folder | What folder to store recordings in |
| extension | str | mp4 | a valid video file extension | The file extension used for recordings. Changing this is not recommended |
| global_args | list | optional | a valid list of FFMPEG arguments | See source code for default arguments |
| hwaccel_args | list | optional | a valid list of FFMPEG arguments | FFMPEG encoder hardware acceleration arguments |
| codec | str | optional | any supported encoder codec | FFMPEG video encoder codec, eg h264_nvenc |
| filter_args | list | optional | a valid list of FFMPEG arguments | FFMPEG encoder filter arguments |
| logging | dictionary | optional | see Logging | Overrides the global log settings for the recorder. This affects all logs named `lib.recorder.<camera name>` |

A default ffmpeg encoder command is generated, which varies a bit depending on the Docker container you use:

For Nvidia GPU support in the roflcoopter/viseron-cuda image
ffmpeg -hide_banner -loglevel panic -f rawvideo -pix_fmt nv12 -s:v <width>x<height> -r <fps> -i pipe:0 -y -c:v h264_nvenc <file>
For VAAPI support in the roflcoopter/viseron-vaapi image
ffmpeg -hide_banner -loglevel panic -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -f rawvideo -pix_fmt nv12 -s:v <width>x<height> -r <fps> -i pipe:0 -y -c:v h264_vaapi -vf "format=nv12|vaapi,hwupload" <file>
For RPi3 in the roflcoopter/viseron-rpi image
ffmpeg -hide_banner -loglevel panic -f rawvideo -pix_fmt nv12 -s:v <width>x<height> -r <fps> -i pipe:0 -y -c:v h264_omx <file>

This means that you do not have to set hwaccel_args unless you have a specific need to change the default command (say you need to change h264_nvenc to hevc_nvenc), as in the example below.
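For example, to make that hevc_nvenc change you would only need to override the encoder codec (the other values shown are the defaults):

recorder:
  lookback: 10
  timeout: 10
  # Override only the encoder codec; the rest of the command keeps its defaults
  codec: hevc_nvenc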


MQTT

Config example
mqtt:
  broker: mqtt_broker.lan
  port: 1883
  username: user
  password: pass
| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| broker | str | required | IP address or hostname | IP address or hostname of the MQTT broker |
| port | int | 1883 | any integer | Port the broker is listening on |
| username | str | optional | any string | Username for the broker |
| password | str | optional | any string | Password for the broker |
| client_id | str | viseron | any string | Client ID used when connecting to the broker |
| discovery_prefix | str | homeassistant | any string | Used to configure sensors in Home Assistant |
| last_will_topic | str | {client_id}/lwt | any string | Last will topic |

Logging

Config example
logging:
  level: debug
| Name | Type | Default | Supported options | Description |
| --- | --- | --- | --- | --- |
| level | str | INFO | DEBUG, INFO, WARNING, ERROR, FATAL | Log level |

Secrets

Any value in config.yaml can be substituted with secrets stored in secrets.yaml.
This can be used to strip private information from config.yaml, making it easier to share with others.

The secrets.yaml is expected to be in the same folder as config.yaml.
The full path needs to be /config/secrets.yaml.

Here is a simple usage example.
Contents of /config/secrets.yaml:

camera_ip: 192.168.1.2
username: coolusername
password: supersecretpassword

Contents of /config/config.yaml:

cameras:
  - name: Front Door
    host: !secret camera_ip
    username: !secret username
    password: !secret password

Benchmarks

Here I will show you the system load on a few different machines/configs.
All examples are with one camera running 1920x1080 at 6 FPS.
Motion and object detection running at a 1 second interval.

Intel i3-9350K CPU @ 4.00GHz (4 cores) with Nvidia GTX 1660 Ti

| Process | Load on one core | When |
| --- | --- | --- |
| ffmpeg | ~5-6% | Continuously |
| viseron | ~1.3-3% | Scanning for motion only |
| viseron | ~7.6-9% | Scanning for objects only |
| viseron | ~8.6-9.3% | Scanning for motion and objects |

Intel NUC NUC7i5BNH (Intel i5-7260U CPU @ 2.20GHz, 2 cores) using VAAPI and OpenCL

| Process | Load on one core | When |
| --- | --- | --- |
| ffmpeg | ~8% | Continuously |
| viseron | ~3.3% | Scanning for motion only |
| viseron | ~7.5% | Scanning for objects only |
| viseron | ~8% | Scanning for motion and objects |

Intel NUC NUC7i5BNH (Intel i5-7260U CPU @ 2.20GHz, 2 cores) without VAAPI or OpenCL

| Process | Load on one core | When |
| --- | --- | --- |
| ffmpeg | ~25% | Continuously |
| viseron | ~3.3% | Scanning for motion only |
| viseron | ~23% | Scanning for objects only |
| viseron | ~24% | Scanning for motion and objects |

Home Assistant Integration

Viseron integrates into Home Assistant using MQTT discovery and is enabled by default if you configure MQTT.
Viseron will create a number of entities depending on your configuration.

Camera entity
A camera entity will be created for each camera.
Default state topic: homeassistant/camera/{mqtt_name from camera config}/image
Images will be published to this topic with drawn objects and zones if publish_image: true is set in the config.
Objects that are discarded by a filter will have blue bounding boxes, while objects that pass the filter will be green.
Zones are drawn in red. If an object passes its filter and is inside the zone, the zone will turn green.
Motion contours smaller than the configured area are drawn in dark purple, while bigger contours are drawn in pink.
Masks are drawn with an orange border and black background with 70% opacity.
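Image publishing is enabled per camera; a minimal example (camera details are placeholders):

cameras:
  - name: Front door
    host: 192.168.30.2
    port: 554
    path: /Streaming/Channels/101/
    # Publish debug images with drawn zones, objects, motion and masks
    publish_image: true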

Binary Sensors
A variable number of binary sensors will be created based on your configuration; an example of watching one of these topics follows the list below.

  1. A binary sensor showing if any tracked object is in view.
    Default state topic: homeassistant/binary_sensor/{mqtt_name from camera config}/object_detected/state
  2. A binary sensor for each tracked object showing if the label is in view.
    Default state topic: homeassistant/binary_sensor/{mqtt_name from camera config}/{label}/state
  3. A binary sensor for each zone showing if any tracked object is in the zone.
    Default state topic: homeassistant/binary_sensor/{mqtt_name from camera config}/{zone}/state
  4. A binary sensor for each tracked object in a zone showing if the label is in the zone.
    Default state topic: homeassistant/binary_sensor/{mqtt_name from camera config}/{zone}_{label}/state
  5. A binary sensor showing if motion is detected.
    Default state topic: homeassistant/binary_sensor/{mqtt_name from camera config}/motion_detected/state
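For debugging, you can watch any of these state topics with a standard MQTT client such as mosquitto_sub (the broker, mqtt_name and label below are examples taken from the earlier config):

# Subscribe to the per-label binary sensor state topic
mosquitto_sub -h mqtt_broker.lan -t "homeassistant/binary_sensor/viseron_front_door/person/state"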

Switch
A switch entity will be created for each camera.
The switch is used to arm/disarm a camera. When disarmed, no system resources are used for the camera.
Default state topic: homeassistant/switch/{mqtt_name from camera config}/state
Default command topic: homeassistant/switch/{mqtt_name from camera config}/set
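Likewise, you could arm or disarm a camera manually over the command topic with mosquitto_pub. Note that the ON/OFF payloads here are an assumption based on Home Assistant's usual switch conventions:

# "ON" arms the camera; "OFF" should disarm it again (payload values assumed)
mosquitto_pub -h mqtt_broker.lan -t "homeassistant/switch/viseron_front_door/set" -m "ON"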


Tips

  • If you are experiencing issues with a camera, I suggest you add debug logging to it and examine the logs (see the example below)
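For example, to enable debug logging for a single camera (camera details are placeholders):

cameras:
  - name: Front door
    host: 192.168.30.2
    port: 554
    path: /Streaming/Channels/101/
    # Per-camera override of the global log level
    logging:
      level: debug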

Ideas and upcoming features

  • UI

    • Create a UI for configuration and viewing of recordings
  • Detectors

    • Pause detection via MQTT
    • Move detectors to specific folder
    • Allow specified confidence to override height/width thresholds
    • Dynamic detection interval, speed up interval when detection happens for all types of detectors
    • Implement an object tracker for detected objects
    • Make it easier to implement custom detectors
  • Watchdog: build a watchdog for the camera process

  • Recorder

    • Weaving: if detection is triggered close to a previous detection, send a silent alarm and "weave" the videos together
    • Dynamic lookback based on motion
  • Properties: all public vars should be exposed as properties

  • Docker

    • Try to reduce container footprint

https://devblogs.nvidia.com/object-detection-pipeline-gpus/


Buy Me A Coffee
Donations are very much appreciated and will go directly toward more hardware for Viseron to support.
