Description

AI-surveillance

SPECIFICATION

Autonomous device for comprehensive security

Videoblazer, "Forzet" version, series 3

Developed and manufactured by SpecLab

Standard version (other form factors available upon customer request):

Climate-protected enclosure – IP66, with no moving parts or ventilation holes, does not require special premises or cooling.

Dimensions: 165x127x71.

Application: indoor and outdoor, stationary and mobile (increased impact resistance) for use in transportation.

Operating temperature range:

Basic: -30°C to +25°C,
Ultra: -40°C to +40°C.

ULTRA version for hot regions

Computing Base:

Eight-core SoC processor - Rockchip RK3588 with four Cortex-A76 cores @ 2.4 GHz and four Cortex-A55 cores @ 1.8 GHz.
Coprocessors: =======, ====, ===. (Confidential information)
System memory - 8 GB LPDDR4x (16 GB optional).

Graphics Processor:

Arm Mali-G610 MC4.
Video outputs: 2x HDMI 2.1, up to 8Kp60.
Hardware 3D smoothing filters when scaling: =====. (Confidential information)
Support for touch screens with full device control: YES.

Neural Network Platform:

NPU 6TOPS 3.0 (Neural Processing Unit), =========. (Confidential information)
Standard neural networks: based on YOLO v8, =================. (Confidential information)
Basic neural network size: “S”.

Codecs:

H.264/265 (hardware).
Decoding: 10-bit hardware decoder, up to 8K.
Local relay of IP camera streams over RTSP and network relay over HTTPS, with metadata overlaid in formats: JSON, ===== (variations are possible).
Hardware encoding of the combined video: 10-bit encoder, up to 8K.

Types of connectable cameras: IP, USB, some MIPI CSI.

Number of IP Channels:

2/4/8/12/16 (depending on design).
Basic kit supports 8 IP streams of FULL-HD quality at input and output.

WEB Interface:

HTTPS; delivers H.264/H.265 video together with archived video events, with metadata in JSON (variations are possible).
API for third-party developers.

Cybersecurity:

Protected HTTPS protocol, two types of login-password access, floating IP (complicates tapping the physical cable: an attacker also needs administrative access to the router that knows the device’s MAC address).

Peripheral Ports (specialized versions can be expanded):

Ethernet RJ45 2.5 Gbit,
2 USB 3.0 Type A ports,
2 USB and USB 2.0 ports,
3 ports =========. (Confidential information)

Analog Audio:

Two-channel microphone,
Two-channel output to speakers.

Power Supply:

Voltage - 12 Volts. The standard power adapter runs from 220 Volt mains.
Current consumption at maximum load - 1.7 Amperes.
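
Taken together, the two figures above imply a peak power draw of roughly 20 W. The sketch below only works through that arithmetic; it assumes the 1.7 A figure is measured at the 12 V input, which the specification does not state explicitly.

```python
def power_watts(voltage_v: float, current_a: float) -> float:
    """Electrical power P = V * I, in watts."""
    return voltage_v * current_a


# Figures from the specification: 12 V supply, 1.7 A at maximum load.
peak_w = power_watts(12.0, 1.7)
print(f"Peak power: {peak_w:.1f} W")  # prints "Peak power: 20.4 W"
```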

Automation peripheral (in the basic version):

4 inputs for open/close sensors,
4 dry contact switches with the ability to switch loads up to 3 Amps,
12 Volt connector for powering sensors and/or cameras (maximum load 600mA).

Third-party device integration: via TCP/IP, RTSP, and RS-485 or via "dry contact" (for legacy systems).

Logic control software: VIDEO + peripheral with "SL++".

Operating system:

SL-Android 12 (modified Linux).

Reliability enhancement tool: Hardware Watchdog.

Basic software: SpecLab-VB 3.15 (multiple modifications).

Basic set of neural networks: human, face, car, truck, bicycle, motorcycle, tram/train.

Basic specialized neural networks: weapons (AK-47, pistol, rifle-shotgun), vehicle license plate identification, QR code recognition...

Basic set of behavioral neural networks: fall, hands up, one hand in the marked zone, tilt.

Recording capacity: 128 GB flash (standard). Additionally, two external USB drives or an external IP disk array can be connected, or an internal SSD/HDD of 2 GB can be installed (larger sizes are possible; this is the size that was tested).

APPLICATION

The multi-functional climate-protected device “Videoblazer” with neural network object recognition can be used, depending on the program, as:

a video security system for any type of site,
an outdoor alarm system with a high degree of protection from natural interference,
a video surveillance operator workstation,
a tool for automating parking lots,
an anti-terrorism security system with weapon detection,
an adaptive “smart traffic light”,
a component of the “Safe City” video surveillance system,
a base for automatic collection of biometric facial data,
a base for automatic collection of car license plates on roads,
a platform for situational centers to display video streams on video walls,
a system for monitoring technological processes.

It also suits other purposes that involve video cameras, sensors, and control devices, as well as communication and notification means.

Software developers can utilize the integration API and write their own software for their specific tasks.

Neural network developers can use the annotation software for their systems.

TACTICAL AND TECHNICAL CHARACTERISTICS

Streaming Video

Only one stream (the best) from each IP camera is used for both viewing and recording. This does not affect device performance or overload its processors, which are powerful enough to handle a large number of HD channels. It also relieves the network (no need to push a second channel) and the camera, whose second stream remains available to other devices.

Real-time Video and Event Display:

A monitor connects to the device. This lets you set up a local video surveillance operator station, use a home TV as a security monitor, place a monitor by the exit of your home, or build a large-scale situational center video wall.

A monitor with a resolution of up to 8K connects directly to the device via an HDMI connector. It can display either multi-channel video together with event panels or any single IP camera channel enlarged to the entire screen.

The basic version has 8 windows for live IP camera video and a bottom Panel for events. The Panel can consist of multiple strips, each of which can be configured with its own type of events.

Instead of displaying the entire frame as an event presentation, you can display an enlarged area within the frame where the searched object was detected. For example, a close-up of the face instead of the whole person with the full background.
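
The enlarged view amounts to cropping the frame around the detection. Here is a minimal sketch of how such a crop rectangle could be computed; the `Box` type, the `zoom_region` name, and the 20% margin are assumptions for illustration, not the device's actual algorithm.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x: int  # left edge, pixels
    y: int  # top edge, pixels
    w: int  # width, pixels
    h: int  # height, pixels


def zoom_region(det: Box, frame_w: int, frame_h: int, margin: float = 0.2) -> Box:
    """Expand a detection box by `margin` on each side and clamp it to the frame."""
    pad_x = int(det.w * margin)
    pad_y = int(det.h * margin)
    left = max(0, det.x - pad_x)
    top = max(0, det.y - pad_y)
    right = min(frame_w, det.x + det.w + pad_x)
    bottom = min(frame_h, det.y + det.h + pad_y)
    return Box(left, top, right - left, bottom - top)


# A face detected in a Full HD frame; the crop adds some context around it.
print(zoom_region(Box(100, 50, 200, 100), 1920, 1080))
```

The clamping step matters for objects near the frame edge, where the padded rectangle would otherwise extend outside the image.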

You can also view the Event Log with crops of the frames in which the recognized targets appear. This reduces the amount of information the user has to review.

Network Video Viewing:

Similarly, video streams are displayed over the network in any browser on any device. The web interface receives video streams with overlays from neural networks. The included API allows the use of metadata in other programs and applications.

In both cases, live video always travels from the cameras to the device and from the device to the web in full resolution, whether one camera is viewed full screen or all cameras are shown in smaller windows. This is necessary so that the neural networks can process the streams in good quality, and for other tasks and client connections. The videoblazer is powerful enough to handle a large number of high-resolution channels (this is the most frequent question, so do not worry): it has hardware coprocessors, so the main processor is only 30-40% loaded in normal mode. The additional videoblazer features related to scenario-processing logic neither interfere with video handling nor suffer from it.

Performance amidst Multitasking:

A personal computer running Windows or a video recorder running Linux is fundamentally designed in a way that makes it impossible to control processor or memory load. Large operating systems can unexpectedly start swapping memory, indexing files, or performing a whole range of their own service tasks. You have probably seen your computer’s CPU usage suddenly spike to 100%. Video streams can freeze for seconds or even minutes, lose data, and refuse client connections. From the standpoint of high-load video surveillance, multitasking is a bane.

The videoblazer is designed so that each task is assigned a dedicated hardware component. Video encoding and decoding are handled by a separate chip, neural networks by another, and so on. The main processor is left almost entirely free for general tasks. In standard mode, it uses only 30-40% of its capacity (you can check this from the operating system). Therefore, when a situation calls for complex logic (evaluating a large number of factors, notifications, sending alarm data, monitoring its own parameters…), the videoblazer has a huge margin to perform all operations without delay.
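
One way to check the CPU load figure on a Linux-based system is to compare two samples of `/proc/stat`. The sketch below is a generic Linux technique, not a SpecLab tool; the function names are made up for this example, and whether `/proc/stat` is readable on the device’s modified Android build is an assumption.

```python
from typing import Tuple


def cpu_times(stat_text: str) -> Tuple[int, int]:
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # user, nice, system, idle, iowait, irq, softirq, steal
            values = [int(v) for v in fields[1:9]]
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            return idle, sum(values)
    raise ValueError("no aggregate 'cpu' line found")


def cpu_load_percent(before: str, after: str) -> float:
    """CPU utilization between two /proc/stat snapshots, in percent."""
    idle0, total0 = cpu_times(before)
    idle1, total1 = cpu_times(after)
    d_total = total1 - total0
    if d_total <= 0:
        return 0.0
    return 100.0 * (1.0 - (idle1 - idle0) / d_total)
```

On the device itself, the two snapshots would come from reading `/proc/stat` twice a few seconds apart; a result in the 30-40 range would match the figure quoted above.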

However, the frame rate when viewing in a browser depends on the power of the computer or device where the browser is running, because H.264/265 decoding requires resources. The videoblazer has dedicated chips for decoding video, whereas on a PC this work loads the CPU.

Scaling and Deinterlacing:

To display images on a monitor connected directly to the videoblazer via HDMI, 3D image processing filters are used to eliminate negative effects. The videoblazer’s hardware achieves the highest image quality, and this heavy processing does not load the main processor. When viewing in a browser, the graphics card of the client device is used instead. Without proper hardware support, the image may flicker, interpolate poorly, and show other artifacts, caused primarily by scaling.

Excellent Option for Video Walls:

Videoblazers can serve as receiving equipment for situation centers, outputting video streams to large screens with resolutions up to 8K. They replace PCs with large graphics cards, which are noisy, need cooling, are expensive to maintain, and take up a lot of space: videoblazers decode the video and deliver it to monitors without these issues.

Video Client:

Any computer, even a weak one, can act as a video client: a cheap laptop, a smartphone, or a smart TV. The client does not need a powerful graphics card for neural networks, because it receives pre-processed metadata from the neural networks, which is lightweight text.

The videoblazer outputs live IP camera streams exactly as the cameras themselves produce them, i.e., without re-compression. Metadata in JSON (with variations) accompanies the stream, and visual information about detected objects is overlaid on the video as geometric shapes with labels.

Any browser or more functional software can be connected to the videoblazer - the client part of the software for Windows is GOALcity (free for 4 cameras). The “Attention!” analytical panel can receive event-driven video clips.

Separate, partially free software is provided for tablet door phones when organizing event-based video surveillance in cottages. It is available for both Windows and Android tablets.

An open API allows developers to connect their own devices and programs on any OS.

Metadata:

Metadata about all objects detected by neural networks, with their location coordinates, is provided along with the video stream. To build situational logic, live channels are not needed; analyzing the metadata is sufficient. It is hundreds of times smaller in volume than live video, yet it fully conveys the formalized content of what is happening, so receiving programs do not need the video itself to perform the analysis.
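
To illustrate how a receiving program might work from metadata alone, here is a minimal sketch. The message layout (`camera`, `objects`, `class`, `confidence`, `box`) is invented for this example; the real field names are defined by the device’s API and are not given here.

```python
import json
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    label: str
    confidence: float
    box: Tuple[int, ...]  # x, y, width, height in pixels (assumed layout)


def parse_metadata(message: str) -> List[Detection]:
    """Decode one metadata message (hypothetical schema) into detections."""
    doc = json.loads(message)
    return [
        Detection(obj["class"], float(obj["confidence"]), tuple(obj["box"]))
        for obj in doc.get("objects", [])
    ]


# Example message; every field name here is an assumption.
sample = json.dumps({
    "camera": "cam-1",
    "objects": [
        {"class": "human", "confidence": 0.91, "box": [640, 200, 120, 310]},
        {"class": "car", "confidence": 0.78, "box": [100, 400, 380, 220]},
    ],
})

detections = parse_metadata(sample)
counts = Counter(d.label for d in detections)
print(dict(counts))
```

A message like this is a few hundred bytes, which is why logic built on metadata is so much lighter than logic built on the video itself.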

VIDEO ANALYTICS

Event-based video analytics is presented in two main modes:

Recording EVERYTHING: When user-defined objects of interest are detected, recording proceeds in N-second clips (for easy distribution). This yields a full neural network archive with cyclic self-deletion. People and other inherently mobile targets are always recorded; vehicles are recorded only while they are moving. Repeated frames (empty background) and interference are not recorded.
Initial Moments of Events: Determined by video semantics. N seconds are recorded at each scene change caused by new objects or by new behavior of existing ones (a person appears, another person or several appear, a car drives by, a weapon is detected, more weapon-like items are detected, a car license plate is detected, another or several license plates are detected, a QR code is scanned…). Targets that have already been detected do not trigger again unless they change their properties or movement patterns. For example, the same car’s license plate is not detected twice until it has left the field of view for a set time; after it reappears, it is detected again.
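
The license plate example describes a re-arm rule: a target raises a new event only after it has been out of view for a set time. A minimal sketch of that rule follows; the `EventGate` class and its parameter names are hypothetical and do not reflect the device’s internal implementation.

```python
from typing import Dict


class EventGate:
    """Suppress repeat events for a target until it has been absent for `rearm_s` seconds."""

    def __init__(self, rearm_s: float) -> None:
        self.rearm_s = rearm_s
        self._last_seen: Dict[str, float] = {}  # target id -> time of latest sighting

    def observe(self, target_id: str, now: float) -> bool:
        """Record a sighting; return True if it should start a new event."""
        last = self._last_seen.get(target_id)
        self._last_seen[target_id] = now
        return last is None or (now - last) >= self.rearm_s


gate = EventGate(rearm_s=10.0)
# The plate is seen at t=0, stays in view, disappears after t=5, returns at t=20.
for t in (0.0, 1.0, 5.0, 20.0):
    print(t, gate.observe("A123BC", now=t))
```

Because every sighting refreshes the timestamp, a target that stays in view never re-triggers; only a gap of at least `rearm_s` seconds re-arms it.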

Logic of Using Events:

Each video event (appearance of a video clip) in both modes can act as a trigger to activate the user-programmed Logic algorithm. It can also serve as one component of a compound condition in this logic.

Hard-coded Non-reconfigurable Mode:

Specialized versions (for specific orders) can have a dedicated reaction and recording mode. This simplifies logic configuration, but does not allow deviation from this logic. For example, immediate door closure on the first frame with a weapon.

In the basic version, it will take several seconds, and maybe even tens of seconds (depending on the settings), before a reaction occurs. This should be taken into account.

Unchangeable logic is also used in specialized products, such as Automatic Car Parking.

The user is only given a minimal number of settings for connecting devices, as well as tools for maintaining a database of car license plates.

The interface for systems that control traffic light objects or tools for recording traffic violations looks like a completely separate product.

In any case, the device’s purpose can be changed by requesting a different firmware.

For example, a home security system can be turned into an Automatic Car Parking system and vice versa.

Partial Visibility:

Objects recognized by the neural networks can overlap one another with a high degree of occlusion (in some cases up to 80%) and still be recognized with high accuracy, even at a distance.

A separate folder is designated for storing FACES for counting, long-term storage, simultaneous display, and search.

Neural Network Size:

The basic package uses an “S” category neural network size, which is sufficient in most cases. For complex outdoor sites with a particularly high number of disturbances, higher resolution neural networks of the “Extra Size” category are provided. An example is perimeter security in forests, with heavy precipitation, birds, insects (crawling on cameras), snow swirls, vegetation, wind loads, etc. And all this over long distances!

Such increased sensitivity is often used to secure government facilities, such as bridges, where the neural network must recognize drones and other targets at the greatest possible distance.

Neural Network Configuration:

SpecLab neural networks are developed based on artificial intelligence and therefore do not require special configuration. However, for users accustomed to tuning, an interface is provided for changing recognition accuracy, object sizes, and stable detection time, and for masking individual areas.

Masking is the most sought-after feature, since objects such as interior figurines may fall within the field of view. Stationary objects, however, can be ignored by the Videoblazer algorithms even without masking. Dynamic images on a television are harder to deal with: it is best to mask its screen completely.
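
The settings above can be pictured as a filter applied to each detection. The following sketch covers three of them (confidence threshold, minimum object size, masked zones) and leaves out stable detection time for brevity. All names and default values are assumptions for illustration, not the device’s configuration format.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height in pixels


@dataclass
class Det:
    label: str
    confidence: float
    box: Rect


def center_inside(box: Rect, zone: Rect) -> bool:
    """True if the center of `box` lies inside `zone`."""
    cx = box[0] + box[2] / 2
    cy = box[1] + box[3] / 2
    zx, zy, zw, zh = zone
    return zx <= cx < zx + zw and zy <= cy < zy + zh


def keep(det: Det, min_conf: float = 0.5,
         min_size: Tuple[int, int] = (20, 20),
         masks: Iterable[Rect] = ()) -> bool:
    """Apply the confidence, size, and mask settings to one detection."""
    if det.confidence < min_conf:
        return False
    if det.box[2] < min_size[0] or det.box[3] < min_size[1]:
        return False
    return not any(center_inside(det.box, zone) for zone in masks)


tv_screen = (1000, 200, 400, 300)  # hypothetical mask drawn over a television
on_tv = Det("human", 0.9, (1100, 250, 100, 200))
in_room = Det("human", 0.9, (100, 100, 100, 200))
print(keep(on_tv, masks=[tv_screen]), keep(in_room, masks=[tv_screen]))
```

With the television masked, a person shown on its screen is dropped while a person standing in the room is kept.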

Connectable Services and Clouds

Video recordings can be sent to the following destinations:

Telegram Messenger
Any regular BROWSER
“Attention!” Panel of the GOALcity program
Farwit cloud portal (Distant Witness)
Owner’s smartphone (currently through the cloud or “Attention!” panel)
Long-term event archive of the ACC
Client’s personal or corporate space via integrated protocol

Comprehensive LOGIC

Videoblazer can analyze both metadata from neural networks on video cameras and data coming from various sensors – in a comprehensive set of conditions for making decisions.

The user simply chooses the possible options for each element, described in simple human language.

The basic version offers 4 sensors with “dry contact” connection and evaluation of the line as closed or open. In specialized versions, protocols of third-party devices with IP or RS-485 connection are integrated, with evaluation of states of any level and type. This allows you to connect any systems, from turnstiles with intercoms to traffic light objects with anti-drone systems.

Device management works in the same way. The basic version has 4 powerful 220V relays on board. Third-party devices are integrated via IP or RS-485.

Logic is programmed in natural human language, as if a person were describing what they want done.
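
Underneath the natural-language wording, a rule of this kind combines a video condition with sensor states and maps the result to a relay. The sketch below shows one possible structure in Python. It is not SL++ and does not reproduce the device’s rule format; the `Rule` fields and the sensor and relay numbering are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Rule:
    name: str
    labels: FrozenSet[str]           # any of these detected labels satisfies the video condition
    sensors_closed: Dict[int, bool]  # sensor index -> required state (True = line closed)
    relay_on: int                    # relay index to switch on when the rule fires


def fired_relays(rules: Iterable[Rule], detected: Iterable[str],
                 sensor_state: Dict[int, bool]) -> List[int]:
    """Return the relays to switch on for the current detections and sensor states."""
    seen = set(detected)
    relays = []
    for rule in rules:
        video_ok = bool(rule.labels & seen)
        sensors_ok = all(sensor_state.get(i) == want
                         for i, want in rule.sensors_closed.items())
        if video_ok and sensors_ok:
            relays.append(rule.relay_on)
    return relays


# "If a weapon is seen while door sensor 1 is open, switch on relay 2."
rules = [Rule("weapon at open door",
              frozenset({"AK-47", "pistol", "rifle-shotgun"}),
              {1: False}, relay_on=2)]
print(fired_relays(rules, ["human", "pistol"], {1: False}))
```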

Cybersecurity

In addition to standard security protocols, the videoblazer uses a floating IP address to protect against a physical tap on the data network cable. Even if an attacker gets onto the site’s internal network, guessing the login and password of the device is not enough: the login and password of the router are also needed, because the device is bound only by its MAC address. That requires a high level of skill, provided the administrator is paying attention. The videoblazer can be reached only through the router. Even if the dynamically changing IP address is somehow captured, it will change again after an unpredictable interval. An insider therefore cannot install a permanent device or program that tracks the videoblazer.
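
In practice this means a legitimate client cannot hard-code the device address and has to resolve it by MAC each time. The sketch below assumes the router’s lease table is already available as (MAC, IP) pairs; how that table is obtained depends on the router and is not described here. The function names and sample addresses are placeholders.

```python
from typing import Iterable, Optional, Tuple


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.replace("-", ":").lower()


def current_ip(leases: Iterable[Tuple[str, str]], mac: str) -> Optional[str]:
    """Find the IP currently leased to `mac`, or None if the device is not listed."""
    wanted = normalize_mac(mac)
    for lease_mac, ip in leases:
        if normalize_mac(lease_mac) == wanted:
            return ip
    return None


# Placeholder lease table; addresses come from documentation ranges.
leases = [("02:00:00:AA:BB:01", "192.0.2.17"), ("02-00-00-aa-bb-02", "192.0.2.54")]
print(current_ip(leases, "02:00:00:aa:bb:02"))
```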

For highly protected objects, a specialized version is supplied without static IP and MAC addresses. The client program or browser extension is the only one that knows how to find the videoblazer on the network.

For Developers

Videoblazer can be supplied without firmware (at the price of the hardware) or with a basic set of functions for software development companies. With a convenient and economical device as the base, it is easy to create your own product with artificial intelligence capabilities. You do not need to notify the developer; the protocols of the device can be obtained here directly.
