Questions and answers

Tricky Questions in Video Surveillance Technologies

1. Why is this tiny box more powerful than massive video servers?

Strictly speaking, they are not servers but something closer to gaming rigs; the term "server" is somewhat misused here. Servers and server-grade operating systems are designed for managing network traffic. Their architecture prioritizes data integrity, and the repeated checks that every byte has arrived correctly make them inherently slow. That’s why servers are poorly suited to fast video streams. For video surveillance, gaming PCs are more appropriate: they have overclocked buses and powerful GPUs. After all, even a single uncompressed Full HD stream can exceed what a typical gigabit network card carries. That’s why video compression (typically H.264 or H.265) is performed directly in the cameras. But once that compressed video reaches the PC, it has to be decoded somehow. How?

A powerful GPU is essential for intelligent video surveillance. Traditional server platforms usually ship without one, so in essence video surveillance on a PC is a job for gaming-class hardware. It seems a bit ironic to rely on “toys” for something as serious as security.
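The bandwidth argument above can be checked with simple arithmetic. The sketch below is only an illustration: the 25 frames per second and 24 bits per pixel are assumptions that the text does not state, and a camera pipeline that subsamples color would produce a lower figure.

```python
def uncompressed_bitrate_bps(width: int, height: int, fps: int, bits_per_pixel: int = 24) -> int:
    """Raw video bitrate in bits per second, with no compression at all."""
    return width * height * bits_per_pixel * fps


# Nominal line rate of a 1 GbE port, before any protocol overhead.
GIGABIT_ETHERNET_BPS = 1_000_000_000

# Assumed stream: Full HD, 25 fps, 24-bit color.
full_hd_bps = uncompressed_bitrate_bps(1920, 1080, fps=25)

print(f"Uncompressed Full HD: {full_hd_bps / 1e9:.2f} Gbit/s")
print(f"Fits into gigabit Ethernet: {full_hd_bps <= GIGABIT_ETHERNET_BPS}")
```

Under these assumptions one raw stream already needs more than a gigabit link can carry, which is why the cameras compress before sending.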

2. Let's rephrase the question: why is this tiny box more powerful than personal computers?

If we measure performance in teraflops (TFLOPS), then no, it’s not more powerful. But we’re talking about one specific task: video surveillance. Here the main workload is decoding H.264/H.265, since every stream arrives from the cameras over the network in compressed form. And a typical PC has no dedicated chip per stream for this kind of decompression.

The PC architecture is not designed for video surveillance, because decompression there is done in software on the central processor (CPU). All the streams queue up, slow down, and interfere with each other. As a result the graphics card also works inefficiently, because data reaches it in a disorderly fashion, even when hardware codec support is used.
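To see why streams that queue up on one processor interfere with each other, it helps to look at the time budget per frame. The sketch below is a rough model, not a measurement: it assumes 25 frames per second, Full HD frames, a single worker serving every channel in turn, and decoded frames stored at 1.5 bytes per pixel (8-bit 4:2:0). None of these figures comes from the text above.

```python
def per_frame_budget_ms(channels: int, fps: float) -> float:
    """Time available per frame when one worker decodes every channel in turn."""
    return 1000.0 / (channels * fps)


def decoded_rate_mb_s(channels: int, width: int, height: int, fps: float,
                      bytes_per_pixel: float = 1.5) -> float:
    """Volume of decoded pixels per second, in megabytes (10**6 bytes)."""
    return channels * width * height * bytes_per_pixel * fps / 1e6


for channels in (1, 8, 16):
    budget = per_frame_budget_ms(channels, fps=25)
    rate = decoded_rate_mb_s(channels, 1920, 1080, fps=25)
    print(f"{channels:2d} channels: {budget:5.2f} ms per frame, {rate:7.1f} MB/s decoded")
```

With 8 channels the shared worker has 5 ms per frame. Whether a particular CPU meets that budget is something to measure on real hardware; the sketch only shows how quickly the budget shrinks as channels are added.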

In Videoblazer, each video stream has its own hardware chip for decompression, analysis, logic, compression, re-encoding, and everything else. First, this architecture keeps the data flow orderly; second, it doesn’t suffer from the slowdowns seen on PCs.

In addition, none of the Videoblazer’s processing power is wasted on anything else. It doesn’t have to serve a monolithic system like Windows, which constantly “thinks” for seconds at a time. Open Task Manager on any Windows PC and, even when no computations are running, you’ll see constant spikes in CPU usage, often caused by antivirus programs. The Videoblazer has none of the hundreds of background tasks typical of PCs, such as indexing or disk caching, that slow down the entire system.

By the way, there are no viruses for the Videoblazer yet. For those seeking extra security, a traffic encryption program is available. Additionally, the basic package includes floating IP technology, which makes the system hard to reach without authorization, even from inside the network.

3. But most video servers on PCs support around a hundred video channels, while you only have 8 or 16. Doesn’t this seem too few?

Well, let’s reveal the trick behind the large channel counts on personal computers! Even an 8-channel Videoblazer, not to mention a 16-channel one, handles more than any PC actually can. The reason is that the Videoblazer processes the primary streams from the cameras, which are high resolution: Full HD (1920×1080). On personal computers you are shown secondary streams, which rarely exceed the outdated VGA (640×480) or D1 (4CIF) formats. In multi-camera mode you don’t even notice the difference, because the windows are too small. To get the same level of detail for analysis as in Full HD, you would need to install seven times more cameras.

Your neural network will be able to detect only one car instead of seven. It also won’t be able to read a license plate or other text in the frame.

Object Recognition Quality
1 Videoblazer channel = 7 PC channels

Power
1 Videoblazer with 8 channels = 1 PC with 56 channels
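
The 1:7 and 8:56 figures follow from pixel counts. Here is a minimal sketch of that arithmetic, assuming the secondary stream is 640×480 and that usable detail scales with the number of pixels, which is a simplification:

```python
import math

FULL_HD = (1920, 1080)  # primary stream resolution quoted in the text
VGA = (640, 480)        # secondary stream resolution quoted in the text


def pixels(resolution: tuple) -> int:
    """Total pixel count of a (width, height) pair."""
    width, height = resolution
    return width * height


def pc_equivalent_channels(videoblazer_channels: int) -> int:
    """Low-resolution channels needed to cover the same number of pixels."""
    per_channel = math.ceil(pixels(FULL_HD) / pixels(VGA))
    return videoblazer_channels * per_channel


ratio = pixels(FULL_HD) / pixels(VGA)
print(f"Full HD / VGA pixel ratio: {ratio}")
print(f"8 Full HD channels ~ {pc_equivalent_channels(8)} VGA channels")
```

The exact ratio is 6.75, which the comparison above rounds up to 7.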

On a PC you can multiply the channels, but only at the cost of reduced visibility and a larger number of cameras. On a PC you will pay for 56 video recording licenses and 56 video analytics channels, while with Videoblazer everything is included.

"But the recording always uses the primary stream, and it’s high resolution!" say the tricksters. But to record compressed streams you don’t even need a PC; a network disk is enough. So what exactly are we comparing here? There is no real achievement in this. What matters much more is decompressing these heavy streams for analysis by neural networks, because compressed streams cannot be analyzed. And here’s the truth: a personal computer can’t handle this kind of load, primarily because it has no H.264/H.265 chips, only software decompression on the CPU. And the CPU simply can’t handle large volumes. Even with 8 channels it starts to lag and drop data, simply discarding frames.

"But there is the option to offload this task to the graphics card," our opponents argue. Theoretically, yes, but every frame will still be processed by the CPU—it manages the interaction with the graphics card through the bus and in a highly sequential manner. Moreover, based on our experience, no one has ever achieved reliable performance in this process. Typically, it ends with users being asked to either disable this support or use it in its most limited form.

Connecting 8 camera channels to a single device doesn’t mean the system is limited. Up to 1000 devices can be integrated into a single system and displayed on a single, even weak, computer as one application. No significant power is needed on the PC, as all tasks are handled by the Videoblazers. The PC only receives pre-processed data.

Moreover, this architecture is far more fault-tolerant. First, if any one device fails, you lose only 8 channels. Second, the entire system isn’t tied to a single room or electrical outlet. It’s more logical to place the devices where their 8 or 16 cameras are located, which makes installation easier and keeps each group independent. No vandal can instantly ruin everything.
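
The fault-tolerance argument can be put in numbers. The sketch below compares how much of a site goes dark when one box fails; the 96-camera site is a hypothetical example, not a figure from this page.

```python
import math


def units_needed(cameras: int, channels_per_unit: int = 8) -> int:
    """Number of units required to connect every camera."""
    return math.ceil(cameras / channels_per_unit)


def share_lost_on_single_failure(cameras: int, channels_per_unit: int) -> float:
    """Fraction of cameras lost when one fully loaded unit fails."""
    return min(cameras, channels_per_unit) / cameras


CAMERAS = 96  # hypothetical site size

central = share_lost_on_single_failure(CAMERAS, channels_per_unit=CAMERAS)
distributed = share_lost_on_single_failure(CAMERAS, channels_per_unit=8)

print(f"Units needed at 8 channels each: {units_needed(CAMERAS)}")
print(f"One central server fails: {central:.0%} of cameras lost")
print(f"One 8-channel unit fails: {distributed:.1%} of cameras lost")
```

The model ignores shared dependencies such as the network switch or the power feed, so it describes the best case for both layouts.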

4. Neural networks only work when connected to the network; will everything fail if the Internet goes down?

No, all the neural networks are autonomous and run on the Videoblazer itself. Disconnecting the data transmission network has no impact on them. In fact, Videoblazer is all about security. And security, by definition, is about reliability. That’s why the core functions all run locally.

Videoblazer offers increased reliability even during power outages. How is your current video surveillance set up—everything in one room, powered by a single outlet? Any random event or unauthorized person could take down everything! The Videoblazer boxes are no longer tied to specific rooms; they can be placed outdoors, mounted on poles, hung from ceilings, or even buried in basements. If one fails, only 8 cameras will be affected.

We hear your question: but isn't it more convenient to service everything in one place? You don’t need to service it anymore! There are no moving parts in Videoblazers, no fans, no openings for dust, not even a risk of water damage—you can spill a kettle of tea or coffee, and nothing will break. What exactly is there to service?

And if a grenade does hit a Videoblazer, the unit is quick to replace, with no setup and no operating system installation.

5. A follow-up question immediately comes to mind: You say that all the core functions work inside the unit, but is there anything, even non-core, that works only over the network—outside the unit?

At the moment, no, but it is planned. Specifically, we will integrate ChatGPT via the API provided by OpenAI. However, these will be service functions only, such as voice control, convenient archive search, creating analytical reports based on the smart archive, quick search for specific moments and their connections, and so on.

Currently, we are open to integration with any other devices. You can use our API or integrate any third-party protocol with us.

6. So, what exactly is the revolution?

The latest family of object detection models, YOLOv8, doesn’t just work faster and more accurately than its predecessors. It also recognizes small objects and low-light scenes very well, though its creators modestly write little about it. For us, developers in a specialized lab who were building neural networks back in the days of Tsar Gorokh, the capabilities YOLOv8 offers today are a truly mind-blowing breakthrough. Computers have never been so close to human-like vision. Life moves forward, and in a couple of years we will undoubtedly marvel at something even more grandiose. But for now, YOLOv8 already represents the ability of machines to perform tasks at the same cognitive level as humans. This means it is now possible to treat a computer’s conclusions like human opinions. When it comes to visual perception of the world around us, it no longer matters who makes the decision in a given situation. Yes, mistakes are possible, but with the same probability whether the decision-maker is a machine or a human.

Well, the one thing still not replicated is human intuition. Most likely, this vague term will undermine trust in machines for a long time. What is intuition? It’s a sense of information that lacks logic, yet is presented as the highest form of informational accuracy. This is something computers will hardly handle anytime soon. Only someone with pure intuition could grasp ‘that thing I don’t know what it is’.

And, just like in a fairy tale, luck is on our side: YOLOv8 models exported to the ONNX format fit seamlessly onto the latest chips designed specifically for neural network computations. Now all this power is packed into a small box that replaces an expensive computer with a bunch of outrageously costly GPUs for neural networks. You know very well, even from our pricing, how expensive video servers for neural networks can be.

The most important thing, of course, is not the budget—for security, you need the highest possible reliability and often the ability to operate in extreme conditions. You can’t hang huge video servers on poles or bury them in a damp barn, but the Videoblazer is a super device for any type of facility.

7. Judging by the price, it's not exactly cheap as chips, is it?

The question is incorrectly framed: cheap as chips compared to what? Compared to a Chinese DVR, it’s expensive, but compared to the nearest competitor, a powerful PC with three nVidia cards, it’s ten times cheaper. Let’s agree, this is not a penny-pinching video recorder. But the goal isn’t just to record how you’re going to be killed. Neural networks recognize threats, and that, as you’ll agree, is an entirely different matter. Sometimes it feels like the main thing in security is to find someone to blame. Life is worth more than any proof of guilt.

Additionally, you’re saving a pretty penny on electricity—Videoblazers consume far less power than you’d expect.

8. Your presentation features a lot of neural networks, but in the price list, most of them require extra thousands of rubles. Do you think that's a bargain?

Looks like you're referring to the add-ons. The core software packages definitely include the essential neural networks that cover the needs of the vast majority of users—probably around 90% of people. The rest are intended for large enterprises that also require customization for their unusual or demanding conditions. That’s exactly why we’ve priced these separately—as a fee for localization and adaptation.

Naturally, as more of these custom-trained models become common, they’ll be included in the standard package. So our pricing should be seen as evolving—and not necessarily upward. Sometimes things get cheaper, not more expensive.

9. We are developers working with YOLO. Can we use your device for our own purposes with our custom neural networks?

Yes, we provide an API, not just for YOLO but for any neural network exported to the ONNX format. You can simply buy the hardware, with no software license required. Alternatively, you can develop your own extension on top of our base software, or use our training tool to build your own datasets and train models on them.
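
Before bringing a custom network to the device, developers may want to confirm that their exported ONNX file loads and runs at all. The sketch below does that on an ordinary workstation with the open-source `onnxruntime` and `numpy` packages. It is not the Videoblazer API, which this page does not describe; the function names, the 640×640 input size, and the single-input float32 NCHW layout are all assumptions for illustration.

```python
def nchw_shape(width: int, height: int, channels: int = 3, batch: int = 1) -> tuple:
    """Tensor shape in batch/channel/height/width order (assumed model layout)."""
    return (batch, channels, height, width)


def smoke_test(model_path: str, width: int = 640, height: int = 640) -> list:
    """Load an ONNX file and push one all-zeros frame through it on the CPU."""
    # Imported lazily so the shape helper above works without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name  # assumes a single image input
    frame = np.zeros(nchw_shape(width, height), dtype=np.float32)
    outputs = session.run(None, {input_name: frame})
    # Only the output shapes are returned; decoding boxes is model-specific.
    return [output.shape for output in outputs]
```

Calling `smoke_test("my_model.onnx")`, with the path replaced by your own file, returns the output tensor shapes if the model loads and runs. It says nothing about accuracy or about performance on the device itself.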

10. We liked the idea of the Event Log. For security companies, this format might be convenient. But wouldn’t it be better to display all these cropped figures of people and vehicles as tiles filling the entire screen, updating as new events come in? Kind of like an image viewer. Don’t you think that would be a better approach?

Yes, the first version was designed with private security companies in mind — that’s the logic behind it. In the near future, there will be multiple options for displaying what we call the "instant archive" (though the name is still under wraps — so keep it secret!).

But the most exciting part is that, thanks to AI, Videoblazer will soon be able to clean up the background, making objects stand out more clearly — without distracting details. This work is already in progress, and if you purchase a device now, you’ll be able to upgrade for free once it's released.

11. Your promo about 8K sounds impressive, but as I understand it, that’s just for recording to disk, which doesn’t really require much processing power. What’s the actual resolution used for display and analysis? Like everyone else, do you use a secondary stream just under 1 MP, something like D1?

We're glad to see experts reaching out to us — true connoisseurs of technology. Until very recently, all intelligent video surveillance systems worked exactly as you described. Videoblazer is the first device to perform all operations using a single stream — and that stream is at maximum resolution. There’s no option in the settings to assign a second or third stream. It decodes one video stream, analyzes it, records it, and can re-encode it to generate video alerts. The same high-resolution stream is decoded and displayed on monitors — even in 8K. Our hardware-based chips handle all of this without adding load to the main processor — unlike conventional PCs.

By the way, the second and third streams from the cameras can still be used for other systems and tasks.

Of course, there’s a small catch. Our marketing refers to the total processing power: 8K is the figure for a single channel. In typical installations the system handles 8 Full HD channels or 12 HD channels. But let’s be fair: Full HD has roughly twice the pixels of a 1 MP stream and nearly seven times the pixels of a 640×480 secondary stream, a size still commonly used for video analytics. And when you actually see the system in action, it’s truly impressive to witness how today’s cutting-edge neural networks perform object recognition at Full HD quality, not on some tiny secondary stream where even the smartest AI wouldn’t stand a chance of finding anything useful.

By the way, we already have orders for an 8K camera for drones. Can you imagine the level of precision recognition it provides from above! It’s also extremely useful for assessing the situation at a smart intersection with just one camera, keeping everything in one place.

12. Videoblazer – it's still a computer, right? What exactly differentiates a PC-based video surveillance system from yours?

Even a simple TV is also a computer. The architecture hasn’t changed since the creation of humans or even animals: a processor, memory, power units, and a power system... :) But there are always differences.

Just as a personal computer can host a wide variety of systems, from an accounting server to a video surveillance system, a Videoblazer can also be set up for hundreds of different use cases. The only difference between the Videoblazer and a PC is the hardware architecture, which is much better suited to video processing. It has hardware decoding for each video channel, which improves the quality of the video streams. On a PC, all channels are processed by a single CPU; in general almost everything is handled by the central processor, which switches frantically between tasks. That is why the video constantly lags and stutters, and everything freezes while the operating system is caching. The Videoblazer has independent hardware channels for decoding, neural networks, analysis, sending metadata-tagged video to client applications, compression, and so on. All the data is evaluated by the Videoblazer’s central processor, which in normal operation is never loaded beyond 30% of its capacity. Therefore, even during peak loads, the Videoblazer handles all tasks without delays. And if you check the CPU usage in a PC-based video surveillance system, you’ll always see it above 100% 😊

13. And yet, you wouldn’t claim that your neural chips are more powerful than nVidia cards, would you?

It would be foolish to claim that; anyone can compare our specifications, which are almost completely open. However, using super-expensive nVidia cards for most video surveillance tasks is like using a cannon to shoot at sparrows. Our neural chips are excellent at recognizing objects and tracking their movements, and that’s more than enough for most tasks.

Naturally, for complex logical chains, we also use nVidia ourselves. Clients periodically commission us to develop highly original projects. But progress moves so fast that, over time, we adapt these complex neural networks to the new version of the Videoblazer, and to the client’s delight, they discard the PC and replace it with the Videoblazer.

New questions will be posted here...
