WP1 - Re-evaluation of available AI techniques and architectures

Identification of limitations / constraints and possibilities to improve the technology to be adopted in conjunction with the specific requirements.


This activity consolidated the Beneficiary's requirements for its current operations around one specific objective: validating the identity of persons. The preliminary requirements were analyzed closely in order to identify the use cases and define the operational work scenarios: Scenario 1, identification of persons by searching a database of identities; Scenario 2, re-identification of a person from a reference by searching video streams and recorded videos.
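Scenario 1 can be read as a nearest-neighbour search over stored identity descriptors. The following is a minimal sketch only, assuming that each identity is represented by a fixed-length feature vector and that cosine similarity is used for ranking; neither assumption is stated in this report, and the names `cosine_similarity`, `search_gallery` and the `person_*` identifiers are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def search_gallery(query, gallery, top_k=3):
    """Return the top_k (identity, similarity) pairs, best match first."""
    scored = [(identity, cosine_similarity(query, vec))
              for identity, vec in gallery.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy database of identities (assumed 3-dimensional descriptors).
gallery = {
    "person_A": [1.0, 0.0, 0.0],
    "person_B": [0.0, 1.0, 0.0],
    "person_C": [0.7, 0.7, 0.0],
}
ranked = search_gallery([0.9, 0.1, 0.0], gallery, top_k=2)
```

In this toy example the query is closest to `person_A`, followed by `person_C`. Scenario 2 would use the same ranking step, with the gallery built from detections in video streams rather than from an identity database.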


During the study of person re-identification methods, a series of methods from the current literature were collected and analyzed: methods that perform re-identification on closed data sets based on static features, dynamic features, gait and skeleton characteristics; heterogeneous, noise-robust, semi-supervised and unsupervised re-identification methods; end-to-end detection and re-identification methods; and methods for re-identifying groups of individuals. Methods for hiding and anonymizing faces and the feasibility of processing on embedded hardware platforms were also analyzed, and a number of multi-modal fusion methods were studied as a way to improve identification results. More than 100 bibliographic references selected from the literature were reviewed.


This activity analyzed sensors and video data sources, in particular the data models and standards to be implemented so that the solution covers as many video sources as possible and remains agnostic to the type of video source. As a result, ONVIF was selected as the standard for connecting video sources.


This activity analyzed the legislative framework in force regarding the lawful acquisition and processing of personal data by the components of the system to be implemented, and how to secure the data flows between those components. The main aspects of the study are the following:

  • Presentation of the legislative framework on the protection of individuals with regard to the processing of personal data and on the free movement of such data;
  • Documentation on security techniques and mechanisms applied in communication networks;
  • Documentation and analysis of the security services available to secure an information system;
  • Presentation of the cryptographic algorithms used to ensure the confidentiality and integrity of data;
  • Proposed methods for securing network communications that can be used in the project;
  • Proposed methods for processing video and image recordings, such as blurring portions of video recordings;
  • Proposed methods for securing information stored in databases: measures to protect access, to ensure data integrity and to ensure data confidentiality.


During this activity, we analyzed existing architectures and their respective advantages and disadvantages. Two distinct working topologies were identified: (i) server-side processing and (ii) on-the-edge processing. Centralized systems supported the development of early networks and were the only option until decentralized systems emerged, enabled by hardware advances that deliver adequate processing performance even on nano-PC devices, embedded camera CPUs and similar platforms. Decentralized systems are less prone to malfunctions and provide faster access times, a considerable improvement over older systems. Because complex projects often combine the two topologies, the study concluded that the developed system must support both. The result is a mixed architecture with both a distributed and a centralized processing component, which minimizes the weaknesses of each topology and builds on its strengths.


We identified 32 databases: 12 image databases, 4 occlusion databases, 4 attack databases, 2 RGB-IR spectrum databases, 9 video databases, and 1 textual-visual database. The most important details of the selected databases were presented, together with the main modern methods that achieve top results on them; the results of over 175 systems were reported, most of them from recent works.

WP2 - Design, development and implementation of innovative algorithms and software modules for the final solution


In this activity, the requirements for the AI algorithms implemented in the project were detailed and the high-level specifications for the software modules were finalized. For each module, the report details the programming languages used for the implementation, the software libraries used in development, and the hardware characteristics of the platforms on which the module was implemented and tested. Six main software modules were defined. They provide the functionality for identifying and re-identifying people using silhouette characteristics, as well as the security of the processed data and its faster processing.


AI algorithms were implemented for static feature processing, dynamic feature processing, textual-visual information retrieval, data fusion, query augmentation and onboard processing. In total, 25 algorithms were designed and implemented to cover the project scenarios, and were optimized on test data.


The data security software module was created to ensure the anonymity of video streams (face and body masking), the protection of personal data (selective encryption) and the security of data transmitted online over an IPv4 network (prototype with a Teensy board).
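The masking step can be illustrated on a single frame. This is a simplified sketch under stated assumptions, not the project's implementation: the frame is a grayscale image stored as a list of rows, the bounding box is assumed to come from an upstream face or body detector, and the region is hidden by replacing it with its mean intensity. The function name `mask_region` and the box layout are hypothetical.

```python
def mask_region(frame, box):
    """Return a copy of frame with the box replaced by its mean intensity.

    frame: list of rows of integer pixel values (grayscale).
    box:   (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    pixels = [frame[r][c] for r in range(top, bottom) for c in range(left, right)]
    if not pixels:
        raise ValueError("empty bounding box")
    mean = sum(pixels) // len(pixels)
    masked = [row[:] for row in frame]  # copy so the original frame is untouched
    for r in range(top, bottom):
        for c in range(left, right):
            masked[r][c] = mean
    return masked


frame = [
    [10, 10, 10, 10],
    [10, 200, 100, 10],
    [10, 50, 250, 10],
    [10, 10, 10, 10],
]
# Assumed detector output: a 2x2 region in the centre of the frame.
masked = mask_region(frame, (1, 1, 3, 3))
```

After masking, every pixel inside the box carries the same value, so no detail of the face or body can be recovered from the masked frame, while pixels outside the box are unchanged.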


Video source management modules were created for the real-time acquisition and processing of video streams from analog, digital, static or mobile video sources (RTP, RTSP, ONVIF Profiles S, G and T, WSN), with vertical (Docker container) and horizontal scalability.


The user interface module was implemented through the KVision NVR system, together with all the processing facilities of the Integrated Processing System: processing system selection, information visualization, alarm definition and system parameterization.


The user management module was created by implementing access control mechanisms based on the rule-based DAC (Discretionary Access Control) model. A system administrator can grant access to operators and users as needed. The access control mechanisms use Active Directory technology and LDAP (Lightweight Directory Access Protocol).
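The grant-and-check logic of such a model can be sketched as follows. This is an illustrative in-memory example only: in the system described here, users and groups would be resolved through Active Directory and LDAP, which the sketch does not attempt to reproduce. The class `AccessControl`, the resource name `camera_12` and the permission names are hypothetical.

```python
class AccessControl:
    """Minimal discretionary access control: the administrator grants rights."""

    def __init__(self, admin):
        self.admin = admin
        self.grants = {}  # (subject, resource) -> set of permissions

    def grant(self, grantor, subject, resource, permission):
        # Only the administrator may hand out permissions in this sketch.
        if grantor != self.admin:
            raise PermissionError(f"{grantor} may not grant access")
        self.grants.setdefault((subject, resource), set()).add(permission)

    def revoke(self, grantor, subject, resource, permission):
        if grantor != self.admin:
            raise PermissionError(f"{grantor} may not revoke access")
        self.grants.get((subject, resource), set()).discard(permission)

    def is_allowed(self, subject, resource, permission):
        if subject == self.admin:
            return True  # the administrator has full access
        return permission in self.grants.get((subject, resource), set())


acl = AccessControl(admin="admin")
acl.grant("admin", "operator1", "camera_12", "view")

can_view = acl.is_allowed("operator1", "camera_12", "view")
can_export = acl.is_allowed("operator1", "camera_12", "export")
```

Here `operator1` can view the stream but cannot export it, and any attempt by a non-administrator to grant rights raises `PermissionError`. Denying by default keeps access to personal data limited to what was explicitly granted.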


Two types of real scenarios were defined and recorded on video: (i) static characteristics (e.g. clothing, objects, posture) and (ii) dynamic characteristics (e.g. movement dynamics, posture). Various video sources, shooting conditions and identities were used.


TRL5 validation of the algorithms was performed using 9 databases as well as recorded data. The obtained results approach a 90% detection performance. Further solutions for advanced optimization and improvement were identified and will be implemented in the final stage of the project.
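The report does not state which metric underlies the performance figure. As an illustration only, the sketch below computes rank-k accuracy, a measure commonly used in the re-identification literature: the fraction of queries whose true identity appears among the first k results. The function name and the toy data are hypothetical and are not project results.

```python
def rank_k_accuracy(rankings, ground_truth, k=1):
    """Fraction of queries whose true identity appears in the top-k results."""
    if not rankings:
        raise ValueError("no queries to evaluate")
    hits = sum(
        1 for query_id, ranked in rankings.items()
        if ground_truth[query_id] in ranked[:k]
    )
    return hits / len(rankings)


# Toy rankings returned by a re-identification system (illustrative data).
rankings = {
    "query_1": ["id_7", "id_2", "id_9"],
    "query_2": ["id_4", "id_3", "id_1"],
    "query_3": ["id_5", "id_8", "id_6"],
    "query_4": ["id_2", "id_6", "id_4"],
}
ground_truth = {
    "query_1": "id_7",
    "query_2": "id_3",
    "query_3": "id_5",
    "query_4": "id_4",
}
rank1 = rank_k_accuracy(rankings, ground_truth, k=1)
rank3 = rank_k_accuracy(rankings, ground_truth, k=3)
```

On this toy data two of the four queries are correct at rank 1 and all four are correct within rank 3, which shows how the reported value depends on the chosen k.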


AI algorithms were integrated into the Integrated Processing System and their functionality was validated, according to the specifications.


Based on the results of the activities in the current stage, test procedures were developed for each of the relevant work scenarios. The testing techniques consisted of checking that the tested scenarios are suitable for use, establishing inputs, setting constraints, executing processes and estimating the quality of the outputs.

WP3 - Development of the system demonstrator. Testing and validation


The project aims to enable real-time identification and re-identification of individuals in video streams, using silhouette biometrics, for enhanced security and public protection. It utilizes state-of-the-art technologies and software modules to ensure high-performance results. GDPR compliance and data confidentiality are key considerations. The activity discusses access control modules, detailing their functionalities and actions within the KVision user interface, ensuring effective access control in the final system.


This activity describes the development of testing procedures for the final solution, specifically related to relevant work scenarios. It involves identifying the relevant testing scenarios, describing the techniques used for verification, considering the testing of these scenarios, and outlining the testing procedures for each scenario. The activity aims to ensure the proper testing and validation of the solution’s functionality and performance. An integrated platform facilitates data processing by interconnecting modules and handling audio/video streams from diverse sources. The user-friendly application allows operations on data streams, such as opening media files or live sources. Multiple video processing algorithms can be applied within a single module, and alarms generated by these algorithms are transmitted to the user. Testing techniques involve verifying scenario suitability, setting inputs, establishing constraints, executing processes, and assessing output quality. The project’s internal testing procedure aligns with relevant scenarios and verification methods.


To achieve real-time identification and re-identification of individuals in video streams, various software modules and advanced neural network algorithms were developed, including the “privacy mask” module, the video source management module, and the User Account Control module. The developed algorithms were packaged in Docker containers, and the modules and algorithms were integrated through REST API calls, which let the client and server communicate using HTTP methods. The “privacy mask” module supports GDPR compliance and data confidentiality by providing tools for masking recorded video streams and encrypting real-time data. The User Account Control module manages user access and rights to minimize access to personal data acquired from video sources. Four algorithms were implemented: Dynamic features detection, Static global features detection, Static local features detection, and Search Augmentation. This activity involved testing and validating the integrated modules and algorithms within the KVision user interface, showcasing their respective functionalities to the user.
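The client-server exchange over REST can be sketched with the Python standard library. This is a hedged illustration, not the project's API: the endpoints `/algorithms` and `/instances`, the JSON fields and the example RTSP address are assumptions made for the example, and only the four algorithm names come from this report.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

ALGORITHMS = ["dynamic_features", "static_global_features",
              "static_local_features", "search_augmentation"]


class AlgorithmHandler(BaseHTTPRequestHandler):
    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):  # list the available algorithms
        if self.path == "/algorithms":
            self._send_json(200, {"algorithms": ALGORITHMS})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):  # start an algorithm instance on a video source
        if self.path != "/instances":
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length) or b"{}")
        if request.get("algorithm") not in ALGORITHMS:
            self._send_json(400, {"error": "unknown algorithm"})
            return
        self._send_json(201, {"status": "started",
                              "algorithm": request["algorithm"],
                              "source": request.get("source")})

    def log_message(self, format, *args):
        pass  # keep the example quiet


def call(port, method, path, payload=None):
    """Send one HTTP request and return (status code, decoded JSON body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    body = json.dumps(payload) if payload is not None else None
    headers = {"Content-Type": "application/json"} if body else {}
    conn.request(method, path, body=body, headers=headers)
    response = conn.getresponse()
    result = response.status, json.loads(response.read())
    conn.close()
    return result


server = HTTPServer(("127.0.0.1", 0), AlgorithmHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_port

listing = call(port, "GET", "/algorithms")
created = call(port, "POST", "/instances",
               {"algorithm": "dynamic_features",
                "source": "rtsp://camera.example/stream"})
rejected = call(port, "POST", "/instances", {"algorithm": "unknown"})

server.shutdown()
server.server_close()
```

The GET request reads a resource and the POST request creates one, which mirrors how a user interface could list the containerized algorithms and start an instance on a selected video source.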


This activity focused on integrating a hardware-software prototype system for the developed solution (TRL7). The aim is to create a prototype that brings innovation to the security and public protection field. The main tasks included integrating software modules and AI algorithms into the KVision user interface, testing the prototype system to ensure its functionality, and describing its features. This involved explaining the registration and authentication process, showcasing the capabilities of the KVision interface for administrators and operators, and discussing the User Account Control module’s different functionalities for each user type. The processing of video streams and the selection of algorithms were covered, along with result generation and user notification for triggered alarms. The activity concluded with testing the TRL7 prototype system and describing its capabilities through the KVision interface.


This activity involved the final testing of the integrated solution in real or representative conditions for real work scenarios. The results of testing the functionalities of the DeepVisionRomania integrated platform were presented. This platform is designed to process audio/video data streams from various sources and to transmit the results generated by the processing. The testing followed the procedures developed in Activity III.2, which covered login operations, application configuration, adding media files and live sources, adding, stopping and deleting instances, and closing sources. The testing results were reported together with observations on how the platform functions and responds to operations. Based on the testing activities, it can be concluded that the tested platform meets the requirements set by the client and the performance parameters specified in the requirements.


This activity describes the final testing and validation of the solution. It presents the results of testing the functionalities of the integrated DeepVisionRomania platform and of the video stream security solution. The testing was conducted on specific test scenarios, in collaboration with the beneficiary, and included generating alarms using the available algorithms, blurring faces and individuals, and evaluating the video stream security solution. The results of the testing confirm that the platform meets the requirements set by the beneficiary and the specified performance parameters. The validation was performed at TRL7 (Technology Readiness Level 7), indicating a high level of maturity for the solution.


This activity involved the integration and validation of six innovative modules: (i) Static Features Processing, which uses local and global representations to detect and re-identify individuals accurately; (ii) Data Fusion, which combines relevant information from different sources for more precise results; (iii) Dynamic Features Processing, which analyzes dynamic characteristics to extract valuable insights about people’s behavior and movement in surveillance images; (iv) the Textual-Visual Information Retrieval Pilot Module, which generates detailed textual descriptions of individuals using the CLIP neural network; (v) Query Augmentation, which improves search results by adding information to queries; and (vi) Embedded Processing, which enables fast and efficient data processing by distributing the workload across the security network. Each module was presented and analyzed in terms of its utility in identifying individuals of interest, its implementation details, its video data processing, and its main processing steps. The qualitative results obtained from testing and validation on various databases, taken from the current literature or created by the project partners, were presented. Finally, the limitations of each module and possible ways to mitigate them were discussed.
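One common way to combine information from several modules is score-level fusion. The sketch below is an assumption-laden illustration, not the project's Data Fusion module: it supposes that each module returns one similarity score per candidate identity, normalizes the scores per module with min-max scaling, and combines them with fixed weights. The module names, weights and scores are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale one module's scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {identity: 0.0 for identity in scores}  # no ranking information
    return {identity: (s - lo) / (hi - lo) for identity, s in scores.items()}


def fuse_scores(module_scores, weights):
    """Weighted sum of per-module normalized scores for each identity."""
    total_weight = sum(weights[module] for module in module_scores)
    fused = {}
    for module, scores in module_scores.items():
        for identity, s in min_max_normalize(scores).items():
            contribution = weights[module] * s / total_weight
            fused[identity] = fused.get(identity, 0.0) + contribution
    return fused


# Assumed similarity scores from two modules for three candidate identities.
module_scores = {
    "static": {"id_1": 0.82, "id_2": 0.75, "id_3": 0.40},
    "dynamic": {"id_1": 0.30, "id_2": 0.90, "id_3": 0.20},
}
weights = {"static": 0.6, "dynamic": 0.4}

fused = fuse_scores(module_scores, weights)
best = max(fused, key=fused.get)
```

In this toy case the static module alone would rank `id_1` first, but the strong dynamic score moves `id_2` to the top after fusion. Normalizing per module keeps a module with a wider score range from dominating the combined result.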


This activity focused on developing the documentation and the exploitation strategy for the project’s results. The KVision platform was enhanced with new modules and with additional functions in existing software modules. The project team documented these modifications following best practices in software development and user documentation, including the User Manual. The market and financial analysis concluded that the tools developed within the project meet the growing and increasingly demanding requirements of potential beneficiaries in national security and in private market sectors. Both the components and the integrated system developed in the project can be exploited as products that generate future benefits for end users, manufacturers and team members. Participation in this research and development (R&D) project brings significant advantages to Softrust, the academic partners and the overall market. The benefits identified from the achieved results and the successful collaboration of the consortium members include:

  • Innovation: By investing in R&D, Softrust has developed innovative modules, products, and services that meet consumer needs or have the potential to create new markets;
  • Increased competitiveness: R&D investments allow Softrust to gain a competitive advantage by developing products or services superior to those currently available in the market;
  • Increased company value: R&D investments are generally viewed as a long-term increase in the company’s value through the development of intangible assets such as intellectual property;
  • Talent attraction and retention: Talented and innovative employees are attracted to R&D activities and projects as they offer opportunities for personal and professional development;
  • Social and economic impact: By developing products and services that meet societal needs and by fostering knowledge within the consortium members, there will be a positive impact on the environment and the economy.

The solution developed in the DEEPVISIONROMANIA project is modular and scalable. Its modularity offers multiple possibilities for exploitation and use, customization to specific needs, and a foundation on which each consortium member can develop future solutions. The market for applications in security and other private domains is growing. The DEEPVISION tools can also be used to protect critical infrastructure and industrial zones, as well as in the retail and banking sectors. Given the current context, demand could increase further, and the product can find niche applications.