TRACY Objectives
The proposed TRACY solution is an open-source platform whose main objective is the uptake of an AI-based system by running large-scale pilots on LEAs' premises in a fully operational environment, in full respect of fundamental rights and personal data protection. For greater impact, the solution shall be validated by additional LEAs within the project, with the aim of being used after its completion. LEA personnel shall be trained on the project outcomes. TRACY will also provide a realistic anonymized dataset and share it with the EU LEA community.
Main Objectives
AI adaptation
As criminal acts become more and more technologically sophisticated, LEAs should keep up with state-of-the-art technologies and AI applications in order not to be left behind. Significant progress has been made in the exploitation and processing of non-content data provided by Electronic Communications Service Providers (ESPs). LEAs reported that non-content data played a determinative role in the investigation or prosecution of cases involving a specific group of crimes. Non-content data have also been exploited to corroborate or negate allegations and testimonies, in order to reinforce a case and other pieces of evidence. The absence of data can serve as evidence as well. In other cases, the existence of metadata can be helpful at the early stages of an investigation in order to exclude suspects or to identify more victims or potential perpetrators. Despite the progress that has already been made, LEAs reported that there is still a cluster of cases that cannot make use of non-content data in their favour. Crimes such as theft, organised and armed robbery and trafficking of stolen vehicles have the lowest percentage of non-content data exploitation. The primary reason is the lack of knowledge of data processing and AI modelling on the part of LEAs. TRACY aims to close this gap by demonstrating innovative ways of handling non-content data that could help LEAs deal with such crimes more effectively. The information provided by the ESPs is, most of the time, in the form of metadata. Taking that into consideration, TRACY aims to develop advanced machine learning algorithms that fall under the broad spectrum of unsupervised learning, such as dimensionality reduction, clustering and filtering, or that simply discover hidden patterns in the data.
The use of those algorithms by the TRACY appliances will be an asset in the hands of LEAs, significantly advancing the process of investigations and prosecutions. TRACY will include a six (6) month pilot phase implementation at TRL 8/9, as well as a training process related to this pilot implementation.
Promoting prototypes to tools
TRACY aims to provide LEAs with a TRL 8/9 platform that will fulfil strict criteria for scalability and security. Previously funded projects such as ROXANNE (GA 833635) and STARLIGHT (GA 101021797) aimed at delivering technology development up to TRL 4 to 6. Even though they contributed significantly to advancing the methodologies of criminal investigations by exploiting state-of-the-art AI algorithms for text, speech and image processing, LEAs were not actually equipped with such tools as final products. TRACY focuses on the call's main objective, which is to close the gap between prototypes and real products that can be used by LEAs after the end of the project. Due to the nature of the platform and the sensitive information that it will be handling, the final version of the TRACY solution will reside in a secured and monitored location inside the premises of the LEA. An additional layer of security would be the use of Single Points of Contact (SPOCs), which are already in use in two Member States (DE, FR). SPOCs are considered a safe way of transferring sensitive data, since the transmitting point is also located inside the ESP's premises and the communication channel is encrypted. In order to deliver a fully functional tool, there will be four versions of the TRACY solution (Development / DEV, Pre-production / PRE-PROD, Production / PROD, Demonstration / DEMO appliances). The call stresses the importance of promoting platforms that have proven their functionality in operational environments and offer significant assistance to LEAs. Taking that into consideration, TRACY's plan is to deliver a fully functional TRL 8/9 platform by organizing the development in three phases:
1. The first phase (System Development and Prototype Demonstration Phase) will take place in a lab environment, where the first instance of TRACY will be implemented (DEV appliance).
2. The second phase (Pilot Phase) aims to deliver the production environment, which will be developed by a project technical expert team in collaboration with LEAs. During this phase, police practitioners (HPOL) will state their needs and guide the technical team on any proposed modifications or advancements. The production environment consists of two instances, the pre-production and the production appliances, mainly for security purposes and rapid change management. These appliances shall fully comply with the LEA's security policies and shall be installed on its premises. They will be fully operational on real case data for at least the last six (6) months of the TRACY project.
3. The third phase (Validation Phase and Training) is where the validation takes place, with the contribution of the rest of the consortium LEAs (DEMO appliance). Additionally, within the third phase a training environment shall be developed, with the aim of standardising and building AI capabilities within the LEAs.
Technology Transfer, Training and Complementarity with Previous Projects
The focus here is on technology transfer and complementarity, mainly between ROXANNE (and other previous and running European projects) and TRACY. The work developed by TRACY, together with the experience of the ROXANNE partners (IDIAP, KEMEA, HPOL), will build on ROXANNE outcomes, with the intention of extending ROXANNE to the use of non-content data (metadata).
IDIAP is the coordinator of ROXANNE, which significantly contributes towards this goal by bridging the strengths of speech and language technologies (SLTs), visual analysis (VA) and network analysis (NA). Although non-content data was partially supported in ROXANNE, its main focus was on applying machine learning technologies to process contextual data. Since TRACY focuses entirely on non-content data (i.e., metadata), TRACY will build on the ROXANNE results related to the use of metadata for the investigation of large criminal cases. By transferring the technology from ROXANNE, TRACY fills a gap and therefore brings goals and results that are complementary to ROXANNE.
On-premise Large-Scale Pilot
The TRACY methodology will be applied to several real cases on the Production environment running on the LEA's premises for a period of at least 6 months. The pilot will apply the TRACY algorithms to suitable telecom metadata case studies, to reveal the suspects involved in the crimes investigated. During that phase, LEA practitioners will be trained in using the tool and leveraging its full potential.
Real World Cases
The TRACY methodology will be applied to real-world cases. Having obtained the appropriate training during the demonstration and pilot phases, LEAs will be ready to deploy the TRACY appliances on their own. The TRACY appliance will be installed in a secured place inside the LEA premises. The aim of the project is the permanent uptake of the solution among consortium LEAs after the piloting phase, ensuring that the proposed solution fulfils the requirements of usability, efficiency and compliance with EU standards as regards privacy and data protection.
Non-Discrimination and non-bias
The widespread use of machine learning technologies in making consequential decisions about individuals has been accompanied by increased reports of instances in which the algorithms and models employed can be unfair or discriminatory in a variety of ways. As a result, research on fairness (e.g. dealing with bias) in machine learning and statistics has seen rapid growth in recent years, and several mathematical formulations have been proposed as metrics of fairness for a number of different learning frameworks. TRACY is aware of the great importance of promoting fair predictions and mitigating possible discrimination and bias against minorities. In line with this, although the data that will be leveraged in this project are mainly metadata (thus not introducing biases), TRACY will make use of practices that promote fairness where needed and explore the broad landscape of fairness algorithms.
Conform to Legal Frameworks
The metadata lawfully obtained from ESPs touch the boundaries of fundamental human rights such as the right to respect for privacy, since personal data are generated by human activity, more specifically the presence of the phone owner in a certain place at a certain time, as well as their trajectory over time and space. Our objective is to reduce the need for identifiable personal information to the bare minimum necessary to achieve the objectives of TRACY, that is, the minimum at which our algorithms reach satisfactory performance. The true identity of the phones and their owners is not necessary for our methodology until the suspects' phones are located. Then, the normal release process of LEAs can be applied to obtain more information about phones and owners (CDRs, owner information, etc.). The TRACY consortium includes partners specialised in legal aspects and ethics, who will constantly monitor and address ethical, legal and social issues connected with the project at every stage. More importantly, the TRACY development methodology incorporates a Data Protection Impact Assessment (DPIA) as an indispensable step of the iterative implementation process, so as to ensure that personal data are secure throughout the TRACY project life cycle.
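One way to work without true identities until suspects are located is to replace device identifiers with keyed pseudonyms before analysis. The following is a minimal sketch under stated assumptions, not the TRACY implementation: the function name, the key and the record layout (device identifier, cell identifier, timestamp) are all hypothetical, and keyed hashing with HMAC-SHA256 is our illustrative choice.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a stable keyed pseudonym for a device identifier.

    The same identifier and key always give the same pseudonym, so
    records of one device can still be linked during analysis, while
    the pseudonym cannot be recomputed without the key.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key, assumed to be held only by the LEA.
key = b"example-key-held-by-the-LEA"

# Hypothetical records: (device identifier, cell identifier, timestamp in seconds).
records = [
    ("device-0001", "cell-1", 0),
    ("device-0002", "cell-1", 120),
    ("device-0001", "cell-2", 700),
]

# The analysis side only ever sees the pseudonymised records.
pseudonymised = [(pseudonymise(d, key), c, t) for d, c, t in records]
```

In this sketch the key holder can recompute the pseudonym of a known identifier to find its records, which mirrors the idea that further information is requested through the normal release process only once a device becomes relevant.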
Create Data Space for Security and LEAs
It is a primary objective of the TRACY project to create and provide realistic but not real datasets, based on the experience stemming from operational cases, and to share them with the LEA ecosystem. Metadata provided by ESPs are a vital part of the usual LEA investigation methodology for tackling organized crime and terrorism. Nevertheless, the vast and rapidly increasing amounts of these data, due to the continuous digital transformation (5G, IoT, smartphones, smart vehicles etc.), make it extremely difficult or even impossible for digital investigators to analyze them, especially in cases of crimes committed in dense urban areas. Additionally, LEAs use methods and evidence-gathering instruments and measures designed for physical evidence, which are inadequate to address emerging challenges in the digital world. Moreover, criminal networks and terrorists adopt new cryptographic methods of communication which are difficult for LEAs to intercept. In this investigative field at least, LEAs still have in place obsolete, unsuitable investigative methods and tools. Consequently, in practice, digital investigators are discouraged from exploiting ESP metadata to identify patterns and correlations in cases of crimes committed in dense urban areas, even if they are the only source of evidence.
The TRACY project shall run a large-scale AI-based pilot for at least 6 months, conducted by an LEA (HPOL) and installed on its premises in a stand-alone secure environment (TRACY PRE-PROD and PROD appliances), using operational cases to validate the technology maturity on real operational datasets. This is vital, since it will help bridge the gap between prototypes (i.e., typically up to TRL 7) and systems proven in an operational environment that bring clear value to police practitioners (i.e., TRL 8/9). Additionally, the TRACY concept, as an e-Evidence Platform, plays a significant role in safeguarding fundamental rights (such as the right to respect for private life and the right to data protection). This is mainly because the proposed solution is based on telecommunication providers' non-content traffic data, which by their nature do not provide any specific information on their subject and thus may be considered a less privacy-intrusive police investigation method, avoiding inaccurate, biased, or even discriminatory outcomes.
The consortium does, however, recognize that lawfully processing non-content communications data (such as traffic data) may nonetheless allow precise conclusions to be drawn about the private lives of individuals. It will therefore strive to ensure that any processing of such non-content data is conducted in accordance with the applicable laws and regulations (i.e. the ePrivacy Directive and national laws) and is subject to proper oversight. The relevant case law of the Court of Justice of the European Union (CJEU) and of the national courts of the relevant jurisdictions will be analysed and given due consideration. The consortium is aware that TRACY might have an impact on human rights, and hence will carry out a human rights impact assessment. Human rights will also be addressed in the training of law enforcement personnel.
Taking data privacy and compliance with the GDPR framework into serious consideration, TRACY will handle data that contain only the spatial information of the base stations serving each device in a targeted region for a limited period, and not the devices' exact location (base station locations are not confidential data). Consequently, TRACY algorithms will use an estimate of the location of each active device and not the exact one.
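To illustrate what an estimate based only on serving base stations can look like, the sketch below averages the coordinates of the base stations that served a device during a period, so that stations seen more often weigh more. This is an assumption made for illustration, not the TRACY estimation method; the function name, cell identifiers and coordinates are hypothetical, and a plain average of latitude and longitude is only reasonable over a small region.

```python
from typing import Dict, List, Tuple


def estimate_location(
    serving_cells: List[str],
    stations: Dict[str, Tuple[float, float]],
) -> Tuple[float, float]:
    """Estimate a device position as the mean of its serving base stations.

    serving_cells: cell identifiers observed for one device, one entry
        per observation (repeats act as weights).
    stations: cell identifier -> (latitude, longitude) of the base station.
    """
    if not serving_cells:
        raise ValueError("at least one observation is required")
    lats = [stations[cell][0] for cell in serving_cells]
    lons = [stations[cell][1] for cell in serving_cells]
    return sum(lats) / len(lats), sum(lons) / len(lons)


# Hypothetical base station coordinates.
stations = {"cell-1": (38.0, 23.0), "cell-2": (38.2, 23.4)}

# Hypothetical observations of one device over a limited period.
observed = ["cell-1", "cell-1", "cell-2", "cell-2"]

lat, lon = estimate_location(observed, stations)
```

The result is a coarse position that depends only on the base station locations, never on the exact position of the device.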
The TRACY solution consists of three building blocks: i) the TRACY Methodology, ii) the TRACY Platform and iii) the TRACY appliances. The description and details of each building block are provided in the following sections.
In the long term, our ambition for TRACY as an e-Evidence Platform is to enable LEAs across the EU to take up this AI solution to support their efforts to identify networks of organised crime and terrorism, based on telecom metadata. The identification will be carried out by following the 'digital traces' that mobile phones leave as they operate. The traces are also combined with the co-located traces of other mobile phones, in both time and space, thus leading to the potential identification of networks of organised crime and terrorism, even when no direct communication has taken place between the suspected members of the network(s) under investigation.
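The idea of combining traces that are co-located in time and space can be sketched as follows. This is a simplified illustration under our own assumptions, not the TRACY methodology: two devices are treated as co-located when they are served by the same cell within the same fixed time window, and the record layout, the function name and the 10-minute window are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def colocation_counts(records, window_s=600):
    """Count, per device pair, the (cell, time window) slots they share.

    records: iterable of (device_id, cell_id, unix_timestamp) tuples.
    window_s: length of a time window in seconds (assumed value).
    """
    # Group devices by the cell and time window in which they were seen.
    slots = defaultdict(set)
    for device, cell, timestamp in records:
        slots[(cell, timestamp // window_s)].add(device)

    # Every pair of devices sharing a slot gets one co-location.
    counts = defaultdict(int)
    for devices in slots.values():
        for a, b in combinations(sorted(devices), 2):
            counts[(a, b)] += 1
    return dict(counts)


# Hypothetical records: no calls between devices are needed, only presence.
records = [
    ("dev-A", "cell-1", 0),
    ("dev-B", "cell-1", 120),
    ("dev-A", "cell-2", 700),
    ("dev-B", "cell-2", 900),
    ("dev-C", "cell-2", 5000),
]

pairs = colocation_counts(records)
```

Device pairs with many shared slots are candidates for closer inspection even though no communication between them appears in the data. Fixed windows are a deliberate simplification: two observations a few seconds apart can fall on either side of a window boundary and be missed.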
The TRACY e-Evidence methodology, though mature, has not yet been validated by LEAs in real case scenarios, since in most cases evidence-gathering instruments and measures designed for physical evidence are not yet fully adapted to the digital Big Data ecosystem. Indeed, the EC's "Study on the retention of telecom metadata for law enforcement purposes" highlighted that, on average, only 20% of LEA requests to telecommunication providers regarding large-scale non-content data have proven determinative, mainly because of a lack of knowledge of how to find relevant evidence within broad datasets. Indicatively, LEAs assume that the utility of such data is particularly limited in dense urban areas; this is precisely the assumption that the TRACY Platform aims to change.
From a data perspective, while the TRACY project will make full use of real operational data in stand-alone LEA environments to assess, validate and better train AI systems, it will additionally gather and provide pseudo-operational data (anonymized datasets) that can be used to train, test and validate AI systems, thus contributing to the creation of a Data Space for Security and law enforcement.