Data science analysis of large datasets (5 days)
This specialized training program is designed to equip law enforcement professionals with the skills necessary to tackle
the challenges of analyzing extensive digital datasets. Trainees will learn to navigate various data formats, types, and
sources, mastering the methodologies required for effective "big data" analysis.
Through a blend of theoretical knowledge and practical application, trainees will delve into the intricacies of
unstructured data, complex data structures, and meta-data analysis. The course emphasizes the importance of
processing high-volume, high-speed data, focusing on pre-selection, filtering, visualization, and the enhancement and
enrichment of data sets. It also covers translation, cloud concepts, and relational, graph, and object-oriented databases.
Topics:
• Understanding the terminology and approaches to "big data," including common issues and potential
solutions.
• Optimization of data collection and management within classical and cloud-based environments.
• "What do I have?": initial exploration of an unfamiliar data set ("data swimming").
• Techniques for data pre-selection, filtering, and visualization to enhance and enrich information.
• Strategies for dealing with unstructured data, complex data structures, and meta-data analysis.
• Processing high volume and high-speed data, addressing data quantity challenges.
• Translation and Optical Character Recognition (OCR) for unsearchable files.
• Exploration of structured and unstructured data within relational, graph, and object-oriented databases.
• Utilization of a toolbox for data mining and text mining, transitioning from keyword searches to Technology
Assisted Review (TAR).
• Statistical analysis of structured data using business intelligence applications, including data import and the
creation of visualizations and dashboards.
• Best practices for creating effective visual representations such as timelines, scatter plots, and histograms, and
for handling categorical data.
• Advanced dashboard creation featuring customized filters, data enhancement, and enrichment techniques.
• Methods for reducing lag, uncertainty, and noise in data sets, and for supporting decision-making processes.
This course is tailored to address the unique data challenges faced by law enforcement agencies, providing practical
skills and knowledge to enhance investigative capabilities and operational efficiency. Join us to become proficient in
the latest data science techniques and transform the way you analyze and interpret data within the field of law
enforcement. Trainees will gain operational data science skills: data ingestion, data cleansing, data
processing, and data storage; SQL and NoSQL databases; visualization and visual analytics concepts and skills; processing
text data (named entity recognition, indexing, topic identification); processing large datasets (tabular data,
relationship data, and content data such as e-mail); and using external services via API (e.g. text translation or object
recognition in images). They will also gain an understanding of data processing pipelines (ETL / ELT), data science project
organization, data and AI governance, data storage and manipulation, distributed and cloud computing, and advanced visual
analytics concepts.
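To illustrate the ingestion, cleansing, and storage steps named above, the following is a minimal ETL sketch using only the Python standard library. It is not course material: the sample records, the table name `transfers`, and the helper names `extract`, `transform`, and `load` are all assumptions made for this example.

```python
import csv
import io
import sqlite3

# Hypothetical raw input: inconsistent case, stray spaces, one unparsable amount.
RAW_CSV = """transfer_id,account,amount,currency
1,ACC-001, 1200.50 ,EUR
2,acc-002,n/a,EUR
3,ACC-001,89.99,eur
4,ACC-003,430.00,USD
"""

def extract(text):
    """Extract: read CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize text fields and drop rows with invalid amounts."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"].strip())
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        clean.append((
            int(row["transfer_id"]),
            row["account"].strip().upper(),
            amount,
            row["currency"].strip().upper(),
        ))
    return clean

def load(conn, rows):
    """Load: write the cleaned rows into a relational table."""
    conn.execute(
        "CREATE TABLE transfers ("
        "transfer_id INTEGER PRIMARY KEY, account TEXT, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO transfers VALUES (?, ?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database for the example
load(conn, transform(extract(RAW_CSV)))

# Query the loaded data: total amount per account.
totals = conn.execute(
    "SELECT account, ROUND(SUM(amount), 2) FROM transfers "
    "GROUP BY account ORDER BY account"
).fetchall()
print(totals)
```

In an ELT variant, the raw rows would be loaded first and cleaned afterwards with SQL inside the database.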
Prerequisites
Basic programming knowledge, basic Python coding skills, and in-depth knowledge of analytics and analytical thinking
(this is NOT a beginner’s course).
www.insig2.com
Hourly training agenda
Day/hours Data Science
Day 1
08:00 - 09:00 Opening ceremony for all groups
09:00 - 10:00 Data Sources & Data Types
10:00 - 10:30 Coffee break
10:30 - 12:00 Preparation of (Structured) Data I
12:00 - 13:00 Lunch
13:00 - 14:45 Data Visualization and Visual Analytics
14:45 - 15:15 Coffee break
15:15 - 17:00 Exercises – Data Preparation and Visualization
Day 2
08:00 - 10:00 Recap Day 1, ETL / ELT
10:00 - 10:30 Coffee break
10:30 - 12:00 Data Structuring & Data Engineering I (relational data)
12:00 - 13:00 Lunch
13:00 - 14:45 Practical Lessons – SQL
14:45 - 15:15 Coffee break
15:15 - 17:00 Exercises – SQL, Data Ingestion, Joins, …
Day 3
08:00 - 10:00 Recap Day 2, Data Science Project Workflow
10:00 - 10:30 Coffee break
10:30 - 12:00 Bulk Operations on Data, Distributed & Cloud Computing
12:00 - 13:00 Lunch
13:00 - 14:45 Practical Lessons – Working with pandas DataFrames, Addressing Remote Services via API
14:45 - 15:15 Coffee break
15:15 - 17:00 Exercise – Benford’s Law for detecting suspicious money transfers
Day 4
08:00 - 10:00 Recap Day 3, Data Science & Data Engineering II (NoSQL)
10:00 - 10:30 Coffee break
10:30 - 12:00 Processing Unstructured Data
12:00 - 13:00 Lunch
13:00 - 14:45 Practical Lessons – Document Databases and Named Entity Recognition
14:45 - 15:15 Coffee break
15:15 - 17:00 Exercise – using NER for indexing large text corpora in document databases
Day 5
08:00 - 10:00 Course Recap, Open Questions, (Optional) Outlook on (Generative) AI
10:00 - 10:30 Coffee break
10:30 - 12:00 Final Exam
12:00 - 13:00 Certification ceremony
13:00 - 14:00 Lunch
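As a rough illustration of the Day 3 exercise on Benford's Law, the sketch below compares the leading-digit distribution of a list of amounts with the Benford expectation log10(1 + 1/d) using a chi-square statistic. It is not the course's own solution: the function names and both data sets are synthetic assumptions made for this example. Powers of two stand in for naturally spread amounts, and values just above 9000 stand in for a suspicious cluster. A large deviation is only a signal for closer review, not proof of wrongdoing.

```python
import math
from collections import Counter

# Chi-square critical value for 8 degrees of freedom at the 5% level.
CHI2_CRITICAL_8DF = 15.507

def leading_digit(amount):
    """Return the first non-zero digit of a number, or None for zero."""
    for ch in repr(abs(amount)):
        if ch in "123456789":
            return int(ch)
    return None

def benford_expected(digit):
    """Expected share of values whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)

def chi_square_vs_benford(amounts):
    """Chi-square distance between observed leading digits and Benford's Law."""
    digits = [d for d in map(leading_digit, amounts) if d is not None]
    if not digits:
        raise ValueError("no non-zero amounts to analyze")
    counts = Counter(digits)
    n = len(digits)
    return sum(
        (counts[d] - n * benford_expected(d)) ** 2 / (n * benford_expected(d))
        for d in range(1, 10)
    )

# Synthetic example data (assumptions, not real transfers).
typical = [2 ** k for k in range(1, 301)]        # spans many orders of magnitude
structured = [9000 + 3 * k for k in range(300)]  # every amount starts with 9

for label, amounts in (("typical", typical), ("structured", structured)):
    stat = chi_square_vs_benford(amounts)
    verdict = "review" if stat > CHI2_CRITICAL_8DF else "no flag"
    print(f"{label}: chi-square = {stat:.1f} -> {verdict}")
```

The test is only meaningful for amounts that span several orders of magnitude. Data with a narrow natural range, such as fixed fees, will deviate from Benford's Law for innocent reasons.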