Open Source Tools and Methods for Open Source Investigations

The Sudanese Archive identifies, collects, and preserves digital content in a variety of formats from a variety of sources and platforms. Content is stored in a centralised database and then accessed by our open-source tools to search, investigate, verify, categorise, and expand. Rather than applying out-of-the-box technology solutions to the content challenges faced by the Sudanese Archive team, we take a user-centred approach to developing or contributing to custom open-source modular solutions (e.g. Sugarcube), allowing our teams of investigators to work collaboratively on our data pipeline.


[Tech diagram]

Sources and Collection

Our database collects data from a list of sources of various types. We acquire data and posts from these sources daily. Source types include social media channels (Twitter, Facebook, YouTube), submitted files (videos, PDFs), and external and collaborators' data sets. Changes in these sources are tracked, meaning that all versions are saved.
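The "all versions are saved" behaviour can be sketched as an append-only store: each save adds a timestamped snapshot instead of overwriting the previous one. This is a minimal illustration, not the Archive's actual database code; the `VersionedStore` class and its method names are hypothetical.

```python
from datetime import datetime, timezone

class VersionedStore:
    """Hypothetical append-only store: updates never overwrite earlier versions."""

    def __init__(self):
        self._versions = {}  # unit id -> list of timestamped snapshots

    def save(self, unit_id: str, data: dict) -> None:
        # Append a copy of the data with a UTC timestamp; earlier snapshots remain.
        snapshot = {
            "data": dict(data),
            "saved_at": datetime.now(timezone.utc).isoformat(),
        }
        self._versions.setdefault(unit_id, []).append(snapshot)

    def latest(self, unit_id: str) -> dict:
        # The most recent snapshot's data.
        return self._versions[unit_id][-1]["data"]

    def history(self, unit_id: str) -> list:
        # Every saved version of the unit, oldest first.
        return [v["data"] for v in self._versions[unit_id]]
```

When a tracked source changes, saving the unit again simply adds a new snapshot, so earlier versions stay retrievable via `history`.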

Processing / Data Pipeline

Each unit of data in our database goes through our data pipeline. In this pipeline we detect the language, standardise the data format (while keeping the original format as well), and perform other transformations. We also screenshot and download the web page the information came from.
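A pipeline step like this can be sketched as a function that adds derived fields while preserving the original payload. This is an illustrative sketch only: the `detect_language` heuristic below (classifying by dominant Unicode script) stands in for whatever real language detector the pipeline uses, and the `process` function and field names are assumptions.

```python
import unicodedata

def detect_language(text: str) -> str:
    # Hypothetical stand-in for a real language detector: classify the text
    # by whether most of its letters belong to the Arabic script.
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return "und"  # undetermined
    arabic = sum(1 for ch in letters if "ARABIC" in unicodedata.name(ch, ""))
    return "ar" if arabic / len(letters) > 0.5 else "en"

def process(unit: dict) -> dict:
    # Standardise the unit but keep the original payload alongside,
    # mirroring the "keep the original format as well" rule.
    return {
        "original": dict(unit),
        "text": unit.get("text", "").strip(),
        "language": detect_language(unit.get("text", "")),
    }
```

Because the untouched input is carried along in `original`, later standardisation changes can always be re-run from the source data.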

Enigio Timestamp

Files in our database are hashed with both MD5 and SHA-256. These hashes are timestamped with Enigio Time, a third-party collaborator.
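Computing both digests is straightforward with Python's standard library; the sketch below shows the hashing step only (the Enigio Time timestamping happens with the external service, whose API is not shown here). The `file_hashes` helper name is an assumption.

```python
import hashlib

def file_hashes(data: bytes) -> dict:
    # Compute both the MD5 and SHA-256 hex digests of the file contents.
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }

hashes = file_hashes(b"example file bytes")
```

Recording two independent digests means a unit can still be matched and its integrity checked even if one hash algorithm is later considered weak.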

Verification Tools

We use a mix of tools to verify our units. Because data is never lost, we can use a variety of tools to verify or edit new fields in the data and then merge them back into our centralised database.

Currently, we use a mixture of Google Sheets and Check.
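The merge-back step can be sketched as namespaced annotation: fields produced by an external verification tool are attached under their own key, so nothing in the original unit is ever overwritten. This is an assumed shape, not the Archive's actual schema; `merge_verified` and the `verification` field are hypothetical.

```python
def merge_verified(original: dict, verified_fields: dict) -> dict:
    # Attach externally verified fields under a separate "verification"
    # namespace so original fields are never overwritten or lost.
    merged = dict(original)
    merged["verification"] = {
        **original.get("verification", {}),
        **verified_fields,
    }
    return merged
```

Keeping verification results in their own namespace means annotations from Google Sheets, Check, or any future tool can be merged in repeatedly without clobbering the collected data.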


debian / nodejs / sugarcube / python / mongodb / nginx / react

This website is served as flat HTML files. The database interface is a React application that calls an API.

Sugarcube - Data Pipeline


Contact Us to talk about how we could help you with your project.