adtl – another data transformation language
adtl is a data transformation language (DTL) used by some applications in Global.health, notably the ISARIC clinical data pipeline at globaldothealth/isaric and the InsightBoard project dashboard at globaldothealth/InsightBoard.
adtl is currently a prototype and is subject to major revisions.
Motivation
Most existing data transformation languages are XML dialects, though there are recent variations in other file formats. In addition, many DTLs use a custom domain-specific language. The primary purpose of adtl is to provide an easy-to-use Python library for basic data transformations, which are specified in either a JSON or TOML file. It is not meant to be comprehensive; adtl can be used as a step within a larger data processing pipeline.
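To make the idea of a file-driven transformation concrete, here is a minimal sketch in plain Python. It does not use the adtl library. The specification layout, the `transform_row` helper, and the column names are hypothetical and chosen only for illustration. The actual schema is described in the Specification section.

```python
import json

# Hypothetical, simplified specification: NOT adtl's actual schema.
# Each target field names a source column and, optionally, a value recoding.
SPEC_JSON = """
{
  "patient_id": {"field": "subjid"},
  "sex": {"field": "sex_code", "values": {"1": "male", "2": "female"}}
}
"""


def transform_row(row, spec):
    """Apply a declarative field mapping to one source row (illustrative helper)."""
    out = {}
    for target, rule in spec.items():
        raw = row.get(rule["field"])
        if "values" in rule:
            # Recode through the lookup table; unmapped values become None.
            out[target] = rule["values"].get(raw)
        else:
            out[target] = raw
    return out


spec = json.loads(SPEC_JSON)
source_rows = [
    {"subjid": "A001", "sex_code": "1"},
    {"subjid": "A002", "sex_code": "2"},
]
transformed = [transform_row(row, spec) for row in source_rows]
```

The point of the design is that the mapping lives in data rather than in code, so changing how a source dataset is transformed means editing the specification file, not the Python program.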
AutoParser
AutoParser provides a semi-automated method for writing the transformation files required by adtl, using LLMs for field and value mapping. This reduces the need for users to write JSON/TOML specification files by hand.
Getting started
Specification
AutoParser
Module reference