TFL Designer + ARS = TFL Automation!

By Malan Bosman

April 30, 2025


At the recent PHUSE US Connect in Orlando, Clymb Clinical presented a workflow for TFL Automation. One slide from the presentation (screenshotted below) deserves a closer look, as it brings together the key components enabling this automation. By combining TFL Designer, CDISC’s Analysis Results Standard (ARS), the siera R package and Analysis Display Metadata, TFL automation becomes achievable. In this article, we will dive deeper into how this works, by looking at each of the components from the workflow below:


1. TFL Designer

Developed by Clymb Clinical, TFL Designer is a modern, web-based interface for designing TFL shells and capturing structured metadata. It offers point-and-click functionality, an intuitive UI, built-in security, user roles, and many more features to ease the shell creation and management process. Importantly, it captures metadata such as:

  • population sets
  • data subsets/“where clauses”
  • grouping variables
  • definitions of the statistical operations to be performed

Once shells are designed and metadata captured, two key metadata components are exported, which will be explained in more detail later:

i. The explicitly captured metadata mentioned above is structured in the backend into the format specified by the Analysis Results Standard (ARS) and exported as either JSON or Excel files.

ii. The information implicit in the shells’ layout (row and column names, headers and footers, labels, etc.) is exported as Analysis Display Metadata (ADM) in a JSON file.


2. Analysis Results Standard (ARS) Metadata

The Analysis Results Standard (ARS), developed by CDISC and published in April 2024, defines a Logical Data Model that links all metadata components needed to describe the process for generating results related to a Reporting Event (e.g., a Clinical Study Report [CSR]). Metadata is primarily structured in JSON format but can also be represented in Excel for ease of use. While ARS contains many components, four are particularly essential for defining how results are generated:

i. Analysis Sets: Each result in an output is subject to an analysis set (e.g. Safety Population). ARS metadata typically describes the corresponding “where clause” (e.g., SAFFL = "Y").

ii. Data Subsets: Further data subsets are specified as needed. For example, Treatment-emergent Adverse Events would likely have TEAEFL = "Y" in the ARS metadata.

iii. Analysis Grouping: Datasets (typically ADaM datasets) are grouped by one or more variables to produce the desired result (e.g., grouping by Treatment variables for tables where Treatment is displayed across columns).

iv. Analysis Methods: Once datasets are filtered and grouped, analysis operations/functions are performed on the data; these are captured in the Analysis Methods component of the ARS metadata. Examples include summary statistics like mean or median, comparative statistics like p-values, and counts and percentages for categorical variables.

The ARS metadata exported from TFL Designer contains all the necessary information for automating the generation of the results and can be ingested downstream as a JSON file.
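To make the four components more concrete, here is a minimal Python sketch of how they might sit together in the metadata and how a condition could be rendered as a “where clause”. The key names, IDs and the TRT01A grouping variable are simplified assumptions chosen for illustration; they are not the exact attribute names of the ARS model or of a TFL Designer export.

```python
# Simplified, hypothetical stand-in for ARS metadata. Key names and IDs are
# illustrative only and do not reproduce the exact ARS model attributes.
ars = {
    "analysisSets": [
        {"id": "AnalysisSet_SAF", "label": "Safety Population",
         "condition": {"dataset": "ADSL", "variable": "SAFFL",
                       "comparator": "EQ", "value": "Y"}},
    ],
    "dataSubsets": [
        {"id": "Dss_TEAE", "label": "Treatment-emergent Adverse Events",
         "condition": {"dataset": "ADAE", "variable": "TEAEFL",
                       "comparator": "EQ", "value": "Y"}},
    ],
    "analysisGroupings": [
        {"id": "AnlsGrouping_TRT", "groupingVariable": "TRT01A"},  # assumed variable
    ],
    "methods": [
        {"id": "Mth_CountPct", "operations": ["count", "percent"]},
    ],
}

# Assumed mapping from comparator codes to readable operators.
COMPARATORS = {"EQ": "=", "NE": "!=", "GT": ">", "LT": "<"}


def where_clause(condition):
    """Render one condition as a readable where clause, e.g. SAFFL = "Y"."""
    op = COMPARATORS[condition["comparator"]]
    return f'{condition["variable"]} {op} "{condition["value"]}"'


for component in ("analysisSets", "dataSubsets"):
    for item in ars[component]:
        print(item["label"], "->", where_clause(item["condition"]))
```

Because every filter, grouping and method is stored as data rather than free text, a downstream tool can read these entries and act on them without human interpretation.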


3. siera R Package

The exported ARS metadata is ingested by an “automation engine” for downstream automation. One such “engine” is the siera R package, which takes ARS metadata as input and, instead of directly generating the results, meta-programs R scripts (one R script per output defined in the ARS metadata). Each generated R script contains all necessary code, based on the ARS definitions of Analysis Sets, Data Subsets, Analysis Groupings and Analysis Methods, to produce the results for the relevant output. The R script also references the applicable ADaM dataset on which the results will be generated (also specified in the ARS metadata) and includes a call to load that dataset. The main function of the siera package is called readARS, and takes 3 parameters:

  • path to the ARS JSON file
  • location where R scripts should be produced
  • path to the folder containing ADaM datasets
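
The meta-programming idea (writing scripts rather than computing results) can be illustrated with a small sketch. This is not siera's implementation: siera is an R package that generates R scripts, whereas the sketch below is Python that generates Python, purely to show the pattern of turning three inputs (metadata, script location, dataset folder) into one script per output. The metadata layout, the write_scripts function and the CSV file format are all assumptions made for the example.

```python
import tempfile
from pathlib import Path

# Hypothetical, simplified metadata: one entry per output, each listing its analyses.
outputs = {
    "Out_01": [{"analysis_id": "An_01", "dataset": "ADSL", "variable": "SAFFL", "value": "Y"}],
    "Out_02": [{"analysis_id": "An_02", "dataset": "ADAE", "variable": "TEAEFL", "value": "Y"}],
}


def write_scripts(outputs, script_dir, adam_dir):
    """Write one script per output; no results are computed here."""
    written = []
    for output_id, analyses in outputs.items():
        lines = [f"# Auto-generated script for output {output_id}", "import csv", ""]
        for a in analyses:
            # The dataset path is only referenced in the generated code, not opened here.
            data_path = str(Path(adam_dir) / f"{a['dataset']}.csv")
            lines += [
                f"# Analysis {a['analysis_id']}",
                f"with open({data_path!r}, newline='') as fh:",
                "    rows = list(csv.DictReader(fh))",
                f"rows = [r for r in rows if r[{a['variable']!r}] == {a['value']!r}]",
                "",
            ]
        script_path = Path(script_dir) / f"{output_id}.py"
        script_path.write_text("\n".join(lines))
        written.append(script_path)
    return written


# Usage: generate the scripts into a temporary folder and show their contents.
with tempfile.TemporaryDirectory() as script_dir:
    for path in write_scripts(outputs, script_dir, "adam"):
        print(path.name)
        print(path.read_text())
```

The design choice worth noting is that the generator never touches the data: it only needs to know where the datasets will be when the generated scripts are eventually run.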

4. ADaM datasets

ADaM datasets serve as the foundation for generating the results for TFLs. Data transformations and calculations outlined in the ARS metadata are intended to be performed on ADaM datasets (specified within the metadata itself). Although the ADaM datasets are not directly ingested by the siera package, their paths are specified so that they can be called in the auto-generated R scripts.


5. Auto-generated R Scripts

The R scripts produced by the siera package contain ready-to-run R code that generates results as specified in the ARS metadata. These scripts are inspectable, promoting transparency throughout the results generation process. Each R script produces one Analysis Results Dataset (ARD) corresponding to a single output. The structure of the R scripts is consistent, regardless of the output, and follows the same pattern for each analysis in the ARS:

  • Apply analysis set
  • Apply data subset(s)
  • Apply analysis groupings
  • Apply analysis method
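
The four steps above can be sketched in a few lines. The generated scripts themselves are R; the Python below only mirrors the pattern on a handful of made-up rows. The data values, the SEX subset, the TRT01A grouping variable and the run_analysis function are hypothetical and exist only for this illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ADaM-like rows; a real script would load the ADaM dataset instead.
adsl = [
    {"USUBJID": "01", "SAFFL": "Y", "SEX": "F", "TRT01A": "Placebo", "AGE": 60},
    {"USUBJID": "02", "SAFFL": "Y", "SEX": "F", "TRT01A": "Active", "AGE": 50},
    {"USUBJID": "03", "SAFFL": "Y", "SEX": "F", "TRT01A": "Active", "AGE": 54},
    {"USUBJID": "04", "SAFFL": "N", "SEX": "F", "TRT01A": "Active", "AGE": 70},
    {"USUBJID": "05", "SAFFL": "Y", "SEX": "M", "TRT01A": "Placebo", "AGE": 41},
]


def run_analysis(rows, analysis_id):
    # 1. Apply analysis set (Safety Population)
    rows = [r for r in rows if r["SAFFL"] == "Y"]
    # 2. Apply data subset (an assumed subset: female subjects)
    rows = [r for r in rows if r["SEX"] == "F"]
    # 3. Apply analysis grouping (by treatment)
    groups = defaultdict(list)
    for r in rows:
        groups[r["TRT01A"]].append(r["AGE"])
    # 4. Apply analysis method: one result row per operation and group
    results = []
    for trt, ages in sorted(groups.items()):
        results.append({"analysisID": analysis_id, "operationID": "n",
                        "TRT01A": trt, "value": len(ages)})
        results.append({"analysisID": analysis_id, "operationID": "mean",
                        "TRT01A": trt, "value": mean(ages)})
    return results


for row in run_analysis(adsl, "An_Age"):
    print(row)
```

Each returned dictionary is one result, which is the row-per-result shape an ARD uses.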

6. Analysis Results Dataset (ARD)

The result of running the R scripts is one ARD per R script (per output). Each ARD is structured with one result per row, using a primary key composed of analysisID, operationID, and grouping variable(s). Additional metadata related to the result can also be added to the ARD as desired (e.g. the Analysis Set used, Data Subsets, output ID, etc.).

However, all of these are traceable from the ARS metadata using the primary key described above. The benefits of the ARD structure are that the results are in a structured format (as opposed to, e.g., table layouts, which could differ in structure for the same results), machine-readable (unlike static RTF/PDF outputs) and reusable. As a result, ARDs enable multiple downstream applications. They can:

  • support meta-analyses
  • serve as structured input for GenAI models
  • be reformatted into various TFL output layouts
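
The one-result-per-row structure and its primary key can be sketched as follows. The ARD rows, the "value" column name and the index_by_key helper are assumptions for illustration; only analysisID, operationID and the use of grouping variables in the key come from the description above.

```python
# Hypothetical ARD: one result per row, identified by analysisID, operationID
# and the grouping variable(s). The "value" column name is an assumption.
ard = [
    {"analysisID": "An_Age", "operationID": "n", "TRT01A": "Active", "value": 2},
    {"analysisID": "An_Age", "operationID": "n", "TRT01A": "Placebo", "value": 1},
    {"analysisID": "An_Age", "operationID": "mean", "TRT01A": "Active", "value": 52},
    {"analysisID": "An_Age", "operationID": "mean", "TRT01A": "Placebo", "value": 60},
]

GROUPING_VARS = ("TRT01A",)


def index_by_key(ard_rows, grouping_vars):
    """Index results by primary key and fail loudly if a key repeats."""
    index = {}
    for row in ard_rows:
        key = (row["analysisID"], row["operationID"]) + tuple(row[g] for g in grouping_vars)
        if key in index:
            raise ValueError(f"duplicate primary key: {key}")
        index[key] = row["value"]
    return index


results = index_by_key(ard, GROUPING_VARS)
print(results[("An_Age", "mean", "Active")])
```

Since any single result can be looked up by its key, the same ARD can feed different layouts or other consumers without re-running the analysis.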

7. Reformatting ARDs into Display Outputs

To reformat the ARD (containing all the results for a specific output) into the correct layout, display metadata is required to define elements such as column structure, row labels, titles, headers, and footnotes. This information is referred to as Analysis Display Metadata (ADM). In our TFL Designer workflow, the ADM is a JSON file containing such metadata, which is configured in the backend of the software based on the shell design the user specified in the point-and-click interface. Being a structured metadata file, it contains positions for the actual results (as opposed to the “XX.XX” placeholders in shells). Results from the ARDs are programmatically inserted into the ADM, creating a combined JSON file that contains both display metadata and populated results. This combined file can be rendered in, or imported into, TFL Designer (or TFL Viewer) to produce the desired output populated with results.
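
As a rough illustration of inserting ARD results into display metadata, the sketch below fills empty result positions in a display structure by matching on the ARD key. The article does not describe the actual ADM JSON layout produced by TFL Designer, so the structure here (title, columns, rows, cells) and the populate function are entirely hypothetical.

```python
import copy
import json

# Hypothetical ARD rows (same shape as one result per row, keyed by
# analysisID, operationID and a grouping variable).
ard = [
    {"analysisID": "An_Age", "operationID": "n", "TRT01A": "Active", "value": 2},
    {"analysisID": "An_Age", "operationID": "n", "TRT01A": "Placebo", "value": 1},
    {"analysisID": "An_Age", "operationID": "mean", "TRT01A": "Active", "value": 52},
    {"analysisID": "An_Age", "operationID": "mean", "TRT01A": "Placebo", "value": 60},
]

# Hypothetical display metadata: each cell is a position waiting for a result.
adm = {
    "title": "Summary of Age",
    "columns": ["Statistic", "Active", "Placebo"],
    "rows": [
        {"label": "n", "cells": [
            {"analysisID": "An_Age", "operationID": "n", "TRT01A": "Active", "value": None},
            {"analysisID": "An_Age", "operationID": "n", "TRT01A": "Placebo", "value": None},
        ]},
        {"label": "Mean", "cells": [
            {"analysisID": "An_Age", "operationID": "mean", "TRT01A": "Active", "value": None},
            {"analysisID": "An_Age", "operationID": "mean", "TRT01A": "Placebo", "value": None},
        ]},
    ],
}

KEY_FIELDS = ("analysisID", "operationID", "TRT01A")


def populate(adm, ard):
    """Return a copy of the display metadata with each cell filled from the ARD."""
    results = {tuple(r[k] for k in KEY_FIELDS): r["value"] for r in ard}
    combined = copy.deepcopy(adm)  # leave the original display metadata untouched
    for row in combined["rows"]:
        for cell in row["cells"]:
            cell["value"] = results[tuple(cell[k] for k in KEY_FIELDS)]
    return combined


combined = populate(adm, ard)
print(json.dumps(combined, indent=2))
```

Keeping layout and results separate until this final step means the same results could be dropped into a different display definition without being recomputed.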


8. Final Output Rendering

Importing the JSON file with ADM and ARD values into TFL Designer renders the actual output/display. (Note: TFL Viewer, currently under development by Clymb Clinical, will extend this functionality to managing generated outputs with a familiar user experience.) The outputs can then be exported as either RTF or PDF, completing the TFL generation workflow.

As shown, TFL automation can be achieved using a foundational CDISC standard (ARS), open-source technology (siera R package) and modern, web-based software (TFL Designer). We are excited to continue this development, enabling more teams to streamline their TFL generation processes, resulting in reduced cost, increased efficiency and higher quality. If you are interested in learning more, visit https://clymbclinical.com/siera/.
