
At the recent PHUSE US Connect in Orlando, Clymb Clinical presented a workflow for automating Tables, Figures, and Listings (TFLs). One slide from the presentation (screenshotted below) deserves a closer look, as it brings together the key components that enable this automation. Combining TFL Designer, CDISC’s Analysis Results Standard (ARS), the siera R package, and Analysis Display Metadata (ADM) makes TFL automation achievable. In this article, we dive deeper into how this works by looking at each component of the workflow below:

1. TFL Designer
Developed by Clymb Clinical, TFL Designer is a modern, web-based interface for designing TFL shells and capturing structured metadata. It offers point-and-click functionality, an intuitive UI, built-in security, user roles, and many more features that ease shell creation and management. Importantly, it captures metadata such as:
- population sets
- data subsets/“where clauses”
- grouping variables
- definitions of the statistical operations to be performed
Once shells are designed and metadata captured, two key metadata components are exported, which will be explained in more detail later:
i. The explicitly captured metadata mentioned above is structured in the backend into the format specified by the Analysis Results Standard (ARS) and exported as either a JSON or an Excel file.
ii. The information implicit in the shells’ layout (row and column names, headers and footers, labels, etc.) is captured as Analysis Display Metadata (ADM) and exported as a JSON file.
2. Analysis Results Standard (ARS) Metadata
The Analysis Results Standard (ARS), developed by CDISC and published in April 2024, defines a Logical Data Model that links all metadata components needed to describe the process for generating results related to a Reporting Event (e.g., a Clinical Study Report [CSR]). Metadata is primarily structured in JSON format but can also be represented in Excel for ease of use. While ARS contains many components, four are particularly essential for defining how results are generated:
i. Analysis Sets: Each result in an output is subject to an analysis set (e.g. Safety Population). ARS metadata typically describes the corresponding “where clause” (e.g., SAFFL = "Y").
ii. Data Subsets: Further data subsets are specified as needed. For example, an analysis of Treatment-emergent Adverse Events would likely have a “where clause” such as TEAEFL = "Y" in the ARS metadata.
iii. Analysis Grouping: Datasets (typically ADaM datasets) are grouped by one or more variables to produce the desired result (e.g., grouping by Treatment variables for tables where Treatment is displayed across columns).
iv. Analysis Methods: Once datasets are filtered and grouped, analysis operations are performed on the data. These are captured in the Analysis Methods component of the ARS metadata. Examples include summary statistics such as mean or median, comparative statistics such as p-values, and counts and percentages for categorical variables.
The ARS metadata exported from TFL Designer contains all the necessary information for automating the generation of the results and can be ingested downstream as a JSON file.
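To make the relationship between the four components concrete, here is a minimal sketch of how one analysis might reference them by ID. The JSON below is a simplified, hypothetical stand-in: the field names, the IDs, the `TRT01A` grouping variable, and the `resolve_analysis` helper are illustrative assumptions and do not reproduce the full ARS model or an actual TFL Designer export.

```python
import json

# Simplified, hypothetical stand-in for ARS metadata. Field names and IDs are
# illustrative only and do not reproduce the full ARS logical data model.
ARS_JSON = """
{
  "analysisSets": [
    {"id": "AnalysisSet_01", "label": "Safety Population",
     "where": {"variable": "SAFFL", "comparator": "EQ", "value": "Y"}}
  ],
  "dataSubsets": [
    {"id": "DataSubset_01", "label": "Treatment-emergent Adverse Events",
     "where": {"variable": "TEAEFL", "comparator": "EQ", "value": "Y"}}
  ],
  "groupings": [
    {"id": "Grouping_01", "variable": "TRT01A"}
  ],
  "methods": [
    {"id": "Method_01", "operations": ["n", "pct"]}
  ],
  "analyses": [
    {"id": "An_01", "dataset": "ADAE",
     "analysisSetId": "AnalysisSet_01",
     "dataSubsetId": "DataSubset_01",
     "groupingIds": ["Grouping_01"],
     "methodId": "Method_01"}
  ]
}
"""


def index_by_id(items):
    """Build an id -> item lookup for one metadata component."""
    return {item["id"]: item for item in items}


def resolve_analysis(ars, analysis_id):
    """Follow the ID references of one analysis to its four components."""
    analysis = index_by_id(ars["analyses"])[analysis_id]
    groupings = index_by_id(ars["groupings"])
    return {
        "dataset": analysis["dataset"],
        "analysis_set": index_by_id(ars["analysisSets"])[analysis["analysisSetId"]],
        "data_subset": index_by_id(ars["dataSubsets"])[analysis["dataSubsetId"]],
        "groupings": [groupings[g] for g in analysis["groupingIds"]],
        "method": index_by_id(ars["methods"])[analysis["methodId"]],
    }


ars = json.loads(ARS_JSON)
resolved = resolve_analysis(ars, "An_01")
print(resolved["analysis_set"]["where"])
print(resolved["method"]["operations"])
```

The point of the sketch is the linkage: an analysis carries no filtering or statistical logic itself, only references to an analysis set, a data subset, groupings, and a method that are defined once and can be reused.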
3. siera R Package
The exported ARS metadata is ingested by an “automation engine” for downstream automation. One such engine is the siera R package, which takes ARS metadata as input and, instead of generating the results directly, meta-programs R scripts (one R script per output defined in the ARS metadata). Each generated R script contains all the code needed to produce the results for the relevant output, based on the ARS definitions of Analysis Sets, Data Subsets, Analysis Groupings, and Analysis Methods. The R script also references the applicable dataset (typically an ADaM dataset, as specified in the ARS metadata) on which the results will be generated, and includes a call to load that dataset. The main function of the siera package is called readARS, and it takes three parameters:
- path to the ARS JSON file
- location where R scripts should be produced
- path to the folder containing ADaM datasets
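The idea of meta-programming (writing scripts rather than results) can be sketched in a few lines. The example below is not siera's implementation and does not generate R code: it is a hypothetical Python illustration whose `write_scripts` function, template, and metadata fields are all assumptions, chosen only to mirror the three inputs listed above (metadata, script output location, dataset folder).

```python
import tempfile
from pathlib import Path

# Hypothetical script template; each generated script is plain, inspectable code.
SCRIPT_TEMPLATE = '''\
# Auto-generated for output {output_id}; regenerate instead of editing by hand.
import csv

with open({dataset_path!r}, newline="") as handle:
    rows = list(csv.DictReader(handle))

# Apply analysis set, then data subset
rows = [r for r in rows if r[{set_var!r}] == {set_val!r}]
rows = [r for r in rows if r[{subset_var!r}] == {subset_val!r}]
print(len(rows))
'''


def write_scripts(outputs, script_dir, data_dir):
    """Write one script per output and return the paths of the generated files."""
    script_dir = Path(script_dir)
    script_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for out in outputs:
        source = SCRIPT_TEMPLATE.format(
            output_id=out["id"],
            dataset_path=str(Path(data_dir) / (out["dataset"] + ".csv")),
            set_var=out["set_var"],
            set_val=out["set_val"],
            subset_var=out["subset_var"],
            subset_val=out["subset_val"],
        )
        # Fail early if the generated text is not syntactically valid code.
        compile(source, out["id"], "exec")
        path = script_dir / (out["id"] + ".py")
        path.write_text(source)
        written.append(path)
    return written


# Hypothetical, already-simplified metadata for a single output.
outputs = [
    {"id": "Out_01", "dataset": "ADAE",
     "set_var": "SAFFL", "set_val": "Y",
     "subset_var": "TEAEFL", "subset_val": "Y"},
]

with tempfile.TemporaryDirectory() as tmp:
    paths = write_scripts(outputs, Path(tmp) / "scripts", Path(tmp) / "adam")
    generated = paths[0].read_text()

print(generated.splitlines()[0])
```

Note that the dataset folder is only written into the generated script as a path; the generator never opens the data itself, which matches the separation between script generation and script execution described in this workflow.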
4. ADaM datasets
ADaM datasets serve as the foundation for generating the results for TFLs. The data transformations and calculations outlined in the ARS metadata are intended to be performed on ADaM datasets (which are specified within the metadata itself). Although the ADaM datasets are not directly ingested by the siera package, their paths are specified so that the auto-generated R scripts can load them.
5. Auto-generated R Scripts
The R scripts produced by the siera package contain ready-to-run R code that generates results as specified in the ARS metadata. These scripts are inspectable, promoting transparency throughout the results generation process. Each R script produces one Analysis Results Dataset (ARD) corresponding to a single output. The structure of the R scripts is consistent regardless of the output, and follows the same pattern for each analysis in the ARS metadata:
- Apply analysis set
- Apply data subset(s)
- Apply analysis groupings
- Apply analysis method
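The four steps above can be sketched on a toy dataset. This is a Python illustration of the pattern, not the R code that siera generates. The records are invented, `TRT01A` is an assumed grouping variable, and using the analysis-set subject count per group as the percentage denominator is an assumption made for this example.

```python
# Toy ADAE-like records; all values are invented for illustration only.
adae = [
    {"USUBJID": "01", "TRT01A": "Placebo", "SAFFL": "Y", "TEAEFL": "Y"},
    {"USUBJID": "02", "TRT01A": "Placebo", "SAFFL": "Y", "TEAEFL": "N"},
    {"USUBJID": "03", "TRT01A": "Active", "SAFFL": "Y", "TEAEFL": "Y"},
    {"USUBJID": "04", "TRT01A": "Active", "SAFFL": "Y", "TEAEFL": "Y"},
    {"USUBJID": "05", "TRT01A": "Active", "SAFFL": "N", "TEAEFL": "Y"},
]


def subjects_by_group(records, group_var):
    """Collect the distinct subject IDs found in each group."""
    groups = {}
    for record in records:
        groups.setdefault(record[group_var], set()).add(record["USUBJID"])
    return groups


def run_analysis(records, analysis_id, group_var="TRT01A"):
    # 1. Apply analysis set (Safety Population)
    in_set = [r for r in records if r["SAFFL"] == "Y"]
    # 2. Apply data subset (treatment-emergent events)
    subset = [r for r in in_set if r["TEAEFL"] == "Y"]
    # 3. Apply analysis grouping (by treatment)
    denominators = subjects_by_group(in_set, group_var)
    numerators = subjects_by_group(subset, group_var)
    # 4. Apply analysis method (count and percentage), one result per row
    ard = []
    for group in sorted(denominators):
        n = len(numerators.get(group, set()))
        pct = round(100 * n / len(denominators[group]), 1)
        ard.append({"analysisID": analysis_id, "operationID": "n",
                    group_var: group, "value": n})
        ard.append({"analysisID": analysis_id, "operationID": "pct",
                    group_var: group, "value": pct})
    return ard


ard = run_analysis(adae, "An_01")
for row in ard:
    print(row)
```

Each returned row holds a single result together with the identifiers needed to trace it back to the analysis, the operation, and the group that produced it.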
6. Analysis Results Dataset (ARD)
Running the R scripts yields one ARD per R script (that is, one per output). Each ARD is structured with one result per row, using a primary key composed of analysisID, operationID, and the grouping variable(s). Additional metadata related to the result can also be added to the ARD as desired (e.g., the Analysis Set used, Data Subsets, output ID, etc.).
These attributes are, however, all traceable from the ARS metadata via the primary key described above. The benefits of the ARD structure are that the results are stored in a structured format (as opposed to, e.g., table layouts, which could differ in structure for the same results), are machine-readable (unlike static RTF/PDF outputs), and are reusable. As a result, ARDs can:
- support meta-analyses
- serve as structured input for GenAI models
- be reformatted into various TFL output layouts
7. Reformatting ARDs into Display Outputs
To reformat the ARD (containing all the results for a specific output) into the correct layout, display metadata is required to define elements such as column structure, row labels, titles, headers, and footnotes. This information is referred to as Analysis Display Metadata (ADM). In our TFL Designer workflow, the ADM is a JSON file containing this metadata, which is configured in the backend of the software based on the shell design the user specifies in the point-and-click interface. Being a structured metadata file, it contains positions for the actual results (as opposed to the “XX.XX” placeholders in shells). Results from the ARDs are programmatically inserted into the ADM, creating a combined JSON file that contains both display metadata and populated results. This file can be imported into and rendered by TFL Designer (or TFL Viewer), producing the desired output with results.
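The insertion step can be pictured as a keyed lookup: each result position in the display metadata names the analysis, operations, and group it needs, and the matching ARD rows fill it in. The sketch below is a hypothetical illustration only. The ADM structure, its field names, the table title, the `TRT01A` grouping variable, and the `populate` function are assumptions, not the actual ADM format used by TFL Designer.

```python
import copy
import json

# Hypothetical display metadata: layout plus empty positions for results.
adm = {
    "title": "Summary of Treatment-Emergent Adverse Events",
    "columns": ["Active", "Placebo"],
    "rows": [
        {"label": "Subjects with at least one TEAE, n (%)",
         "cells": [
             {"analysisID": "An_01", "group": "Active",
              "operations": ["n", "pct"], "format": "{n} ({pct:.1f})", "value": None},
             {"analysisID": "An_01", "group": "Placebo",
              "operations": ["n", "pct"], "format": "{n} ({pct:.1f})", "value": None},
         ]},
    ],
}

# ARD with one result per row; values are invented for illustration.
ard = [
    {"analysisID": "An_01", "operationID": "n", "TRT01A": "Active", "value": 2},
    {"analysisID": "An_01", "operationID": "pct", "TRT01A": "Active", "value": 100.0},
    {"analysisID": "An_01", "operationID": "n", "TRT01A": "Placebo", "value": 1},
    {"analysisID": "An_01", "operationID": "pct", "TRT01A": "Placebo", "value": 50.0},
]


def populate(display, results, group_var="TRT01A"):
    """Return a copy of the display metadata with ARD results inserted."""
    index = {(r["analysisID"], r["operationID"], r[group_var]): r["value"]
             for r in results}
    if len(index) != len(results):
        raise ValueError("ARD primary key is not unique")
    filled = copy.deepcopy(display)
    for row in filled["rows"]:
        for cell in row["cells"]:
            values = {op: index[(cell["analysisID"], op, cell["group"])]
                      for op in cell["operations"]}
            cell["value"] = cell["format"].format(**values)
    return filled


combined = populate(adm, ard)
print(json.dumps(combined["rows"][0]["cells"], indent=2))
```

Because the lookup uses the ARD primary key, the same ARD could populate a differently arranged display simply by supplying different display metadata.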
8. Final Output Rendering
Importing the JSON file with ADM and ARD values into TFL Designer renders the actual output/display. (Note: TFL Viewer, currently under development by Clymb Clinical, will extend this functionality to managing generated outputs with a familiar user experience.) The outputs can then be exported as either RTF or PDF, completing the TFL generation workflow.
As shown, TFL automation can be achieved using a foundational CDISC standard (ARS), open-source technology (the siera R package), and modern, web-based software (TFL Designer). We are excited to continue this development, enabling more teams to streamline their TFL generation processes, with reduced cost, increased efficiency, and higher quality as a result. To learn more, visit https://clymbclinical.com/siera/.