1 Introduction
1.1 Purpose
1.1.1 Introduction
This chapter introduces the Analysis Data Model (ADaM) Structure for Occurrence Data (OCCDS).
From a programmer’s viewpoint, OCCDS is a specialized data structure within CDISC standards designed for analyzing “occurrence data,” which involves counting subjects based on specific records or terms, often within a hierarchical dictionary coding system. It is primarily used for adverse events, concomitant medications, and medical history data.
Key Takeaways for Programmers:
- Purpose & Scope: OCCDS is distinct from the ADaM Basic Data Structure (BDS). While BDS uses PARAM and AVAL (Parameter and Analysis Value) for analysis, OCCDS is specifically for data where occurrences are counted, and AVAL or AVALC are not needed.
This means your SAS programmers will focus on counting logic rather than deriving numeric analysis values.
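The counting logic described above can be sketched as follows. This is a minimal illustration in Python rather than SAS, and the records are made-up data, not taken from any study. It contrasts the number of events with the number of subjects who had at least one event, which is the count most occurrence tables report.

```python
from collections import Counter

# Hypothetical ADAE-style records: one row per recorded occurrence.
adae = [
    {"USUBJID": "S01", "AEDECOD": "Headache"},
    {"USUBJID": "S01", "AEDECOD": "Headache"},  # same subject, second episode
    {"USUBJID": "S02", "AEDECOD": "Headache"},
    {"USUBJID": "S02", "AEDECOD": "Nausea"},
]

# Number of events per preferred term: every row counts.
event_counts = Counter(rec["AEDECOD"] for rec in adae)

# Number of subjects per preferred term: each subject counts once per term.
unique_pairs = {(rec["USUBJID"], rec["AEDECOD"]) for rec in adae}
subject_counts = Counter(decod for _, decod in unique_pairs)
```

No numeric analysis value is derived at any point; both results are plain counts over rows or over distinct subject identifiers.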
- Dictionary Reliance: Occurrence data heavily relies on coding dictionaries (e.g., MedDRA, WHO Drug) with structured hierarchies.
A dictionary is often used for coding the occurrence and typically includes a well-structured hierarchy of categories and terminology. Remapping this hierarchy to BDS variables PARAM and generic *CAT variables would lose the structure and meaning of the dictionary. Per the SDTM Implementation Guide (SDTMIG) v3.3 (https://www.cdisc.org/standards/foundational/sdtmig), a dictionary is expected for adverse events (ADAE) and concomitant medications () and recommended for medical history. Although not as common, clinical events, procedures, and substance use may also be coded.
Data for a particular study that could have been coded but was not should still use this structure, because the analysis results are similar and the analysis programming can work the same way. For example, medical history data might be coded in one study and not coded in another, and yet the analysis tables look very similar.
Programmers should be prepared to handle and leverage this hierarchy for analysis, as remapping it to generic BDS variables would lead to loss of meaning.
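One way to picture why the hierarchy matters is to count subjects at each dictionary level separately. The sketch below is an assumption-laden illustration: the data are invented, `subjects_by_level` is a hypothetical helper, and only two levels (body system and preferred term) are shown.

```python
from collections import defaultdict

# Hypothetical coded adverse event records.
adae = [
    {"USUBJID": "S01", "AEBODSYS": "Nervous system disorders", "AEDECOD": "Headache"},
    {"USUBJID": "S01", "AEBODSYS": "Nervous system disorders", "AEDECOD": "Dizziness"},
    {"USUBJID": "S02", "AEBODSYS": "Nervous system disorders", "AEDECOD": "Headache"},
    {"USUBJID": "S02", "AEBODSYS": "Gastrointestinal disorders", "AEDECOD": "Nausea"},
]

def subjects_by_level(records, keys):
    """Count distinct subjects for each combination of the given variables."""
    groups = defaultdict(set)
    for rec in records:
        groups[tuple(rec[k] for k in keys)].add(rec["USUBJID"])
    return {level: len(ids) for level, ids in groups.items()}

# A subject with two preferred terms in one body system counts once at that level.
soc_counts = subjects_by_level(adae, ["AEBODSYS"])
pt_counts = subjects_by_level(adae, ["AEBODSYS", "AEDECOD"])
```

Because the dictionary variables are kept as separate columns, the same helper works at any level; collapsing them into a single PARAM-like value would make the body-system count harder to recover.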
- Data Integrity: The raw content of occurrence data (like dictionary terms) is typically not modified for analysis.
This implies that your programming efforts will largely involve adding analysis-specific attributes and flags rather than altering core descriptive data.
- Relationship to SDTM: OCCDS is built upon the Study Data Tabulation Model (SDTM) Implementation Guide (SDTMIG) v3.3 nomenclature. The primary source for an OCCDS dataset is typically an SDTM domain and its Supplemental Qualifier (SUPP--) dataset, with additional variables from the Subject-Level Analysis Dataset (ADSL).
If SDTM data contain --OCCUR = “N” records that are not needed for analysis or denominators, those records may be excluded from the OCCDS analysis dataset (Note: This example does not apply to the ADVERSE EVENT subclass defined in Section 3.1.2, SubClass ADVERSE EVENT).
If the topic (e.g., an adverse event, concomitant medication) spans several treatment periods and needs to be counted in each, a separate row might be required for each treatment period spanned and analyzed, depending on the analysis need.
This means traceability back to SDTM is crucial for programming.
- Record Counts: While generally, an OCCDS dataset has one record per record in the corresponding SDTM domain, there are exceptions. Programmers should account for scenarios where records might be excluded (e.g., screen failures, --OCCUR = "N" records), or where multiple records are generated for a single SDTM record (e.g., an event spanning multiple treatment periods, or requiring analysis along multiple coding paths).
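The two exceptions named above (dropping --OCCUR = "N" records and repeating a record for each treatment period it spans) can be sketched like this. Everything here is illustrative: the period boundaries, the records, and the `build_occds_rows` helper are assumptions, and the start and end dates are taken as already derived.

```python
from datetime import date

# Hypothetical treatment periods: (APERIOD, start date, end date).
periods = [
    (1, date(2024, 1, 1), date(2024, 1, 31)),
    (2, date(2024, 2, 1), date(2024, 2, 29)),
]

# Hypothetical concomitant medication records with dates already derived.
cm = [
    {"USUBJID": "S01", "CMSEQ": 1, "CMOCCUR": "Y",
     "ASTDT": date(2024, 1, 20), "AENDT": date(2024, 2, 10)},
    {"USUBJID": "S01", "CMSEQ": 2, "CMOCCUR": "N",
     "ASTDT": None, "AENDT": None},
    {"USUBJID": "S01", "CMSEQ": 3, "CMOCCUR": "Y",
     "ASTDT": date(2024, 2, 5), "AENDT": date(2024, 2, 6)},
]

def build_occds_rows(records, periods):
    """Skip CMOCCUR = "N" records and emit one row per overlapping period."""
    rows = []
    for rec in records:
        if rec["CMOCCUR"] == "N":
            continue  # not needed for analysis in this sketch
        for aperiod, start, end in periods:
            # Interval overlap test between the record and the period.
            if rec["ASTDT"] <= end and rec["AENDT"] >= start:
                rows.append({**rec, "APERIOD": aperiod})
    return rows

adcm = build_occds_rows(cm, periods)
```

Each output row keeps CMSEQ, so it can still be traced back to its source record even when one record becomes two rows. A record that overlaps no period is dropped by this sketch; a real derivation would need an explicit rule for that case.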
This does not mean that all categorical data are appropriate for OCCDS.
More standard categorical data that would never be mapped to a hierarchical dictionary, such as questionnaire responses, fit nicely in BDS and should not use OCCDS.
Typically, findings data fit nicely into BDS, and events and interventions fit nicely into OCCDS.
However, this is not always the case: Exposure data, from an interventions SDTM structure, is quite often analyzed in BDS because that analysis does not simply count records, although there could be an OCCDS intermediate dataset used to help derive those BDS summary parameters. In all cases, it is the combination of input data and analysis needs that determines the dataset structure required.
Metadata Class: Datasets using the OCCDS structure are assigned a metadata class of “OCCURRENCE DATA STRUCTURE”. This impacts how you define these datasets in metadata files like Define-XML.
Updates (v1.1): Version 1.1 introduces an “ADVERSE EVENT” subclass, new row identifier variables (e.g., SRCDOM, SRCSEQ) for stacked data, and a “U” prefix convention for unmodified SDTM variables that are stacked from multiple domains (e.g., UBODSYS instead of AEBODSYS or MHBODSYS). New treatment-emergent flags and an ADECODy variable for grouping preferred terms are also included.
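As a rough illustration of the stacking convention, the sketch below combines records from two domains, records their origin in SRCDOM and SRCSEQ, and swaps the domain prefix for “U”. The `stack_domains` helper and the data are hypothetical, and applying the prefix to every domain-prefixed variable (giving UDECOD here) is an assumption extrapolated from the UBODSYS example.

```python
# Hypothetical source records from two SDTM-like domains.
ae = [{"USUBJID": "S01", "AESEQ": 1,
       "AEBODSYS": "Nervous system disorders", "AEDECOD": "Headache"}]
mh = [{"USUBJID": "S01", "MHSEQ": 1,
       "MHBODSYS": "Vascular disorders", "MHDECOD": "Hypertension"}]

def stack_domains(domains):
    """Stack records from several domains into one list of rows."""
    stacked = []
    for domain, records in domains.items():
        seq_var = f"{domain}SEQ"
        for rec in records:
            # Row identifiers pointing back to the source domain and record.
            row = {"USUBJID": rec["USUBJID"], "SRCDOM": domain,
                   "SRCSEQ": rec[seq_var]}
            for name, value in rec.items():
                if name.startswith(domain) and name != seq_var:
                    # Replace the domain prefix with "U" (assumed convention).
                    row["U" + name[len(domain):]] = value
            stacked.append(row)
    return stacked

stacked = stack_domains({"AE": ae, "MH": mh})
```

With a shared UBODSYS column, one summary program can process both sources, while SRCDOM and SRCSEQ preserve the link to the original record.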
The document emphasizes that variable ordering is logical but not strictly mandated. Programmers should ensure datasets are “analysis-ready,” meaning they contain all variables necessary for replication of statistical tests.
Denominator counts for analyses typically come from ADSL, not the OCCDS dataset itself, as the OCCDS might not include all subjects (e.g., those without occurrences).
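The denominator point can be made concrete with a small sketch. The data and the `incidence_by_arm` helper are hypothetical, and TRT01A is used as the treatment variable purely for illustration. Subject S04 has no adverse event and so appears only in ADSL.

```python
from collections import Counter

# Hypothetical subject-level data: every subject in the population appears once.
adsl = [
    {"USUBJID": "S01", "TRT01A": "Drug"},
    {"USUBJID": "S02", "TRT01A": "Drug"},
    {"USUBJID": "S03", "TRT01A": "Placebo"},
    {"USUBJID": "S04", "TRT01A": "Placebo"},  # no events, absent from ADAE
]

# Hypothetical occurrence data: only subjects with events appear.
adae = [
    {"USUBJID": "S01", "AEDECOD": "Headache"},
    {"USUBJID": "S02", "AEDECOD": "Nausea"},
    {"USUBJID": "S03", "AEDECOD": "Headache"},
]

def incidence_by_arm(adsl, occds, arm_var="TRT01A"):
    """Return (subjects with event, subjects in arm, percent) per arm."""
    arm_of = {rec["USUBJID"]: rec[arm_var] for rec in adsl}
    denominators = Counter(arm_of.values())              # from ADSL
    with_event = {rec["USUBJID"] for rec in occds}       # distinct subjects
    numerators = Counter(arm_of[subj] for subj in with_event)
    return {arm: (numerators[arm], n, 100.0 * numerators[arm] / n)
            for arm, n in denominators.items()}

incidence = incidence_by_arm(adsl, adae)
```

Had the denominator been taken from the occurrence data instead, the Placebo arm would show one subject out of one rather than one out of two.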