bridgehead/ccp/modules/exporter.md
2025-06-24 13:02:43 +02:00

Exporter and Reporter


Exporter

GitHub: https://github.com/samply/exporter

The Exporter is a REST API that enables the export of data from various bridgehead databases as structured tables. It currently supports only FHIR sources such as Blaze, but it is designed to be extended to other types of data sources. The Exporter provides multiple output formats, including CSV, Excel, JSON, and XML, and can also export data directly into Opal (DataSHIELD).

How it works

The user submits a query and specifies the desired export template and output format. The query acts like the WHERE clause in SQL, filtering data, while the template defines what data to select and how to format it, similar to the SELECT clause. The Exporter then processes this to generate the export files.
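The WHERE/SELECT analogy can be sketched in a few lines of Python. This is only an illustration of the idea, not the Exporter's implementation: the in-memory `patients` records and the `export` helper are hypothetical stand-ins for FHIR resources and the real processing pipeline.

```python
# Hypothetical records standing in for resources in a FHIR store.
patients = [
    {"id": "p1", "gender": "female", "birth_year": 1970},
    {"id": "p2", "gender": "male", "birth_year": 1985},
    {"id": "p3", "gender": "female", "birth_year": 1992},
]


def export(records, query, template):
    """Keep records matching `query` (WHERE) and only the `template` columns (SELECT)."""
    return [
        {column: record[column] for column in template}
        for record in records
        if query(record)
    ]


rows = export(
    patients,
    query=lambda record: record["gender"] == "female",
    template=["id", "birth_year"],
)
```

Here the query decides which rows appear and the template decides which columns each row carries.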

Environment Variables

Below is a list of configurable environment variables used by the Exporter:

| Variable | Default | Description |
|---|---|---|
| `APPLICATION_PORT` | `8092` | Port on which the application runs. |
| `ARCHIVE_EXPIRED_QUERIES_CRON_EXPRESSION` | `0 0 2 * * *` | Cron expression for archiving expired queries. |
| `CLEAN_TEMP_FILES_CRON_EXPRESSION` | `0 0 1 * * *` | Cron expression for cleaning temporary files. |
| `CLEAN_WRITE_FILES_CRON_EXPRESSION` | `0 0 2 * * *` | Cron expression for cleaning written files. |
| `CONVERTER_TEMPLATE_DIRECTORY` | | Directory containing conversion templates. |
| `CONVERTER_XML_APPLICATION_CONTEXT_PATH` | | Path to the XML application context used by the converter. |
| `CROSS_ORIGINS` | | Allowed CORS origins (comma-separated). |
| `CSV_SEPARATOR_REPLACEMENT` | | Character to replace CSV separators within values. |
| `EXCEL_WORKBOOK_WINDOW` | `30000000` | Memory window size for Excel workbook processing. |
| `EXPORTER_API_KEY` | | API key for authenticating access to the exporter. |
| `EXPORTER_DB_FLYWAY_MIGRATION_ENABLED` | `true` | Enable Flyway DB migrations on startup. |
| `EXPORTER_DB_PASSWORD` | | Password for exporter database. |
| `EXPORTER_DB_URL` | `jdbc:postgresql://localhost:5432/exporter` | JDBC URL for exporter DB. |
| `EXPORTER_DB_USER` | | Username for exporter DB. |
| `FHIR_PACKAGES_DIRECTORY` | | Directory where FHIR packages are stored. |
| `HAPI_FHIR_CLIENT_LOG_LEVEL` | `OFF` | Log level for HAPI FHIR client. |
| `HIBERNATE_LOG` | `false` | Enable Hibernate SQL logging. |
| `HTTP_RELATIVE_PATH` | | Relative base path for HTTP endpoints. |
| `HTTP_SERVLET_REQUEST_SCHEME` | `http` | Default HTTP scheme. |
| `LOG_FHIR_VALIDATION` | | Enable logging of FHIR validation results. |
| `LOG_LEVEL` | `INFO` | Application log level. |
| `MAX_NUMBER_OF_EXCEL_ROWS_IN_A_SHEET` | `100000` | Max rows per Excel sheet. |
| `MAX_NUMBER_OF_RETRIES` | `10` | Max retry attempts. |
| `MERGE_FILENAME` | | Name of merged output file. |
| `SITE` | | Site identifier for filenames/logs. |
| `TEMP_FILES_LIFETIME_IN_DAYS` | `1` | Lifetime of temporary files (days). |
| `TEMPORAL_FILE_DIRECTORY` | | Directory for temporary files. |
| `TIMEOUT_IN_SECONDS` | `10` | Default timeout (seconds). |
| `TIMESTAMP_FORMAT` | | Timestamp format string. |
| `WEBCLIENT_BUFFER_SIZE_IN_BYTES` | `8192` | Buffer size for web client. |
| `WEBCLIENT_CONNECTION_TIMEOUT_IN_SECONDS` | `5` | Connection timeout (seconds). |
| `WEBCLIENT_MAX_NUMBER_OF_RETRIES` | `10` | Max retries for web client. |
| `WEBCLIENT_REQUEST_TIMEOUT_IN_SECONDS` | `10` | Request timeout (seconds). |
| `WEBCLIENT_TCP_KEEP_CONNECTION_NUMBER_OF_TRIES` | `3` | TCP keepalive retry attempts. |
| `WEBCLIENT_TCP_KEEP_IDLE_IN_SECONDS` | `30` | TCP keepalive idle time (seconds). |
| `WEBCLIENT_TCP_KEEP_INTERVAL_IN_SECONDS` | `10` | TCP keepalive probe interval (seconds). |
| `WEBCLIENT_TIME_IN_SECONDS_AFTER_RETRY_WITH_FAILURE` | `1` | Wait time after failed retry (seconds). |
| `WRITE_FILE_DIRECTORY` | | Directory for final output files. |
| `WRITE_FILES_LIFETIME_IN_DAYS` | `30` | Lifetime of written files (days). |
| `XML_FILE_MERGER_ROOT_ELEMENT` | `Containers` | Root element for XML file merging. |
| `ZIP_FILENAME` | `exporter-files-${SITE}-${TIMESTAMP}.zip` | Pattern for ZIP archive naming. |
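
To illustrate how a filename pattern such as the `ZIP_FILENAME` default expands, the following sketch substitutes `${SITE}` and `${TIMESTAMP}` with Python's `string.Template`, which happens to use the same `${NAME}` placeholder syntax. The site name and timestamp value are made up for the example, and the Exporter's own substitution logic may differ.

```python
from string import Template


def render_filename(pattern: str, site: str, timestamp: str) -> str:
    """Fill the ${SITE} and ${TIMESTAMP} placeholders of a filename pattern."""
    return Template(pattern).substitute(SITE=site, TIMESTAMP=timestamp)


# "berlin" and the timestamp below are hypothetical values.
zip_name = render_filename(
    "exporter-files-${SITE}-${TIMESTAMP}.zip", site="berlin", timestamp="20250624-1302"
)
```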

About Cron Expressions in Spring

Cron expressions configure scheduled tasks and consist of six space-separated fields representing second, minute, hour, day of month, month, and day of week. For example, the default 0 0 2 * * * means “at 2:00 AM every day.” These expressions allow precise scheduling for maintenance tasks such as cleaning files or archiving data.
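
As a small aid for reading these expressions, the sketch below splits a six-field expression into named fields. It only labels the fields in the order described above; it does not validate values or compute run times, and the helper name `split_cron` is made up for this example.

```python
FIELD_NAMES = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict:
    """Map each of the six space-separated cron fields to its name."""
    fields = expression.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(fields)}")
    return dict(zip(FIELD_NAMES, fields))


nightly = split_cron("0 0 2 * * *")
# nightly["hour"] is "2"; second and minute are "0"; the remaining fields are "*".
```

A five-field expression in the classic Unix style (without the leading seconds field) is rejected by this helper.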


Exporter-DB

GitHub: https://github.com/samply/exporter-db (repository location to be confirmed)

The Exporter-DB stores queries for execution by the Exporter and tracks multiple executions of the same query, managing versioning and scheduling.


Reporter

GitHub: https://github.com/samply/reporter

The Reporter is a plugin for the Exporter designed for generating complex Excel reports based on customizable templates. It supports various template engines like Groovy and Thymeleaf, making it ideal for producing detailed documents such as the traditional CCP data quality report.


Exporter Templates

An exporter template describes the structure and content of the export output.

Main Elements

  • converter: Defines the export job, specifying output files and data sources.
  • container: Represents a logical grouping of data rows (like a table).
  • attribute: Defines individual data fields/columns extracted from the data source.

Other Elements

  • cql: Contains Clinical Quality Language metadata used to enrich or filter data.
  • fhir-rev-include: Defines FHIR reverse includes to fetch related resources.
  • fhir-package: (To be detailed)
  • fhir-terminology-server: (To be detailed)

Example Snippet

<converter id="ccp" excel-filename="Export-${SITE}-${TIMESTAMP}.xlsx" source-id="blaze-store" >
  <container id="Patient" csv-filename="Patient-${SITE}-${TIMESTAMP}.csv" excel-sheet="Patient" xml-filename="Patient-${SITE}-${TIMESTAMP}.xml" xml-root-element="Patients" xml-element="Patient" json-filename="Patient-${SITE}-${TIMESTAMP}.json" json-key="Patients" >
    <attribute id="Patient-ID" default-name="PatientID" val-fhir-path="Patient.id.value" anonym="Pat" op="EXTRACT_RELATIVE_ID"/>

    <attribute default-name="DKTKIDGlobal" val-fhir-path="Patient.identifier.where(type.coding.code = 'Global').value.value"/>
    <attribute default-name="DKTKIDLokal" val-fhir-path="Patient.identifier.where(type.coding.code = 'Lokal').value.value" />
    <attribute default-name="DateOfBirth" val-fhir-path="Patient.birthDate.value.toString().substring(0, 4) + '-01-01'"/>
    <attribute default-name="Gender" val-fhir-path="Patient.gender.value" />
  </container>

  <container id="Diagnosis" csv-filename="Diagnosis-${SITE}-${TIMESTAMP}.csv" excel-sheet="Diagnosis" xml-filename="Diagnosis-${SITE}-${TIMESTAMP}.xml" xml-root-element="Diagnoses" xml-element="Diagnosis" json-filename="Diagnosis-${SITE}-${TIMESTAMP}.json" json-key="Diagnoses">
    <attribute id="Diagnosis-ID" default-name="DiagnosisID" val-fhir-path="Condition.id.value" anonym="Dia" op="EXTRACT_RELATIVE_ID"/>
    <attribute id="Patient-ID" link="Patient.Patient-ID" default-name="PatientID" val-fhir-path="Condition.subject.reference.value" anonym="Pat"/>

    <attribute default-name="ICD10Code" val-fhir-path="Condition.code.coding.code.value"/>
    <attribute default-name="ICDOTopographyCode" val-fhir-path="Condition.bodySite.coding.where(system = 'urn:oid:2.16.840.1.113883.6.43.1').code.value"/>
    <attribute default-name="LocalizationSide" val-fhir-path="Condition.bodySite.coding.where(system = 'http://dktk.dkfz.de/fhir/onco/core/CodeSystem/SeitenlokalisationCS').code.value"/>
  </container>

  <container id="Histology" csv-filename="Histology-${SITE}-${TIMESTAMP}.csv" excel-sheet="Histology" xml-filename="Histology-${SITE}-${TIMESTAMP}.xml" xml-root-element="Histologies" xml-element="Histology" json-filename="Histology-${SITE}-${TIMESTAMP}.json" json-key="Histologies" >
    <attribute id="Histology-ID" default-name="HistologyID" val-fhir-path="Observation.where(code.coding.code = '59847-4').id" anonym="His" op="EXTRACT_RELATIVE_ID"/>
    <attribute id="Diagnosis-ID" link="Diagnosis.Diagnosis-ID" default-name="DiagnosisID" val-fhir-path="Observation.where(code.coding.code = '59847-4').focus.reference.value" anonym="Dia"/>
    <attribute id="Patient-ID" link="Patient.Patient-ID" default-name="PatientID" val-fhir-path="Observation.where(code.coding.code = '59847-4').subject.reference.value" anonym="Pat" />

    <attribute default-name="ICDOMorphologyCode" val-fhir-path="Observation.where(code.coding.code = '59847-4').value.coding.code.value"/>
    <attribute default-name="Grading" val-fhir-path="Observation.where(code.coding.code = '59542-1').value.coding.code.value" join-fhir-path="Observation.where(code.coding.code = '59847-4').hasMember.reference.value"/>
  </container>

  <container id="Radiation-Therapy" csv-filename="RadiationTherapy-${SITE}-${TIMESTAMP}.csv" excel-sheet="RadiationTherapy" xml-filename="RadiationTherapy-${SITE}-${TIMESTAMP}.xml" xml-root-element="Radiation-Therapies" xml-element="Radiation-Therapy" json-filename="RadiationTherapy-${SITE}-${TIMESTAMP}.json" json-key="Radiation Therapies">
    <attribute id="Radiation-Therapy-ID" default-name="RadiationTherapyID" val-fhir-path="Procedure.where(category.coding.code = 'ST').id" anonym="Rad" op="EXTRACT_RELATIVE_ID"/>
    <attribute id="Diagnosis-ID" link="Diagnosis.Diagnosis-ID" default-name="DiagnosisID" val-fhir-path="Procedure.where(category.coding.code = 'ST').reasonReference.reference.value" anonym="Dia"/>
    <attribute id="Patient-ID" link="Patient.Patient-ID" default-name="PatientID" val-fhir-path="Procedure.where(category.coding.code = 'ST').subject.reference.value" anonym="Pat" />

    <attribute default-name="RadiationTherapyRelationToSurgery" val-fhir-path="Procedure.extension('http://dktk.dkfz.de/fhir/StructureDefinition/onco-core-Extension-StellungZurOp').value.coding.code.value"/>
    <attribute default-name="RadiationTherapyIntention" val-fhir-path="Procedure.extension('http://dktk.dkfz.de/fhir/StructureDefinition/onco-core-Extension-SYSTIntention').value.coding.code.value" />
    <attribute default-name="RadiationTherapyStart" val-fhir-path="Procedure.where(category.coding.code = 'ST').performed.start.value"/>
    <attribute default-name="RadiationTherapyEnd" val-fhir-path="Procedure.where(category.coding.code = 'ST').performed.end.value"/>
    <attribute default-name="Nebenwirkung Grad" val-fhir-path="AdverseEvent.severity.coding.code.value" join-fhir-path="/AdverseEvent.suspectEntity.instance.reference.where(value.startsWith('Procedure')).value" />
  </container>


  <cql>
    <default-fhir-search-query>Patient</default-fhir-search-query>

    <token key="DKTK_STRAT_MEDICATION_STRATIFIER" value="define MedicationStatement:&#10;if InInitialPopulation then [MedicationStatement] else {} as List &lt;MedicationStatement&gt; &#10;" />
    <token key="DKTK_STRAT_PRIMARY_DIAGNOSIS_NO_SORT_STRATIFIER" value="define PrimaryDiagnosis:&#10;First(&#10;from [Condition] C&#10;where C.extension.where(url=&apos;http://hl7.org/fhir/StructureDefinition/condition-related&apos;).empty()) &#10;" />

    <measure-parameters>
      {
      "resourceType": "Parameters",
      "parameter": [
      {
      "name": "periodStart",
      "valueDate": "2000"
      },
      {
      "name": "periodEnd",
      "valueDate": "2030"
      },
      {
      "name": "reportType",
      "valueCode": "subject-list"
      }
      ]
      }
    </measure-parameters>
  </cql>



  <fhir-rev-include>Observation:patient</fhir-rev-include>
  <fhir-rev-include>Condition:patient</fhir-rev-include>
  <fhir-rev-include>ClinicalImpression:patient</fhir-rev-include>
  <fhir-rev-include>MedicationStatement:patient</fhir-rev-include>
  <fhir-rev-include>Procedure:patient</fhir-rev-include>
  <fhir-rev-include>Specimen:patient</fhir-rev-include>
  <fhir-rev-include>AdverseEvent:subject</fhir-rev-include>
  <fhir-rev-include>CarePlan:patient</fhir-rev-include>

</converter>
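
To get a quick overview of a template like the one above, its structure can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a shortened, hypothetical excerpt and lists the column names per container; it is an inspection aid, not part of the Exporter.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt in the style of the template above.
TEMPLATE = """
<converter id="ccp" source-id="blaze-store">
  <container id="Patient" csv-filename="Patient-${SITE}-${TIMESTAMP}.csv">
    <attribute id="Patient-ID" default-name="PatientID" val-fhir-path="Patient.id.value"/>
    <attribute default-name="Gender" val-fhir-path="Patient.gender.value"/>
  </container>
  <fhir-rev-include>Observation:patient</fhir-rev-include>
</converter>
"""


def summarize(template_xml: str) -> dict:
    """Return {container id: [attribute default-names]} for a template."""
    root = ET.fromstring(template_xml.strip())
    return {
        container.get("id"): [a.get("default-name") for a in container.findall("attribute")]
        for container in root.findall("container")
    }


summary = summarize(TEMPLATE)
```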

1. Converter

Root tag of an exporter template. It groups converters so that the Exporter can find the best chain of converters for the data conversion.

| Tag | Description |
|---|---|
| `<converter>` | Main tag for exporter template containing sources, metadata, and additional query information |

| Attribute | Description | Example | Default |
|---|---|---|---|
| `id` | ID to reference a template | `id="ccp-opal"` | |
| `default-name` | Default name when output is in a single file format (no extension; added automatically) | | |
| `ignore` | Deactivate template but keep accessible | `ignore="true"` | `false` |
| `excel-filename` | Name of the Excel output file (supports variables `${SITE}`, `${TIMESTAMP}`) | `excel-filename="Export-${SITE}-${TIMESTAMP}.xlsx"` | |
| `csv-separator` | CSV separator character | `"\t"` | |
| `source-id` | ID of the data source | `source-id="blaze-store"` | |
| `target-id` | ID of a target server for file transfer (e.g., Opal for DataSHIELD) | `target-id="opal"` | |
| `opal-project` | Opal-specific: name of project | | |
| `opal-permission-type` | Opal permission type (user or group) | | |
| `opal-permission-subjects` | Opal permission subjects | | |
| `opal-permission` | Opal permission (administrate or use) | | |

Notes:

  • You can use variables such as ${SITE}, ${TIMESTAMP}, and other environment variables within tags.
  • To define environment variables for a specific export, use the HTTP parameter CONTEXT. The value must be a Base64-encoded string containing comma-separated key-value pairs.
  • Example: Plain: KEY1=VALUE1,KEY2=VALUE2 Base64: S0VZMT1WQUxVRTEsS0VZMj1WQUxVRTI=
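
A CONTEXT value like the one in the example can be produced as follows. This sketch assumes standard (not URL-safe) Base64 over UTF-8 text, which matches the example string above; the helper names are made up for illustration.

```python
import base64


def encode_context(variables: dict) -> str:
    """Join KEY=VALUE pairs with commas and Base64-encode the result."""
    plain = ",".join(f"{key}={value}" for key, value in variables.items())
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


def decode_context(encoded: str) -> dict:
    """Reverse of encode_context, useful for checking a value before sending it."""
    plain = base64.b64decode(encoded).decode("utf-8")
    return dict(pair.split("=", 1) for pair in plain.split(","))


context = encode_context({"KEY1": "VALUE1", "KEY2": "VALUE2"})
```

Note that this simple format cannot carry values that themselves contain a comma.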

Allowed child elements:

  • <container>, <cql>, <fhir-rev-include>, <fhir-package>, <fhir-terminology-server>

2. Container

Represents a data table with columns (attributes).

| Tag | Description |
|---|---|
| `<container>` | Defines a container/table with attributes (columns) |

| Attribute | Description | Example | Default |
|---|---|---|---|
| `id` | Container ID to reference | | |
| `default-name` | Name of Excel sheet/file (no extension, added automatically) | | |
| `csv-filename` | Name of CSV file | `csv-filename="Diagnosis-${TIMESTAMP}.csv"` | |
| `json-filename` | Name of JSON file | `json-filename="diagnosis-${TIMESTAMP}.json"` | |
| `xml-filename` | Name of XML file | `xml-filename="diagnosis-${TIMESTAMP}.xml"` | |
| `xml-root-element` | Root element name in XML | `xml-root-element="diagnoses"` | |
| `xml-element` | Element name for each entry in XML | `xml-element="diagnosis"` | |
| `excel-sheet` | Excel sheet name | `excel-sheet="diagnosis-${TIMESTAMP}.xlsx"` | |
| `opal-table` | Opal table name | `opal-table="Diagnosis"` | |
| `opal-entity-type` | Opal entity type | | |

3. Attribute

Represents a column in a container/table.

| Tag | Description |
|---|---|
| `<attribute>` | Defines an attribute/column |

| Attribute | Description | Example | Default |
|---|---|---|---|
| `id` | Attribute ID | `id="Patient-ID"` | |
| `default-name` | Default name of the attribute (used if no output-specific name provided) | | |
| `link` | Reference to an attribute of another container (format: `<container-name>.<attribute-id>`) | `link="Patient.Patient-ID"` | |
| `csv-column` | Name of the CSV column | | |
| `excel-column` | Name of the Excel column | | |
| `json-key` | JSON key | | |
| `xml-element` | XML element name | | |
| `opal-value-type` | Opal-specific value type | | |
| `opal-script` | Script to be applied to the field in Opal | | |
| `primary-key` | Marks attribute as primary key | `primary-key="true"` | `false` |
| `validation` | Marks attribute as syntactic validation field (ends with -Validation in DKTK/BBMRI reporter) | `validation="true"` | `false` |
| `val-fhir-path` | FHIR path to extract value (if source is a FHIR server) | `val-fhir-path="Patient.gender.value"` | |
| `join-fhir-path` | FHIR path for joining secondary resources to main resource | `join-fhir-path="/AdverseEvent.suspectEntity.instance.reference.where(value.startsWith('Procedure')).value"` | |
| `condition-value-fhir-path` | Condition filtering for complex value extraction (FHIR path syntax) | `condition-value-fhir-path="Patient.birthDate <= today() - 18 'years'"` | |
| `anonym` | Anonymization prefix; replaces real value with anonym + number | `anonym="Pat"` | |
| `mdr` | Metadata repository ID in DKTK context | `mdr="dktk:dataelement:20:3"` | |
| `op` | Operation applied on value (e.g., EXTRACT_RELATIVE_ID) | `op="EXTRACT_RELATIVE_ID"` | |
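
The effect of `anonym` (replacing a real value with the prefix plus a number) can be pictured with the sketch below. It rests on assumptions not confirmed here: that the number is a running counter per prefix, that prefix and number are concatenated without a separator, and that the same real value always maps to the same pseudonym so that linked columns such as `Patient-ID` stay consistent across containers. The `Anonymizer` class is hypothetical.

```python
class Anonymizer:
    """Hypothetical sketch: map real values to '<prefix><number>' pseudonyms."""

    def __init__(self):
        self._pseudonyms = {}  # (prefix, real value) -> pseudonym
        self._counters = {}    # prefix -> last number handed out

    def anonymize(self, prefix: str, real_value: str) -> str:
        key = (prefix, real_value)
        if key not in self._pseudonyms:
            # First time this value is seen: assign the next number for the prefix.
            self._counters[prefix] = self._counters.get(prefix, 0) + 1
            self._pseudonyms[key] = f"{prefix}{self._counters[prefix]}"
        return self._pseudonyms[key]


anonymizer = Anonymizer()
first = anonymizer.anonymize("Pat", "a1b2")
second = anonymizer.anonymize("Pat", "c3d4")
repeat = anonymizer.anonymize("Pat", "a1b2")
diagnosis = anonymizer.anonymize("Dia", "a1b2")
```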

Notes on join-fhir-path

  • Used to join resources in FHIR queries when a container references multiple resources.
  • Two join types:
    • Direct: the main resource points to the secondary resource.
    • Indirect: the secondary resource points back to the main resource (the path begins with /).
  • Joins can chain multiple resources, e.g., R1 -> R2 -> R3, with commas separating the joins.
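
The rules above can be sketched as a small parser that splits a `join-fhir-path` into its chained steps and labels each one as direct or indirect. This is an illustration, not the Exporter's parser: it assumes that commas inside parentheses belong to FHIRPath function arguments rather than to the chain, and it does not handle commas inside quoted strings.

```python
def split_joins(join_fhir_path: str) -> list:
    """Split on commas that are not nested inside parentheses."""
    steps, current, depth = [], [], 0
    for char in join_fhir_path:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        if char == "," and depth == 0:
            steps.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    steps.append("".join(current).strip())
    return [step for step in steps if step]


def classify_joins(join_fhir_path: str) -> list:
    """Label each step: a leading '/' marks an indirect join, anything else is direct."""
    return [
        ("indirect", step[1:]) if step.startswith("/") else ("direct", step)
        for step in split_joins(join_fhir_path)
    ]


adverse_event = classify_joins(
    "/AdverseEvent.suspectEntity.instance.reference.where(value.startsWith('Procedure')).value"
)
grading = classify_joins(
    "Observation.where(code.coding.code = '59847-4').hasMember.reference.value"
)
```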


4. CQL

Contains metadata and details important for handling CQL queries.

| Tag | Description |
|---|---|
| `<cql>` | Container for CQL query metadata including tokens and parameters |

5. Token (CQL)

Replaces keys in CQL queries with specific values (commonly used for stratifiers).

| Tag | Description |
|---|---|
| `<token>` | Contains key and value attributes |

| Attribute | Description | Example |
|---|---|---|
| `key` | Key to replace in CQL | `key="DKTK_STRAT_MEDICATION_STRATIFIER"` |
| `value` | CQL code snippet that replaces key | `value="define MedicationStatement: if InInitialPopulation then [MedicationStatement] else {} as List <MedicationStatement>"` |
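
Token replacement amounts to plain text substitution in the CQL query. The sketch below assumes simple string replacement and processes longer keys first so that a key that is a prefix of another cannot clobber it; the CQL skeleton and the helper name are hypothetical.

```python
def apply_tokens(cql: str, tokens: dict) -> str:
    """Replace every token key in the CQL text with its value."""
    for key in sorted(tokens, key=len, reverse=True):  # longest keys first
        cql = cql.replace(key, tokens[key])
    return cql


tokens = {
    "DKTK_STRAT_MEDICATION_STRATIFIER": (
        "define MedicationStatement:\n"
        "if InInitialPopulation then [MedicationStatement] else {} as List <MedicationStatement>\n"
    ),
}

# Hypothetical query skeleton containing a token key as a placeholder.
query = "library Retrieve\nDKTK_STRAT_MEDICATION_STRATIFIER"
expanded = apply_tokens(query, tokens)
```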

6. Measure Parameters (CQL)

Parameters for a CQL measure query, typically in JSON format.

| Tag | Description |
|---|---|
| `<measure-parameters>` | Parameters such as periodStart, periodEnd, reportType |
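
The JSON body of `<measure-parameters>` is a FHIR `Parameters` resource. A sketch for generating it, mirroring the three parameters used in the example template, might look like this; the helper name and its defaults are assumptions for illustration.

```python
import json


def measure_parameters(period_start: str, period_end: str,
                       report_type: str = "subject-list") -> dict:
    """Build a FHIR Parameters resource with the three parameters shown above."""
    return {
        "resourceType": "Parameters",
        "parameter": [
            {"name": "periodStart", "valueDate": period_start},
            {"name": "periodEnd", "valueDate": period_end},
            {"name": "reportType", "valueCode": report_type},
        ],
    }


parameters = measure_parameters("2000", "2030")
body = json.dumps(parameters, indent=2)  # text to place inside <measure-parameters>
```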

7. Default FHIR Search Query (CQL)

FHIR search query applied after obtaining measure reports from CQL.

| Tag | Description | Example |
|---|---|---|
| `<default-fhir-search-query>` | Defines a FHIR resource type to query (e.g., Patient) | `Patient` |

8. FHIR Reverse Include

Defines which resources should be reverse-included when using FHIR search as input or CQL_DATA.

| Tag | Description |
|---|---|
| `<fhir-rev-include>` | Specifies reverse include resources to simplify FHIR queries |
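
In FHIR search, reverse includes are expressed with the standard `_revinclude` parameter. The sketch below shows how a list of `<fhir-rev-include>` values could translate into a search URL. Whether the Exporter builds its requests exactly this way is an assumption, and the base URL is a placeholder.

```python
from urllib.parse import urlencode


def search_url(base_url: str, resource_type: str, rev_includes: list) -> str:
    """Append one _revinclude parameter per reverse include to a FHIR search URL."""
    # safe=":" keeps values such as "Observation:patient" readable in the URL.
    query = urlencode([("_revinclude", value) for value in rev_includes], safe=":")
    return f"{base_url.rstrip('/')}/{resource_type}?{query}"


# "http://localhost:8080/fhir" is a hypothetical FHIR base URL.
url = search_url(
    "http://localhost:8080/fhir",
    "Patient",
    ["Observation:patient", "AdverseEvent:subject"],
)
```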