SD1117/FDAC212 Duplicate records

Sergiy · July 9, 2015, 7:32pm

Hi Wenchi,

The current version of the SD1117 check produces many false-positive messages. In particular, it was not designed to handle Oncology domains.

We are improving the algorithm. Meanwhile, you need to:

explain the false-positive messages in the Reviewer's Guide
clearly specify the Domain Key Variables and describe them in the define file. Note that --SEQ cannot be a Key Variable in subject data domains
run your own additional validation

Kind Regards,

Sergiy

wanchi · July 14, 2015, 9:46am

Dear Sergiy,

Thanks for your suggestion.

We will explain it in the Reviewer's Guide and update the Key Variables in the define.xml file.

Best regards,

Wanchi

XML4Pharma · July 16, 2015, 7:31am

In my personal opinion, software validators should read the key variables from the define.xml file and decide whether a record is a duplicate based on those key variables.

Essentially, define.xml contains the metadata of the submission files and is therefore the authoritative source.
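As a rough illustration of the idea above, here is a minimal Python sketch that reads the key variables of one domain from a define.xml document. It assumes the Define-XML 2.0 layout, in which key variables are marked with the KeySequence attribute on ItemRef elements. The XML fragment, the OIDs, the choice of keys and the function name key_variables are all hypothetical and heavily trimmed; this is not how any particular validator is implemented.

```python
import xml.etree.ElementTree as ET
from typing import List

# ODM 1.3 namespace used by Define-XML 2.0 documents.
ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Hypothetical, heavily trimmed define.xml fragment (illustration only).
DEFINE_XML = """
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <Study OID="STUDY1">
    <MetaDataVersion OID="MDV1" Name="Example">
      <ItemGroupDef OID="IG.TR" Name="TR" Repeating="Yes">
        <ItemRef ItemOID="IT.STUDYID" Mandatory="Yes" KeySequence="1"/>
        <ItemRef ItemOID="IT.USUBJID" Mandatory="Yes" KeySequence="2"/>
        <ItemRef ItemOID="IT.TRSEQ" Mandatory="Yes"/>
        <ItemRef ItemOID="IT.TRLNKID" Mandatory="No" KeySequence="4"/>
        <ItemRef ItemOID="IT.TRTESTCD" Mandatory="Yes" KeySequence="3"/>
        <ItemRef ItemOID="IT.TRDTC" Mandatory="No" KeySequence="5"/>
      </ItemGroupDef>
      <ItemDef OID="IT.STUDYID" Name="STUDYID" DataType="text"/>
      <ItemDef OID="IT.USUBJID" Name="USUBJID" DataType="text"/>
      <ItemDef OID="IT.TRSEQ" Name="TRSEQ" DataType="integer"/>
      <ItemDef OID="IT.TRLNKID" Name="TRLNKID" DataType="text"/>
      <ItemDef OID="IT.TRTESTCD" Name="TRTESTCD" DataType="text"/>
      <ItemDef OID="IT.TRDTC" Name="TRDTC" DataType="datetime"/>
    </MetaDataVersion>
  </Study>
</ODM>
"""


def key_variables(define_xml: str, domain: str) -> List[str]:
    """Return the key variable names of `domain`, ordered by KeySequence."""
    root = ET.fromstring(define_xml)
    # Map each ItemDef OID to its variable name.
    names = {
        item.get("OID"): item.get("Name")
        for item in root.iterfind(".//odm:ItemDef", ODM_NS)
    }
    for group in root.iterfind(".//odm:ItemGroupDef", ODM_NS):
        if group.get("Name") != domain:
            continue
        # Only ItemRefs that carry a KeySequence are key variables.
        keyed = [
            (int(ref.get("KeySequence")), names[ref.get("ItemOID")])
            for ref in group.iterfind("odm:ItemRef", ODM_NS)
            if ref.get("KeySequence")
        ]
        return [name for _, name in sorted(keyed)]
    return []  # domain not described in this define.xml


print(key_variables(DEFINE_XML, "TR"))
```

In this fragment TRSEQ has no KeySequence, so it is not returned as a key, in line with the note earlier in the thread that --SEQ cannot be a Key Variable.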

Sergiy · July 16, 2015, 3:12pm

Yes, it’s correct.

We are changing the validation algorithm for our duplicate checks. Using define.xml as a source of study-specific metadata is part of that change.

Kind Regards,

Sergiy

Dmitry.Kolosov · July 16, 2015, 4:47pm

Hi Sergey, XML4Pharma, I understand that an SDTM validation check is being discussed here. I just want to note that, if a similar approach is applied to ADaM, there is no strict requirement in ADaM that key variables uniquely identify a record in a dataset:
ADaM 2.1 page 14:

KEY VARIABLES OF DATASET

A list of variable names that parallels the structure, ideally uniquely identifies and indexes each record in the dataset.

wanchi · June 10, 2015, 8:35am

Dears,

I have a question about creating and validating the SDTM TU and TR domains.

For example, I have screening tumor data for three target lesions in the liver, at SEGMENT 3, SEGMENT 4 and SEGMENT 7. The corresponding three Longest Diameters are also recorded.

Therefore, I should create three rows in each of the TU and TR domains, one per liver site (SEGMENT 3, SEGMENT 4 and SEGMENT 7).

For the TR domain, TRTEST=‘Longest Diameter’/TRTESTCD=‘LDIAM’ for TRLNKID ‘T1’, ‘T2’, ‘T3’;

For the TU domain, TUTEST=‘Tumor Identification’/TUTESTCD=‘TUMIDENT’ for TULNKID ‘T1’, ‘T2’, ‘T3’

(TULOC=‘Liver’, and TUPORTOT=‘SEGMENT 3’, ‘SEGMENT 4’ and ‘SEGMENT 7’);

However, the validation reported ‘Duplicate records’, because the check expects one record per Finding Result per subject: no two Finding Results with the same Test Short Name (--TESTCD) for the same Subject (USUBJID) and the same Collection Date (--DTC) are expected.

Still, I think the representation is correct, since --LNKID is provided to distinguish the records.

May I ask how to resolve this issue? Any feedback or any references on this topic would be appreciated.
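To make the situation concrete, the following Python sketch reproduces the TR case described above. It groups records by a list of key variables and reports any key that occurs more than once. The first key list mirrors the wording of the validation message (USUBJID, --TESTCD, --DTC); whether the actual SD1117 check uses exactly these keys is an assumption. The subject identifier, the date, the measurement values and the function name find_duplicates are hypothetical.

```python
from collections import defaultdict

# Hypothetical TR records for one subject at screening (values are made up).
TR_RECORDS = [
    {"USUBJID": "STUDY1-001", "TRLNKID": "T1", "TRTESTCD": "LDIAM",
     "TRORRES": "21", "TRDTC": "2015-05-01"},
    {"USUBJID": "STUDY1-001", "TRLNKID": "T2", "TRTESTCD": "LDIAM",
     "TRORRES": "15", "TRDTC": "2015-05-01"},
    {"USUBJID": "STUDY1-001", "TRLNKID": "T3", "TRTESTCD": "LDIAM",
     "TRORRES": "12", "TRDTC": "2015-05-01"},
]


def find_duplicates(records, keys):
    """Return {key values: [row indexes]} for every key seen more than once."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[tuple(record.get(k) for k in keys)].append(index)
    return {key: rows for key, rows in groups.items() if len(rows) > 1}


# Keys as worded in the validation message: all three rows collide.
generic = find_duplicates(TR_RECORDS, ["USUBJID", "TRTESTCD", "TRDTC"])
print(generic)

# Adding TRLNKID to the keys separates the three lesions.
with_link = find_duplicates(
    TR_RECORDS, ["USUBJID", "TRTESTCD", "TRLNKID", "TRDTC"]
)
print(with_link)
```

Under these assumptions the three rows are reported only when TRLNKID is left out of the key, which is why the message can be treated as a false positive when the link identifier is part of the declared keys.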
