Study Design Contd.
Steps for linking with secondary datasets
- An algorithm generates a unique client identifier from the SSN, the DOB, the first initial of the first name, and the first initial of the last name. This is done separately for each dataset. Gender and race are used as tie-breakers when duplicate identifiers occur.
- If the study records begin on January 1, 2002, data from the secondary state agencies are collected for two years prior (January 1999) and two years post (December 2004).
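
The identifier step can be sketched as follows. This is only an illustration: the source does not specify the algorithm, so the plain concatenation of fields, the record field names (`ssn`, `dob`, `first_name`, `last_name`, `gender`, `race`), and the function names `base_key` and `assign_client_ids` are all assumptions. The tie-breaker rule shown here (append gender and race only when records sharing a key differ on them) is one plausible reading of "tie-breakers", not necessarily the study's own method.

```python
from collections import defaultdict
from datetime import date


def base_key(ssn: str, dob: date, first_name: str, last_name: str) -> str:
    """Hypothetical key: SSN digits + DOB (YYYYMMDD) + first and last initials."""
    digits = "".join(ch for ch in ssn if ch.isdigit())
    first_initial = first_name.strip()[:1].upper()
    last_initial = last_name.strip()[:1].upper()
    return f"{digits}{dob:%Y%m%d}{first_initial}{last_initial}"


def assign_client_ids(records: list[dict]) -> list[str]:
    """Return one identifier per record, in input order (assumed record layout)."""
    # Group record positions by their base key to find duplicates.
    groups = defaultdict(list)
    for i, rec in enumerate(records):
        key = base_key(rec["ssn"], rec["dob"], rec["first_name"], rec["last_name"])
        groups[key].append(i)

    ids = [""] * len(records)
    for key, positions in groups.items():
        # Distinct (gender, race) pairs among records that share this key.
        demographics = {(records[i]["gender"], records[i]["race"]) for i in positions}
        for i in positions:
            if len(demographics) > 1:
                # Assumed tie-breaker: extend the key with gender and race.
                gender = records[i]["gender"].upper()
                race = records[i]["race"].upper()
                ids[i] = f"{key}-{gender}{race}"
            else:
                ids[i] = key
    return ids
```

Running the same function on each dataset would give keys that can be matched across the study records and the secondary agency records. Records that share a key and also agree on gender and race are treated as the same client in this sketch.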