UDSv4 and Digital Voice
About Digital Voice
Introduction
Uniform Data Set Version 4 (UDSv4) will give centers the option to collect digital audio recordings of the cognition section of the UDS study. NACC and the Clinical Task Force (CTF) Technology Workgroup are collaborating to support Alzheimer’s Disease Research Center (ADRC) adoption of digital voice as part of the UDSv4 rollout.
More information and resources will be provided to ADRCs in the coming months detailing best practices, protocols, and more.
Why digital voice?
Digitally recording participant responses to neuropsychological tests is a cost-effective way to detect early changes in cognition. As our cognitive capabilities shift, we express those shifts in subtle ways through vocal responses, such as changing word choices or sentence structures because of word-finding problems, or pausing, hesitating, and shifting as memory, attention, and executive functions are compromised.
Currently, there is no gold-standard method for analyzing voice recordings in relation to cognition. However, just as with blood-based biomarkers, there is a growing, albeit still limited, body of literature suggesting that analysis of digital voice recordings is a promising method for differentiating those with and without cognitive impairment.
Benefits of digital voice
- A non-diluting resource: Digital data can be reused for different purposes as algorithms and analysis techniques improve.
- Minimal participant burden: Neuropsychological (NP) tests are already being conducted; digital voice collection enables additional science at no additional burden to participants.
- Low cost and inclusive: The wide availability of recording devices allows for easy, low-cost collection of voice data that can be done in the person’s native language.
- Novel analytics: Natural Language Processing (NLP) and other advanced machine-learning methods offer opportunities to explore acoustic and semantic features in novel forms.
- Quality control (QC): Digitally recorded voice tasks can act as a QC tool to determine natural drift in standardization in any longitudinal study.
ADRC Resources
Getting Started with Digital Voice
Additional Resources
Supporting Digital Voice Data Collection - A Collaboration with the Clinical Task Force Technology Workgroup
NACC and the Clinical Task Force (CTF) Technology Workgroup are collaborating to support ADRC adoption of digital data modalities, starting with digital voice as part of the UDSv4 rollout! This workgroup's goal is to expand and enrich AD/ADRD data collection with less burden on participants and clinical staff. It aims to identify and develop guidelines for high impact digital data modalities that will be collected, integrated, and harmonized from across the ADRC Program and shared with researchers via the Data Front Door.
The CTF Technology Workgroup consists of the CTF Technology Workgroup Parent and the following three sub-committees:
New Non-UDS Digital Instruments:
Co-leads: Jeff Kaye, MD (OHSU), Kate Papp, PhD (Mass Gen/Harvard), Jason Hassenstab, PhD (Washington University in St. Louis)
This committee will cover a myriad of different assessments and/or technologies, from radar to robots! It will focus on identifying high-value non-UDS assessments that can be adopted by ADRCs with minimal burden for participants and clinical staff, and on encouraging involvement of the diverse sub-populations or cohorts across the ADRC ecosystem.
In-Clinic UDS Digital Instruments:
Co-leads: Teresa Gomez-Isla, MD (Mass Gen/Harvard), Kate Possin, PhD (UCSF), Hiroko Dodge, PhD (Mass Gen/Harvard)
This committee will guide and support the implementation of audio recording of traditional paper-based UDS measures. We will review and propose in-clinic digital tests that acquire clinically meaningful data with minimal participant and staff burden for incorporation into the UDS.
Virtual Standard UDS:
Co-leads: Sudeshna Das, PhD (Mass Gen/Harvard), Zach Beattie, PhD (OHSU), Melissa Lamar, PhD (Rush ADRC)
This committee will establish guidelines and best practices for conducting virtual Uniform Data Set (UDS) assessments. We will define the minimal virtual UDS dataset, provide recommendations for the virtual administration process, evaluate the equivalency of virtual and in-person evaluations, and address logistical considerations.
Events and Training
Webinar: UDS 4.0 Digital Voice Training Workshop
NACC and the CTF Technology Workgroup hosted a training workshop focused on digital voice data collection. This workshop provided guidance on the consent process, recording procedures, and data storage for digital voice data. Leaders in the digital voice field presented research findings emphasizing the scientific importance and potential of digital voice data!
View slides here
Agenda
- 10min - Introduction
- 30min - Importance of Digital Voice + Q&A
- 20min - Consent & IRB + Q&A
- 25min - Digital Voice Data Collection + Q&A
- 25min - Flexibility in Data Collection Protocol Adherence + Q&A
- 10min - Closing
Our Speakers
Rhoda Au, PhD, MBA
Professor of Anatomy and Neurobiology, Director of Neuropsychology for the Framingham Heart Study
Brad Dickerson, MD
Professor of Neurology, Leader of the Neuroimaging Core
Sudeshna Das, PhD
Director of MGH Biomedical Informatics Core, Neurology
Jeffrey Kaye, MD
Director of ORCATECH and Professor of Neurology at OHSU School of Medicine
Cody Karjadi, MS
Research Applications Developer Team Manager for the Framingham Heart Study
Melissa Lamar, PhD
Professor and Clinical Neuropsychologist
Nina Silverberg, PhD
ADRC Program Director
FAQs
- What are the benefits of digital biomarkers to my ADRC?
There are many practical benefits to adding voice recording to your ADRC. They include:
- Acting as a QC tool to determine natural drift in standardization in any longitudinal study
- Providing an easy, low-cost collection of data that can be done in a participant’s preferred language
- Allowing additional scientific enablement at no additional participant burden
- Increasing opportunities to explore acoustic and semantic features in novel forms
- What are the benefits of digital biomarkers to participants?
While direct participant benefit may be low at the outset, over time features of voice data may be able to:
- Provide early indicators of cognitive impairment in preclinical or prodromal Alzheimer’s dementia
- Help track disease progression and predict conversion to dementia
- What are the benefits of digital biomarkers to science?
Features of voice data are already showing promise in:
- Serving as a dementia screening tool to detect those at risk for dementia
- Indicating the effectiveness of clinical trials
- Associating with CSF biomarkers of disease
- Can I use any recorder to capture voice data for the UDSv4?
While many devices will record voice data, NACC has certain requirements for recording UDSv4 voice data:
- Zoom H4N recorders are preferred
- Limited background noise
- At the beginning of the recording, state the staff ID, participant ID, study visit date and number, and what is being recorded
- Not all recorders are encrypted, which may pose a significant risk if a recorder containing PHI is lost or misplaced. Does this concern mean that only the more expensive, encrypted recorders should be used?
This will be up to your local institution and ADRC. Your institution may want to adopt best practices such as:
- Immediately after the visit or recording ends, upload the file to a secure cloud server and delete it from the recording device.
- If you are capturing digital voice on an iPhone, there are ways to encrypt files using that device.
There is no ‘one size fits all’ approach; however, each method will have its own issues and concerns with relation to privacy and security.
- Do I need to process the voice data at my ADRC?
While all ADRCs are encouraged to keep voice files for their own use, all voice data will be uploaded to NACC. For uploads:
- Use NACC naming conventions: NACCID_DATE_TESTNAME
- Store in WAV format
- Enter metadata of the test in the UDSv4 dVoice form (under development)
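As a rough illustration of the naming convention above, the following Python sketch builds a file name from the three parts. The function name, the YYYYMMDD date format, and the stripping of non-alphanumeric characters are all assumptions made for this example; NACC has not published these details here, so check the official guidance before adopting any of them.

```python
import re
from datetime import date


def build_voice_filename(nacc_id: str, visit_date: date, test_name: str) -> str:
    """Return a file name in the NACCID_DATE_TESTNAME pattern with a .wav extension."""
    # Assumption: underscores separate the three fields, so remove underscores,
    # spaces, and other non-alphanumeric characters from each field.
    clean_id = re.sub(r"[^A-Za-z0-9]", "", nacc_id)
    clean_test = re.sub(r"[^A-Za-z0-9]", "", test_name)
    if not clean_id or not clean_test:
        raise ValueError("NACC ID and test name must contain letters or digits")
    # Assumption: the date is written as YYYYMMDD; the source does not specify a format.
    return f"{clean_id}_{visit_date.strftime('%Y%m%d')}_{clean_test}.wav"


# Hypothetical ID and test name, for illustration only.
print(build_voice_filename("NACC123456", date(2024, 3, 15), "Craft Story"))
# NACC123456_20240315_CraftStory.wav
```

Generating names from one helper, rather than typing them by hand, may help keep files consistent across staff and visits.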
- What is the UDSv4 dVoice form?
This form, currently under development, will accompany any voice data upload. It includes, but is not limited to, the following:
- NACC ID
- Tester ID
- Recording Date
- Name of file with full path
- Recording device details
- QC details
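Because the dVoice form is still under development, the sketch below is only one possible way a center might track the listed fields locally until the form is released. The class name, field names, and types are hypothetical and are not the form's actual data elements.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class DVoiceRecord:
    """Hypothetical local record mirroring the fields listed for the dVoice form."""
    nacc_id: str
    tester_id: str
    recording_date: date
    file_path: str          # name of file with full path
    recording_device: str   # e.g., recorder make and model
    qc_notes: str = ""      # free-text QC details (assumed optional here)

    def missing_fields(self) -> List[str]:
        """Return the names of required text fields that are empty or blank."""
        required = ["nacc_id", "tester_id", "file_path", "recording_device"]
        return [name for name in required if not getattr(self, name).strip()]


# Hypothetical values, for illustration only.
record = DVoiceRecord(
    nacc_id="NACC123456",
    tester_id="T01",
    recording_date=date(2024, 3, 15),
    file_path="",
    recording_device="Zoom H4N",
)
print(record.missing_fields())  # ['file_path']
```

A check like `missing_fields()` could be run before upload so that incomplete entries are caught while the visit details are still fresh.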
- How do you separate multiple speakers’ voices to focus on the participant’s voice?
Sometimes it might be good to analyze ‘conversations’; however, automated methods are increasingly available and efficient in ‘cleaning’ vocal recordings down to a single speaker of interest.
- What is the process for scrubbing an audio file to remove PHI? Who is handling that task?
De-identification tools are in development for voice masking (e.g., Whisper) and PII splicing; however, NACC will not share any vocal data until such de-identification processes are validated and complete on all vocal recordings. Because there are multiple levels of de-identification, exploration of all levels is underway. The most important step is to de-identify the voice prints themselves; voice masking technology deliverables, with at least some level of PHI removal, are estimated to arrive as early as 2025.
- Is there a standard file type (WAV vs MP3) and bit-rate, bit depth minimums?
Google has a speech-to-text tool with its own recommendations, which can be found here.
Specific recommendations and details will come from the workgroup in the future; however, some general information is outlined below:
- Generally speaking, 16 kHz WAV (PCM16le, 16-bit depth) at a 256 kb/s bit rate would be roughly the minimum. At FHS (Framingham Heart Study), we record at a higher level of quality, which likely exceeds the necessary minimum: 48 kHz WAV (PCM24le, 24-bit depth, stereo) at ~2000 kb/s, which results in about 1 GB per hour.
- Depending upon the platform available to you, there are compromises that will need to be made (e.g., Zoom does not save to .wav formats); however, this should not prevent you from collecting voice recordings should you wish.
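The figures above follow from simple arithmetic: for uncompressed PCM audio, bit rate = sample rate × bit depth × number of channels. The Python sketch below reproduces that calculation and checks a WAV file against the rough minimum using the standard-library `wave` module. The function names and the thresholds in `meets_minimum` are illustrative assumptions, not NACC requirements, and the 256 kb/s figure is assumed here to describe a single (mono) channel.

```python
import wave


def pcm_bit_rate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in kb/s (1 kb = 1000 bits)."""
    return sample_rate_hz * bit_depth * channels / 1000


def gigabytes_per_hour(bit_rate_kbps: float) -> float:
    """Approximate storage for one hour of audio, in GB (1 GB = 1e9 bytes)."""
    bytes_per_second = bit_rate_kbps * 1000 / 8
    return bytes_per_second * 3600 / 1e9


def meets_minimum(path: str, min_rate_hz: int = 16_000, min_bit_depth: int = 16) -> bool:
    """Check a PCM WAV file against assumed minimum sample rate and bit depth."""
    with wave.open(path, "rb") as wav:
        # getsampwidth() returns bytes per sample, so multiply by 8 for bits.
        return wav.getframerate() >= min_rate_hz and wav.getsampwidth() * 8 >= min_bit_depth


print(pcm_bit_rate_kbps(16_000, 16, 1))        # 256.0  (rough minimum, mono)
print(pcm_bit_rate_kbps(48_000, 24, 2))        # 2304.0 (FHS-style settings, stereo)
print(round(gigabytes_per_hour(2304.0), 2))    # 1.04
```

The stereo 48 kHz, 24-bit case works out to 2304 kb/s, in line with the approximate figure of 2000 kb/s and roughly 1 GB per hour quoted above. Note that the `wave` module reads only uncompressed PCM WAV files, so `meets_minimum` will raise an error on compressed formats such as MP3.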
- Will there be measures taken to protect the neuropsychological recordings? Neuropsychologists are advised to prevent the release of stimulus materials, test procedures, etc. of neuropsychological data. Having these things recorded allows for the potential release of this information. In addition to the NACC Battery, some sites may include copyright-protected measures in their site-specific battery. What steps can be taken to protect those copyrighted materials? Can we opt out of recording those measures?
Just as you can get a clinical read of a scan, you can get a clinical read of vocal recordings and share only the agreed-upon metrics derived from it. Thus, the data files may not need to be shared in raw form.