Medinym

AI-based Anonymization of Personal Patient Data in Clinical Text and Voice Databases
(BMFTR 2022-2025)

Motivation

Ongoing research on technologies based on artificial intelligence (AI) is opening up potential medical applications. In practice, however, the use of these technologies by a broad range of users, such as citizens, public authorities, healthcare professionals, and small and medium-sized enterprises, is hampered by the difficulty of handling data securely and in compliance with data protection requirements. In the automated processing of medical data in particular, innovative technologies often cannot be used, because the sensitive content rightly makes protecting patients' identities a high priority. The need to protect clinical data, and the resulting difficulty in accessing it, also means that machine learning (ML) methods, for example for clinical diagnosis, prognosis, therapy, or decision support, cannot be developed without major hurdles.

Aims and approach

The project “AI-based anonymization of personal patient data in clinical text and speech datasets” (Medinym) investigates whether sensitive data can be reused once personal information has been removed through anonymization. Two medical use cases serve as examples in the project: text-based data from electronic patient records and voice data from diagnostic doctor-patient consultations. To this end, open technologies for anonymization are being investigated, further developed, and applied to real data. The researchers are also investigating how the informative value of such anonymized data can be preserved for further use. Methods that prevent or hinder misuse of the technology outside of the intended use case will also be considered.

Innovations and perspectives

Information-preserving anonymization should make it possible to process clinical data further, because de-anonymization would no longer be possible. The resulting data sets could then be used to train AI models on clinical data in compliance with data protection regulations, or be extended to other cohorts. Small and medium-sized companies could thereby accumulate sufficiently large amounts of data over time. Sensitive data could also be pooled across multiple applications and used to train AI models, provided it is always anonymized accordingly. The anonymization is also intended to increase the willingness of patients to consent to participation in studies, to data analyses, and to the donation of health data in general. Ultimately, information-preserving anonymization allows the technology to be integrated into current development methods and diagnostic systems, thereby strengthening Germany as a location for science and business in the fields of diagnostics, treatment, and thus healthcare in general.

Official project brief

Project Partners

References

2025

  1. StutterCut: Uncertainty-Guided Normalised Cut for Dysfluency Segmentation
    Suhita Ghosh, Melanie Jouaiti, Jan-Ole Perschewski, and Sebastian Stober
    In Interspeech 2025, Aug 2025
  2. Investigating Inclusivity of Whisper for Dysfluent Speech
    Evelyn Starzew, Suhita Ghosh, and Valerie Krug
    In 12th edition of the Disfluency in Spontaneous Speech Workshop (DiSS 2025), Sep 2025

2024

  1. Anonymising Elderly and Pathological Speech: Voice Conversion Using DDSP and Query-by-Example
    Suhita Ghosh, Melanie Jouaiti, Arnab Das, Yamini Sinha, Tim Polzehl, Ingo Siegert, and Sebastian Stober
    In Interspeech 2024, Sep 2024
  2. Improving Voice Quality in Speech Anonymization With Just Perception-Informed Losses
    Suhita Ghosh, Tim Thiele, Frederic Lorbeer, and Sebastian Stober
    In Audio Imagination: NeurIPS 2024 Workshop AI-Driven Speech, Music, and Sound Generation, 2024
  3. T-DVAE: A Transformer-Based Dynamical Variational Autoencoder for Speech
    Jan-Ole Perschewski and Sebastian Stober
    In Artificial Neural Networks and Machine Learning – ICANN 2024, 2024

2023

  1. Improving voice conversion for dissimilar speakers using perceptual losses
    Suhita Ghosh, Yamini Sinha, Ingo Siegert, and Sebastian Stober
    In 49. Jahrestagung für Akustik DAGA 2023, Hamburg, Mar 2023
  2. Anonymization of Stuttered Speech – Removing Speaker Information while Preserving the Utterance
    Jan Hintz, Sebastian Bayerl, Yamini Sinha, Suhita Ghosh, Martha Schubert, Sebastian Stober, Korbinian Riedhammer, and Ingo Siegert
    In 3rd Symposium on Security and Privacy in Speech Communication, Aug 2023
  3. StarGAN-VC++: Towards Emotion Preserving Voice Conversion Using Deep Embeddings
    Arnab Das, Suhita Ghosh, Tim Polzehl, Ingo Siegert, and Sebastian Stober
    In 12th ISCA Speech Synthesis Workshop (SSW2023), Aug 2023
  4. Emo-StarGAN: A Semi-Supervised Any-to-Many Non-Parallel Emotion-Preserving Voice Conversion
    Suhita Ghosh, Arnab Das, Yamini Sinha, Ingo Siegert, Tim Polzehl, and Sebastian Stober
    In INTERSPEECH 2023, Aug 2023

2022

  1. Voice Privacy - leveraging multi-scale blocks with ECAPA-TDNN SE-Res2NeXt extension for speaker anonymization
    Razieh Khamsehashari, Yamini Sinha, Jan Hintz, Suhita Ghosh, Tim Polzehl, Carlos Franzreb, Sebastian Stober, and Ingo Siegert
    In 2nd Symposium on Security and Privacy in Speech Communication, Sep 2022