socio4health.utils.harmonizer_utils.s4h_standardize_dict
- socio4health.utils.harmonizer_utils.s4h_standardize_dict(raw_dict: DataFrame) -> DataFrame
Cleans and structures a dictionary-like DataFrame of variables by standardizing text fields, grouping possible answers, and removing duplicates.
- Parameters:
raw_dict (pd.DataFrame) – DataFrame containing the required columns question, variable_name, description, and value, and optionally subquestion.
- Returns:
A DataFrame cleaned and grouped by question and variable_name, with an additional column possible_answers containing the concatenated descriptions.
- Return type:
pd.DataFrame
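The behavior described above can be illustrated with a small pandas-only sketch. It is not the socio4health implementation: the helper name standardize_dict_sketch, the sample rows, the whitespace-stripping step, and the "value=description" format joined with "; " are all assumptions made for illustration, and the optional subquestion column is not handled.

```python
import pandas as pd


def standardize_dict_sketch(raw_dict: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in that mimics the documented behavior.

    NOT the library's code: the cleaning rules and the answer format
    used here are assumptions.
    """
    required = ["question", "variable_name", "description", "value"]
    missing = [col for col in required if col not in raw_dict.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    df = raw_dict[required].copy()

    # Standardize text fields (assumed rule: strip surrounding whitespace).
    for col in ["question", "variable_name", "description"]:
        df[col] = df[col].astype(str).str.strip()

    # Remove rows that are identical after cleaning.
    df = df.drop_duplicates()

    # Build one "value=description" string per row (assumed format).
    df["answer"] = df["value"].astype(str).str.strip() + "=" + df["description"]

    # Group by question and variable_name, concatenating the answers.
    return df.groupby(
        ["question", "variable_name"], sort=False, as_index=False
    ).agg(possible_answers=("answer", lambda s: "; ".join(s)))


# Made-up sample dictionary with untidy text and one duplicate row.
raw = pd.DataFrame(
    {
        "question": ["Sex ", "Sex", "Sex", "Area"],
        "variable_name": ["sex", "sex", "sex", "area"],
        "description": ["Male", "Female", " Female ", "Urban"],
        "value": [1, 2, 2, 1],
    }
)

result = standardize_dict_sketch(raw)
print(result)
```

In this sketch the four input rows collapse to two output rows, one per (question, variable_name) pair. With the package installed, the analogous call would be s4h_standardize_dict(raw), whose exact output format may differ from the one assumed here.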