socio4health.utils.harmonizer_utils.standardize_dict
- socio4health.utils.harmonizer_utils.standardize_dict(raw_dict: DataFrame) → DataFrame
Cleans and structures a dictionary-like DataFrame of variables by standardizing text fields, grouping possible answers, and removing duplicates.
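As an illustration of the behaviour described here, the following is a minimal pandas-only sketch, not the library's implementation. The sample data, the helper name `standardize_dict_sketch`, the whitespace-stripping step, and the `"; "` separator are all assumptions made for the example; the actual function may normalize text and join descriptions differently.

```python
import pandas as pd

# Hypothetical input: one row per (variable, answer) pair, with the
# columns the function requires. Note the stray whitespace and the
# duplicated "Female" row.
raw_dict = pd.DataFrame({
    "question": ["  Sex ", "Sex", "Sex", "Age"],
    "variable_name": ["P01", "P01", "P01", "P02"],
    "description": ["Male", "Female", "Female", "Years"],
    "value": ["1", "2", "2", "0-99"],
})


def standardize_dict_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Rough approximation of the documented behaviour (assumed details)."""
    out = df.copy()
    # Standardize text fields; here this only strips surrounding whitespace.
    for col in ["question", "variable_name", "description", "value"]:
        out[col] = out[col].astype(str).str.strip()
    # Remove exact duplicate rows.
    out = out.drop_duplicates()
    # Group by question and variable_name, concatenating the descriptions
    # into a single possible_answers column (separator is an assumption).
    return (
        out.groupby(["question", "variable_name"], sort=False)["description"]
        .agg("; ".join)
        .reset_index()
        .rename(columns={"description": "possible_answers"})
    )


result = standardize_dict_sketch(raw_dict)
print(result)

# With socio4health installed, the real call would be:
# from socio4health.utils.harmonizer_utils import standardize_dict
# cleaned = standardize_dict(raw_dict)
```

The sketch yields one row per `question`/`variable_name` pair, which is the shape the documented return value describes.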
- Parameters:
raw_dict (pd.DataFrame) – DataFrame containing the required columns question, variable_name, description, value, and optionally subquestion.
- Returns:
A cleaned DataFrame grouped by question and variable_name, with an additional column possible_answers containing concatenated descriptions.
- Return type: