socio4health.utils.extractor_utils.s4h_parse_fwf_dict

socio4health.utils.extractor_utils.s4h_parse_fwf_dict(dict_df)

Parse a dictionary DataFrame to extract column names and fixed-width format specifications.

Parameters:

dict_df (pandas.DataFrame) – A DataFrame containing the dictionary information, with the columns:

  • ‘variable_name’: column names

  • ‘initial_position’: starting position (1-based) of each column

  • ‘size’: width of each column, or ‘final_position’: ending position of each column

Returns:

A tuple containing:

  • A list of column names.

  • A list of (start, end) tuples representing the column specifications, where start is the 0-based starting position and end is the 0-based ending position (exclusive).
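The conversion from 1-based dictionary positions to 0-based, end-exclusive specifications can be illustrated with a minimal sketch. This is not the library's implementation: the helper `to_colspecs`, the sample rows, and the sample record are hypothetical, plain dictionaries stand in for the rows of `dict_df`, and `final_position` is assumed to be 1-based and inclusive, which the description above does not state.

```python
def to_colspecs(rows):
    """Hypothetical helper: turn 1-based dictionary rows into column names
    and 0-based, end-exclusive (start, end) pairs."""
    names, colspecs = [], []
    for row in rows:
        start = row["initial_position"] - 1  # 1-based -> 0-based
        if "size" in row:
            end = start + row["size"]        # exclusive end from the width
        else:
            # Assumption: final_position is 1-based and inclusive, so it
            # already equals the 0-based exclusive end.
            end = row["final_position"]
        names.append(row["variable_name"])
        colspecs.append((start, end))
    return names, colspecs


# Hypothetical dictionary rows (stand-ins for the rows of dict_df).
rows = [
    {"variable_name": "region", "initial_position": 1, "size": 2},
    {"variable_name": "age", "initial_position": 3, "size": 3},
    {"variable_name": "sex", "initial_position": 6, "final_position": 6},
]

names, colspecs = to_colspecs(rows)

# Because each end is exclusive, a (start, end) pair maps directly onto a
# Python slice of one fixed-width record.
line = "05034F"
record = {name: line[start:end] for name, (start, end) in zip(names, colspecs)}

print(names)     # ['region', 'age', 'sex']
print(colspecs)  # [(0, 2), (2, 5), (5, 6)]
print(record)    # {'region': '05', 'age': '034', 'sex': 'F'}
```

Pairs in this half-open form are also what `pandas.read_fwf` accepts for its `colspecs` argument.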

Return type:

tuple

Raises:

ValueError – If no column names or sizes are found in the dictionary DataFrame.