socio4health.utils.extractor_utils.s4h_parse_fwf_dict

socio4health.utils.extractor_utils.s4h_parse_fwf_dict(dict_df)

Parse a fixed-width format dictionary stored in a pandas DataFrame.

The DataFrame must contain at least the following columns: variable_name and initial_position. Either size or final_position must be present to compute column spans.

Parameters:

dict_df (pandas.DataFrame) – Dictionary table describing fixed-width columns.

Returns:

(colnames, colspecs) where colnames is a list of column names and colspecs is a list of (start, end) integer tuples suitable for use with pandas.read_fwf (0-based, end exclusive).

Return type:

tuple

Raises:

ValueError – If required columns are missing.
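
Example:

The following is a minimal sketch of how column spans of this kind can be derived. It is not the library's implementation. It uses plain dictionaries instead of a DataFrame so that it runs without third-party packages. The helper name spans_from_rows and the sample rows are hypothetical. The sketch also assumes that initial_position and final_position are 1-based and inclusive, which this page does not state.

```python
def spans_from_rows(rows):
    """Hypothetical helper: build (colnames, colspecs) from dictionary rows.

    Assumption: initial_position and final_position are 1-based and
    inclusive. The output spans are 0-based with an exclusive end.
    """
    colnames, colspecs = [], []
    for row in rows:
        if "variable_name" not in row or "initial_position" not in row:
            raise ValueError("variable_name and initial_position are required")

        # Shift the 1-based start to a 0-based start.
        start = int(row["initial_position"]) - 1

        if row.get("size") is not None:
            # The width is given directly.
            end = start + int(row["size"])
        elif row.get("final_position") is not None:
            # A 1-based inclusive end equals a 0-based exclusive end.
            end = int(row["final_position"])
        else:
            raise ValueError("either size or final_position is required")

        colnames.append(row["variable_name"])
        colspecs.append((start, end))
    return colnames, colspecs


# Hypothetical dictionary rows: one uses size, the other final_position.
rows = [
    {"variable_name": "region", "initial_position": 1, "size": 2},
    {"variable_name": "age", "initial_position": 3, "final_position": 5},
]
colnames, colspecs = spans_from_rows(rows)

# Slice one fixed-width record with the computed spans.
line = "05042"
record = {name: line[start:end] for name, (start, end) in zip(colnames, colspecs)}
```

A (colnames, colspecs) pair of this shape can be passed to pandas as pandas.read_fwf(path, colspecs=colspecs, names=colnames), where path is the fixed-width data file.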