socio4health.Harmonizer.drop_nan_columns
- Harmonizer.drop_nan_columns(ddf_or_ddfs: DataFrame | List[DataFrame]) → DataFrame | List[DataFrame]
Drop columns in which the proportion of NaN values exceeds the threshold, using the instance parameters nan_threshold and sample_frac.
- Parameters:
ddf_or_ddfs (dask.dataframe.DataFrame or list of dask.dataframe.DataFrame) – The DataFrame or list of DataFrames to process.
- Returns:
The DataFrame(s) with columns dropped where the proportion of NaN values is greater than nan_threshold.
- Return type:
dask.dataframe.DataFrame or list of dask.dataframe.DataFrame
- Raises:
ValueError – If nan_threshold is not between 0 and 1, or if sample_frac is not None or a float between 0 and 1.
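The dropping rule described above can be illustrated with a minimal, library-free sketch. It is not the socio4health implementation: it works on a plain dict of lists instead of a Dask DataFrame, and the helper names (nan_fraction, drop_mostly_nan_columns), the default threshold of 0.5, and the inclusive bounds check are all assumptions made for illustration. It also ignores sample_frac entirely.

```python
import math


def nan_fraction(values):
    """Return the share of entries that are None or a float NaN."""
    if not values:
        return 0.0
    missing = sum(
        1 for v in values
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    return missing / len(values)


def drop_mostly_nan_columns(table, nan_threshold=0.5):
    """Keep only columns whose NaN share is not greater than nan_threshold.

    `table` is a hypothetical stand-in for a DataFrame: a dict that maps
    column names to lists of values. The 0.5 default is an assumption.
    """
    # Assumed validation: bounds treated as inclusive.
    if not 0 <= nan_threshold <= 1:
        raise ValueError("nan_threshold must be between 0 and 1")
    return {
        name: values
        for name, values in table.items()
        if nan_fraction(values) <= nan_threshold
    }


table = {
    "age": [34.0, 51.0, float("nan"), 29.0],                        # 25% missing
    "income": [float("nan"), float("nan"), float("nan"), 1200.0],   # 75% missing
    "region": ["N", "S", "S", None],                                # 25% missing
}

cleaned = drop_mostly_nan_columns(table, nan_threshold=0.5)
print(sorted(cleaned))  # ['age', 'region']
```

Only "income" is removed here, because 75% of its values are missing and that is greater than the threshold of 0.5. A column sitting exactly at the threshold would be kept, which matches the "greater than nan_threshold" wording in the Returns description.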