Comments / Profile of Ksdfergo3 / Habr


Automating exploratory data analysis (EDA) with Python

Ksdfergo3, 12 Feb 2025 at 14:18

Preprocessing

Let's determine the percentage of missing values in each column before deciding how to handle them.

    df.isnull().mean() * 100

    del df['YEAR']
    df['COMMENT'] = df['COMMENT'].replace({'сумма больше на 1': 'сумма исправлена'})

We drop the rows with empty values because these attributes have a low percentage of missing values, so removing them has little impact on our dataset.

    df = df[df['GRADE_OF_COMPETITION'].notnull()].copy()
    df['ID'] = df['ID'].astype(np.int64)  # requires `import numpy as np`
    df = df[df['REGIONAL_STATUS'] != 'удален']  # 'удален' means 'deleted'

df[df['REGIONAL_STATUS'] == 'Победитель'][['SUM', 'PERCENTAGE', 'REGIONAL_STATUS']].sort_values(by='SUM', ascending=True).head(10)
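For anyone who wants to try these steps without the original dataset, here is a minimal sketch on a small made-up frame. Only the column names and the two status strings ('Победитель' = winner, 'удален' = deleted) come from the comment above; the rows, the 'Участник' status, and the variable names `missing_pct`, `clean`, and `winners` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: only the column names are taken from the comment.
df = pd.DataFrame({
    "ID": [1.0, 2.0, 3.0, 4.0, 5.0],
    "YEAR": [2024, 2024, 2024, 2024, 2024],
    "GRADE_OF_COMPETITION": ["A", None, "B", "A", "C"],
    "REGIONAL_STATUS": ["Победитель", "Победитель", "удален",
                        "Победитель", "Участник"],
    "SUM": [300.0, 100.0, 50.0, 200.0, 400.0],
    "PERCENTAGE": [30.0, 10.0, 5.0, 20.0, 40.0],
})

# Share of missing values per column, in percent.
missing_pct = df.isnull().mean() * 100

# Same cleaning steps as above, written as one chain so that no step
# modifies a filtered view of the original frame.
clean = (
    df.drop(columns=["YEAR"])
      .dropna(subset=["GRADE_OF_COMPETITION"])
      .loc[lambda d: d["REGIONAL_STATUS"] != "удален"]
      .astype({"ID": np.int64})
)

# Winners with the smallest sums first (at most 10 rows).
winners = (
    clean.loc[clean["REGIONAL_STATUS"] == "Победитель",
              ["SUM", "PERCENTAGE", "REGIONAL_STATUS"]]
         .sort_values(by="SUM", ascending=True)
         .head(10)
)

print(missing_pct)
print(winners)
```

Chaining the steps is a style choice, not something the original comment does; the step-by-step version gives the same result as long as the filtered frame is copied before columns are assigned to it.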