Missing Data
MLAI/Preprocessing 2020. 1. 18. 18:36
1. Overview
Before a machine learning model can run correctly, the data has to be prepared. The first problem to deal with is missing data: some entries in the data set have no value, and that happens quite often in real-world data.
2. Description
2.1 Handling Missing Data
2.1.1 Deletion
The simplest idea is to remove every row (observation) that contains a missing value. That can be quite dangerous, though: if the removed rows contain crucial information, deleting them throws that information away.
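As a rough illustration of deletion, here is a minimal sketch in plain Python. The toy data set, its column names, the use of `None` as the missing-value marker, and the helper `drop_incomplete` are all assumptions made for this example, not part of the original post.

```python
# Hypothetical toy data set; None marks a missing value (an assumption).
rows = [
    {"country": "France", "age": 44, "salary": 72000},
    {"country": "Spain", "age": 27, "salary": None},
    {"country": "Germany", "age": None, "salary": 54000},
    {"country": "Spain", "age": 38, "salary": 61000},
]


def drop_incomplete(rows):
    """Return only the rows in which no value is missing (None)."""
    return [row for row in rows if all(value is not None for value in row.values())]


complete_rows = drop_incomplete(rows)

# Two of the four observations are discarded, along with the values
# they did contain (one known age and one known salary).
print(len(rows), "->", len(complete_rows))
```

Half of this small data set is lost, including values that were actually observed, which is the danger described above.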
2.1.2 Take the mean
The most common idea for handling missing data is to take the mean of the column. For example, a missing value in the age column is replaced by the mean of all the observed values in that column. The same is done for every feature that contains missing data: each missing value is replaced by the mean of its own column.
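Mean replacement can be sketched as follows using only the Python standard library. The `age` values, the `None` missing-value marker, and the helper name `impute_mean` are assumptions for illustration.

```python
from statistics import mean

# Hypothetical toy data; None marks a missing value (an assumption).
rows = [
    {"country": "France", "age": 44},
    {"country": "Spain", "age": 27},
    {"country": "Germany", "age": None},
    {"country": "Spain", "age": 38},
]


def impute_mean(rows, column):
    """Return new rows where missing values in `column` are replaced by the
    mean of the observed values in that column. The input is not modified."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill_value = mean(observed)
    return [
        {**row, column: fill_value if row[column] is None else row[column]}
        for row in rows
    ]


filled_rows = impute_mean(rows, "age")

# The missing age becomes the mean of 44, 27 and 38.
print(filled_rows[2]["age"])
```

Unlike deletion, every observation is kept. To cover several features, the same function would be called once per column that contains missing values.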
2.1.3 Median
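The median strategy works the same way as the mean strategy, except that the fill value is the median of the observed values, which is less affected by extreme values. A minimal sketch, assuming a hypothetical `ages` column with one outlier and `None` as the missing-value marker:

```python
from statistics import mean, median

# Hypothetical column of ages with one outlier (90); None marks a missing value.
ages = [25, 30, None, 35, 90]


def impute_median(values):
    """Return a new list where None is replaced by the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill_value = median(observed)
    return [fill_value if v is None else v for v in values]


filled_ages = impute_median(ages)

# The median of 25, 30, 35, 90 is 32.5, while the mean is 45:
# the outlier pulls the mean upward but barely moves the median.
print(filled_ages)
print(mean(v for v in ages if v is not None))
```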
2.1.4 Most frequent
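For a categorical column, a mean or median is not defined, so the missing value can instead be replaced by the value that occurs most often in the column. A minimal sketch, assuming a hypothetical `countries` column and `None` as the missing-value marker:

```python
from collections import Counter

# Hypothetical categorical column; None marks a missing value.
countries = ["France", "Spain", None, "Spain", "Germany"]


def impute_most_frequent(values):
    """Return a new list where None is replaced by the most frequent observed value."""
    observed = [v for v in values if v is not None]
    # most_common(1) returns a list holding one (value, count) pair.
    fill_value = Counter(observed).most_common(1)[0][0]
    return [fill_value if v is None else v for v in values]


filled_countries = impute_most_frequent(countries)

# "Spain" appears twice among the observed values, so it fills the gap.
print(filled_countries)
```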
3. Reference