Missing Data

MLAI/Preprocessing 2020. 1. 18. 18:36

1. Overview

Before a machine learning model can run correctly, the data has to be prepared. The first problem to deal with is missing data: some observations in the data set lack a value for one or more features, which happens quite often in real-world data.

2. Description

2.1 Handling Missing Data

2.1.1 Deletion

The simplest approach is to remove every observation (row) that contains a missing value. This can be quite dangerous: if the data set contains crucial information, removing an observation throws that information away.
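
A minimal sketch of deletion, using only the Python standard library. The toy rows, the column names, and the helper `drop_incomplete` are hypothetical, and `None` is assumed to mark a missing value.

```python
# Hypothetical toy data set; None marks a missing value.
rows = [
    {"country": "France", "age": 44, "salary": 72000},
    {"country": "Spain", "age": None, "salary": 48000},
    {"country": "Germany", "age": 30, "salary": None},
    {"country": "Spain", "age": 38, "salary": 61000},
]


def drop_incomplete(rows):
    """Keep only the rows in which no field is missing."""
    return [row for row in rows if all(v is not None for v in row.values())]


complete_rows = drop_incomplete(rows)
print(len(rows), "->", len(complete_rows))
```

Half of the toy rows are discarded here, including the values they did contain, which illustrates why deletion can be costly.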

2.1.2 Take the mean

The most common way to handle missing data is to take the mean of the column. For example, a missing value in the age column is replaced by the mean of all the observed values in that column. The same applies to every feature that contains missing data: each missing value is replaced by the mean of the values in its own column.
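
Mean imputation for a single column might look like the sketch below. The `ages` values and the helper `impute_mean` are hypothetical, and `None` is assumed to mark a missing value.

```python
from statistics import mean

# Hypothetical "age" column; None marks a missing value.
ages = [44, 27, None, 38, 40, None, 35]


def impute_mean(values):
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


filled_ages = impute_mean(ages)
print(filled_ages)
```

The mean is computed from the observed values only, so the missing entries do not influence the fill value. For a data set with several features, the same function would be applied to each column separately.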

2.1.3 Median
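
Median imputation replaces each missing value with the median of the observed values in the same column. A minimal sketch, assuming `None` marks a missing value; the `ages` values and the helper `impute_median` are hypothetical.

```python
from statistics import median

# Hypothetical "age" column with one extreme value (95); None marks a missing value.
ages = [44, 27, None, 38, 40, None, 95]


def impute_median(values):
    """Replace each missing value with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


filled_ages = impute_median(ages)
print(filled_ages)
```

In this toy column the median of the observed values is 40, whereas the mean would be 48.8 because of the single extreme value, so the median can be the safer choice when a column contains outliers.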

2.1.4 Most frequent
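
Most-frequent imputation replaces each missing value with the value that occurs most often in the same column, which also works for categorical features where a mean or median is not defined. A minimal sketch, assuming `None` marks a missing value; the `countries` values and the helper `impute_most_frequent` are hypothetical.

```python
from collections import Counter

# Hypothetical categorical column; None marks a missing value.
countries = ["France", "Spain", None, "Spain", "Germany", None]


def impute_most_frequent(values):
    """Replace each missing value with the most common observed value."""
    observed = [v for v in values if v is not None]
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


filled_countries = impute_most_frequent(countries)
print(filled_countries)
```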

3. Reference
