MLAI
-
K-Nearest Neighbors (KNN)
MLAI/Classification 2020. 1. 20. 20:49
1. Overview
K nearest neighbors is a simple algorithm that stores all available cases and classifies new cases based on a similarity measure (e.g., distance functions).
2. Description
2.1 Procedure
Step 1: Choose the number K of neighbors
Step 2: Take the K nearest neighbors of the new data point, according to the Euclidean distance
Step 3: Among these K neighbors, count the number of data points..
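The procedure above can be sketched in a few lines of plain Python. This is only an illustration, not the post's own code: the excerpt is cut off at Step 3, so the sketch assumes the usual continuation (count the labels among the K neighbors and assign the most common one). The function name `knn_classify` and the sample points are hypothetical.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 2: Euclidean distance from the query to every stored point
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Step 3 (assumed continuation): count labels among the k nearest
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two well-separated groups
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
labels = ["A", "A", "A", "B", "B", "B"]

prediction = knn_classify(points, labels, (2.0, 2.0), k=3)
print(prediction)
```

An odd K is a common choice for two-class problems because it avoids tied votes.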
-
Polynomial Linear Regression
MLAI/Regression 2020. 1. 20. 19:43
1. Overview 2. Description Instead of simple linear regression, we are going to fit a polynomial regression, which in this case fits the data perfectly. Why is it still called linear regression if it is a polynomial regression? When we talk about linear versus nonlinear, we are not talking about the X variables. We are talking about the coefficients: the model is still a linear combination of its coefficients. 3. Reference
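To see why the model stays linear in its coefficients, here is a minimal sketch (not from the original post) that fits y = b0 + b1·x + b2·x² with ordinary least squares. The powers of x are treated as extra columns, and the coefficients are found by solving a linear system. The function name `fit_polynomial` and the sample data are hypothetical.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit of y = b0 + b1*x + ... + bd*x^d via the normal equations."""
    n = degree + 1
    # Design matrix: each row is [1, x, x^2, ...]; the powers are just new features
    X = [[x ** p for p in range(n)] for x in xs]
    # Normal equations A b = c with A = X^T X and c = X^T y (linear in b)
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    c = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for k in range(col, n):
                A[r][k] -= factor * A[col][k]
            c[r] -= factor * c[col]
    # Back substitution
    b = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * b[j] for j in range(i + 1, n))
        b[i] = (c[i] - s) / A[i][i]
    return b

# Hypothetical data generated from y = 1 + 2x + 3x^2
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]

coeffs = fit_polynomial(xs, ys, degree=2)
print(coeffs)  # approximately [1, 2, 3], up to floating-point error
```

The curve is nonlinear in x, but finding the coefficients needs nothing beyond solving linear equations, which is why the method keeps the name linear regression.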
-
Cluster Analysis
MLAI/Regression 2020. 1. 20. 14:12
1. Overview Technically speaking, cluster analysis is a multivariate statistical technique. Intuitively speaking, observations in a data set can be divided into different groups, and sometimes this is very useful. Both results are perfectly logical, but in different ways: in the first two cases we were differentiating the clusters by geographic proximity, while in the second by language. Geographic pr..
-
Logistic Regression Statistics
MLAI/Regression 2020. 1. 20. 12:54
1. Overview 1.1 Likelihood function It is a function that estimates how likely it is that the model at hand describes the real underlying relationship of the variables. The larger the value of the likelihood function, the more likely it is that our model is correct. MLE (maximum likelihood estimation) tries to maximize the likelihood function. The computer goes through different values until it finds a model for which the likeliho..
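As a rough illustration of comparing models by likelihood (a sketch, not the post's own code), the snippet below computes the log-likelihood of a one-variable logistic model for two candidate coefficient pairs. The data, the coefficient values, and the function name `log_likelihood` are all hypothetical.

```python
import math

def log_likelihood(b0, b1, xs, ys):
    """Log-likelihood of a logistic model p = 1 / (1 + exp(-(b0 + b1*x)))."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        # Each observation contributes log(p) if y == 1, log(1 - p) if y == 0
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

# Hypothetical data: the outcome switches from 0 to 1 as x grows
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 0, 1, 1, 1]

poor = log_likelihood(0.0, 0.0, xs, ys)     # predicts p = 0.5 everywhere
better = log_likelihood(-7.0, 2.0, xs, ys)  # tracks the switch around x = 3.5

print(poor, better)
```

The second candidate has the higher log-likelihood, so it would be preferred. An optimizer repeats this kind of comparison over many candidate values.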
-
Ordinary Least Squares Assumptions
MLAI/Regression 2020. 1. 20. 01:05
1. Overview If a regression assumption is violated, performing regression analysis will yield an incorrect result. A linear relationship is the simplest non-trivial relationship. 2. Linearity $$y =\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\cdots +\beta_{k}x_{k}+\varepsilon $$ How can you verify whether the relationship between two variables is linear? The easiest way is to choose an independent var..
-
Correlation vs Regression
MLAI/Regression 2020. 1. 19. 14:08
1. Overview Correlation does not imply causation. 2. Description First, correlation measures the degree of relationship between two variables. Regression analysis is about how one variable affects another, or what changes it causes in the other. Second, correlation doesn't capture causality but the degree of interrelation between the two variables. Regression is based on causality. It shows no..
-
Multiple Linear Regression
MLAI/Regression 2020. 1. 19. 00:01
1. Overview Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable. The goal of multiple linear regression (MLR) is to model the linear relationship between the explanatory (independent) variables and response (dependent) variable. 2. Description 2.1 Formula 2.1.1..
-
Feature Scaling
MLAI/Preprocessing 2020. 1. 18. 21:39
1. Issue Let's explain what feature scaling is and why we need to do it. As you can see, we have two columns, age and salary, that contain numerical values. Let's just focus on the age and the salary. Notice that the variables are not on the same scale: the age ranges from 27 to 50, while the salary ranges from 40K to about 90K. So because the age variable and the salary vari..
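The excerpt is cut off before it names a scaling method, so the sketch below simply shows the two most common options, standardization and min-max normalization, on made-up values that fall inside the ranges mentioned above. The helper names and the sample rows are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max_scale(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical rows within the ranges from the text (age 27-50, salary 40K-90K)
ages = [27, 30, 38, 44, 50]
salaries = [40000, 54000, 61000, 72000, 90000]

scaled_ages = min_max_scale(ages)
scaled_salaries = min_max_scale(salaries)
z_ages = standardize(ages)
z_salaries = standardize(salaries)

print(scaled_ages)
print(scaled_salaries)
```

After either transformation both columns span comparable ranges, so a distance-based method no longer lets salary dominate age purely because of its larger units.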