Total Least Squares in comparison with OLS and ODR

To be precise, if we have data collected in the past (the independent variable) and the corresponding outcomes (the dependent variable), we can build a model that predicts future outcomes from newly collected data. To make a better model, we use regression analysis to estimate better parameters, which for the linear model are a slope and a constant. For instance, take a look at the figure below. We seek the parameters of the red regression line; the blue points are the observations, and the length of each grey line is the residual of the estimator.

Figure: A visual comparison between OLS and TLS

In OLS, the grey lines are not orthogonal to the regression line; they are parallel to the y-axis. In TLS (and ODR), on the other hand, the grey lines are orthogonal to the regression line. This is the main difference between OLS and TLS.

The objective function (loss function) in OLS is the sum of squared vertical residuals, min_b ‖Xb − y‖², where X is the design matrix, y the observed outcomes, and b the parameter vector. It is solved by a quadratic minimization, and from it we obtain the parameter vector b, which is all we need. NumPy provides numpy.linalg.lstsq for this, though it is also easy to implement from scratch. We get the parameter vector b from the code and use it to estimate fitted values; when the constant column is appended as the last column of the design matrix, the constant c ends up at the last index of b, so we need to keep its position in mind. A sketch of this is given after the next paragraph.

Let's go back to TLS and consider why it can be preferred over OLS. OLS assumes that the independent variables are measured exactly, that is, observed without error. In real data, however, there are usually some observational errors, and in that case OLS can be an inconsistent estimator. TLS takes this problem into account: it allows for errors in both the independent and the dependent variables. A sketch of a TLS fit follows the OLS example below.
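As a concrete version of the lstsq discussion above, here is a minimal sketch of an OLS fit; the toy data, variable names, and the from-scratch normal-equations line are my own assumptions for illustration, not code from the original post. The constant column is appended last, so the constant ends up at the last index of b as noted above.

```python
import numpy as np

# Toy data (assumed for illustration): y is roughly 2*x + 1 plus noise
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.shape)

# Design matrix with the constant column appended last,
# so the parameter vector comes out as b = [slope, constant]
A = np.column_stack([x, np.ones_like(x)])

# Quadratic minimization of ||A @ b - y||^2 via NumPy
b, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
slope, constant = b                      # constant c is at the last index

# The same solution from scratch, using the normal equations
b_scratch = np.linalg.solve(A.T @ A, A.T @ y)

# Fitted values from the estimated parameters
y_fit = A @ b
print(f"slope={slope:.3f}, constant={constant:.3f}")
```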
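And since the section closes with why TLS tolerates errors in both variables, here is a minimal sketch of a TLS fit for the same linear model, using the SVD of the centred data (the smallest right singular vector gives the normal of the best-fit line). The function name and toy data are assumptions for illustration, not code from the original post.

```python
import numpy as np

def tls_line(x, y):
    """Fit y = slope*x + constant by total least squares,
    minimizing orthogonal distances from the points to the line."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # The TLS line passes through the centroid of the data
    xm, ym = x.mean(), y.mean()
    D = np.column_stack([x - xm, y - ym])
    # The last right singular vector is the direction of smallest
    # variance, i.e. the normal vector of the best-fit line
    _, _, Vt = np.linalg.svd(D, full_matrices=False)
    nx, ny = Vt[-1]
    slope = -nx / ny             # assumes the fit is not a vertical line
    constant = ym - slope * xm
    return slope, constant

# Example usage with noise in both x and y (assumed toy data)
rng = np.random.default_rng(1)
x_true = np.linspace(0.0, 10.0, 50)
x_obs = x_true + rng.normal(scale=0.3, size=x_true.shape)
y_obs = 2.0 * x_true + 1.0 + rng.normal(scale=0.3, size=x_true.shape)
print(tls_line(x_obs, y_obs))
```

For ODR specifically, SciPy ships a ready-made implementation in the scipy.odr module, which also handles nonlinear models and per-point weights.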
