recoding data for regression
Message posted by boulton (via 142.106.180.134) on December 7, 2001 at 11:55 AM (ET)
Thanks for the previous suggestions...
What would be the effect of running a regression after recoding the largest values of the predictors into a single value (so anything over 1000 becomes 1001)? I basically end up with a predictor that is continuous up to 1000 plus one categorical value (1001). How does that influence the final significance?
READERS RESPOND:
(In chronological order. Most recent at the bottom.)
Re: recoding data for regression
Message posted by Tomi (via 154.32.143.18) on December 8, 2001 at 4:12 AM (ET)
Sounds like a horrible thing to do... why are you doing this? If you are doing it because the large values make things awkward to plot, for example, then you could rank the data and use Spearman's rank correlation...
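To illustrate the ranking suggestion, here is a minimal sketch that computes Spearman's rank correlation as the ordinary (Pearson) correlation of the ranks. The helper names `average_ranks` and `spearman` and the five data points are made up for illustration; they are not from the thread. Only NumPy is assumed.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Replace the ranks inside each group of ties with the group mean.
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return (rank_sums / counts)[inverse]

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

# Hypothetical data with one very large predictor value.
x = np.array([1.0, 2.0, 3.0, 4.0, 5000.0])
y = np.array([2.0, 4.0, 5.0, 9.0, 10.0])

pearson = float(np.corrcoef(x, y)[0, 1])
rank_corr = spearman(x, y)
print("Pearson: ", round(pearson, 3))
print("Spearman:", round(rank_corr, 3))
```

Because only the ordering of the values matters, the single huge value pulls the Pearson correlation down but does not affect the rank correlation in this toy example.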
Re: recoding data for regression
Message posted by boulton (via 142.106.180.134) on December 10, 2001 at 2:34 PM (ET)
I recoded prior to the regressions because the accuracy of the 1000+ data could not be verified... so exactly how is that so horrible?
Re: recoding data for regression
Message posted by Phil (via 66.32.21.189) on December 11, 2001 at 2:54 AM (ET)
If you code the data the way I think you are suggesting, it could drastically change the results (the regression coefficients and the correlation coefficient), and the nice interpretations and properties of the least-squares fit will no longer apply. If you have data that are unverifiable and outside the region of interest, you could consider leaving them out.
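The point about changed coefficients can be checked on simulated data. The sketch below is only an illustration under stated assumptions: the true relationship is taken to be linear (y = 2 + 0.5x plus noise), x is uniform on 0 to 2000, and the function name `compare_slopes` is hypothetical. It fits a straight line three ways: on the full data, after recoding every x over 1000 to 1001, and after leaving those rows out.

```python
import numpy as np

def compare_slopes(seed=0, n=500):
    """Fit y on x by least squares under three treatments of x > 1000."""
    rng = np.random.default_rng(seed)
    # Assumed data-generating process (not from the thread).
    x = rng.uniform(0.0, 2000.0, size=n)
    y = 2.0 + 0.5 * x + rng.normal(0.0, 20.0, size=n)

    # Treatment 1: use the data as they are.
    slope_full = np.polyfit(x, y, 1)[0]

    # Treatment 2: recode everything over 1000 to the single value 1001.
    x_recoded = np.where(x > 1000.0, 1001.0, x)
    slope_recoded = np.polyfit(x_recoded, y, 1)[0]

    # Treatment 3: leave out the rows with x over 1000.
    keep = x <= 1000.0
    slope_dropped = np.polyfit(x[keep], y[keep], 1)[0]

    return {
        "full": float(slope_full),
        "recoded": float(slope_recoded),
        "dropped": float(slope_dropped),
    }

for name, slope in compare_slopes().items():
    print(name, round(slope, 3))
```

In this setup the recoded fit gives a noticeably steeper slope than the assumed 0.5, because all the large responses are stacked at x = 1001, while dropping the rows over 1000 recovers a slope close to 0.5. With real data the size and direction of the distortion will depend on how the large values behave.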