using predict in R - ProgrammingR

Tagged: predict, R

This topic has 1 reply, 2 voices, and was last updated 10 years, 6 months ago by Johnvereen.

Viewing 1 post (of 1 total)

Author
Posts
April 19, 2013 at 3:03 am #908
ArchieIndian
Member
I am reading through predict() in R and am confused:
There is a dataset, Spam, from which we have created training data and test data using random sampling. We have used trainSpam (the training data set) to train the model. We want to see how good the model is by testing it on the test data set (testSpam).
predictionModel = glm(numType ~ charDollar, family = "binomial", data = trainSpam)
predictionTest = predict(predictionModel, testSpam)
predictedSpam = rep("nonspam", dim(testSpam)[1])
predictedSpam[predictionModel$fitted > 0.5] = "spam"  # Here is my problem
table(predictedSpam, testSpam$type)
In the line where we say:
predictedSpam[predictionModel$fitted >0.5]="spam"
How does predictionModel$fitted predict spam in the test data? It seems to use the fitted values from the training data, yet we then go on to compare the result with the spam labels of the test data. Can someone explain?
Here is what I understood. In the line:
predictionModel = glm(numType ~ charDollar, family = "binomial", data = trainSpam)
We create a model using the trainSpam data.
In the next line:
predictionTest = predict(predictionModel, testSpam)
We create predictionTest using the same model but the test data.
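To make the difference between the two objects concrete, here is a minimal sketch. It assumes a small synthetic data frame (spamLike, with made-up values) in place of the real spam data, and a 120/80 split chosen only for illustration. As far as I can tell, predictionModel$fitted is just a partial match for the fitted.values component, which holds one probability per training row, while predict() with new data returns one value per test row, on the link (log-odds) scale unless type = "response" is given.

```r
# Synthetic stand-in for the spam data (hypothetical values, not the real dataset)
set.seed(1)
n <- 200
charDollar <- rexp(n, rate = 5)
numType <- rbinom(n, 1, plogis(-1 + 4 * charDollar))
spamLike <- data.frame(numType = numType, charDollar = charDollar)

# Random split into training and test rows (sizes are an assumption)
trainIdx <- sample(n, 120)
trainSpam <- spamLike[trainIdx, ]
testSpam <- spamLike[-trainIdx, ]

predictionModel <- glm(numType ~ charDollar, family = "binomial", data = trainSpam)

# Fitted values: one per TRAINING row, already on the probability scale
length(predictionModel$fitted.values)  # 120

# predict() on new data: one per TEST row, on the log-odds scale by default
predictionTest <- predict(predictionModel, newdata = testSpam)
length(predictionTest)  # 80

# type = "response" returns probabilities, i.e. the inverse logit of the link values
probTest <- predict(predictionModel, newdata = testSpam, type = "response")
all.equal(unname(probTest), unname(plogis(predictionTest)))
```

So the two vectors do not even have to be the same length, and they describe different rows.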
In the next line:
predictedSpam = rep("nonspam", dim(testSpam)[1])
We create a vector, with one element per row of testSpam, in which every value is "nonspam".
In the next line:
predictedSpam[predictionModel$fitted > 0.5] = "spam"
We are using predictionModel$fitted, which was fitted on the training data, to decide which rows are to be classified as spam. Shouldn't we instead use something like predictionTest to identify spam in the test data?
This is where I am reading from: https://github.com/jtleek/dataanalysis/blob/master/week2/002structureOfADataAnalysis2/structureOfADataAnalysis2.pdf
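For comparison, this is roughly what I would have expected the classification step to look like. It is only a sketch: the data frame spamLike is synthetic, the 200/100 split is arbitrary, and probTest is a name I made up for the test-set probabilities. The idea is to threshold the predictions for testSpam (requested on the probability scale with type = "response") rather than the fitted values from trainSpam.

```r
# Synthetic stand-in for the spam data (hypothetical values, not the real dataset)
set.seed(42)
n <- 300
spamLike <- data.frame(charDollar = rexp(n, rate = 5))
spamLike$numType <- rbinom(n, 1, plogis(-1 + 4 * spamLike$charDollar))
spamLike$type <- ifelse(spamLike$numType == 1, "spam", "nonspam")

# First 200 rows for training, remaining 100 for testing (an arbitrary split)
inTrain <- seq_len(n) <= 200
trainSpam <- spamLike[inTrain, ]
testSpam <- spamLike[!inTrain, ]

predictionModel <- glm(numType ~ charDollar, family = "binomial", data = trainSpam)

# Probabilities for the TEST rows, so that a 0.5 cutoff is meaningful
probTest <- predict(predictionModel, newdata = testSpam, type = "response")

# Start with everything as "nonspam", then flag test rows above the cutoff
predictedSpam <- rep("nonspam", nrow(testSpam))
predictedSpam[probTest > 0.5] <- "spam"

# Confusion table and error rate, both computed on the test data only
table(predictedSpam, testSpam$type)
mean(predictedSpam != testSpam$type)
```

Here the logical index probTest > 0.5 has exactly one entry per row of testSpam, so each prediction lines up with the label it is compared against in table().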