Normal quantile plot

  • #348
    gshaase
    Member

    I am trying to reproduce a very common way of plotting data to look for outliers in an otherwise normally distributed data set. It is a standard feature in Minitab, JMP, and other statistical software:

    I wish to plot data on a single normal quantile plot, such that the data would lie on a straight line if it is normally distributed. The measured values would be on a linear scale on the X axis, while the Y axis would show the cumulative probability percentile (say, 0.1% to 99.9%) on a NON-linear scale, such that the points fall on a straight line (rather than having an error-function shape).

    Any idea how to obtain such a Y-axis?

    Also, I wish to plot several data sets on the same quantile plot. qqnorm will not allow me to plot a few sets in different colors/shapes.

    I will appreciate any advice,
    Gaddi

    #369
    JWaddle
    Member

    Hi Gaddi,

    For changing the axes, use xaxt = "n" or yaxt = "n" in the initial qqnorm() call, then construct the axis manually with axis().
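As a minimal sketch of that approach (not necessarily what the poster had in mind), the tick positions can be computed with qnorm() from the probabilities you want to show. The variable names and the chosen probabilities below are illustrative assumptions. Note that the probability labels belong on the theoretical-quantile axis, which is the X axis in a default qqnorm() call; putting them on the sample axis is only valid when the data are standard normal.

```r
set.seed(42)
x <- rnorm(100, mean = 10, sd = 2)   # illustrative data, not from the thread

# Probabilities to show on the axis (an arbitrary choice for this sketch)
probs <- c(0.001, 0.01, 0.1, 0.5, 0.9, 0.99, 0.999)
at    <- qnorm(probs)                # tick positions on the z scale
labs  <- paste0(100 * probs, "%")    # tick labels in percent

# Suppress the default theoretical axis, then draw a probability-labelled one
qqnorm(x, xaxt = "n", xlim = range(at), xlab = "Cumulative probability")
axis(1, at = at, labels = labs)
qqline(x)
```

Because the ticks sit at qnorm(probs), they are unevenly spaced in probability but evenly meaningful on the z scale, which is what makes normal data fall on a straight line.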

    I created a quick function addQQpoints() for plotting additional sets of data to an existing plot (this function is just a slight altering of the qqnorm() function).

    Lastly, for the colors, I used a color palette I created that allows for easy semi-transparent colors, so that you can see the data when there are overlaps. This is of course optional.

    Let me know if this is what you are looking for:

    #install.packages("oaColors", repos = "http://repos.openanalytics.eu", type = "source")
    library(oaColors)  # provides oaColors(); any other color specification works too

    addQQpoints <- function (y, ylim, main = "Normal Q-Q Plot", xlab = "Theoretical Quantiles",
                             ylab = "Sample Quantiles", plot.it = TRUE, datax = FALSE, ...) {
      # Drop NAs but remember their positions so the returned vectors line up with the input
      if (has.na <- any(ina <- is.na(y))) {
        yN <- y
        y <- y[!ina]
      }
      if (0 == (n <- length(y)))
        stop("y is empty or has only NAs")
      if (plot.it && missing(ylim))
        ylim <- range(y)
      # Theoretical normal quantiles, arranged in the order of the data
      x <- qnorm(ppoints(n))[order(order(y))]
      if (has.na) {
        y <- x
        x <- yN
        x[!ina] <- y
        y <- yN
      }
      # Add to the existing plot; points() is used in both branches so that
      # datax = TRUE does not start a new plot
      if (plot.it)
        if (datax) points(y, x, ...) else points(x, y, ...)
      invisible(if (datax) list(x = y, y = x) else list(x = x, y = y))
    }

    pdf("exampleQQ.pdf", width = 8, height = 8)
    x1 <- rnorm(100)
    x2 <- rnorm(100)
    x3 <- rnorm(100)
    qqnorm(x1, pch = 19, col = oaColors("orange", alpha = 0.7), yaxt = "n",
           ylim = c(-3, 3), ylab = "Sample Percentiles")
    # Percent labels at z = -3, ..., 3 (valid here because the samples are drawn from N(0, 1))
    axis(2, at = seq(-3, 3, by = 1),
         labels = c("0.13%", "2.3%", "16%", "50%", "84%", "97.7%", "99.87%"))
    qqline(x1)
    addQQpoints(x2, pch = 19, col = oaColors("blue", alpha = 0.7))
    addQQpoints(x3, pch = 19, col = oaColors("green", alpha = 0.7))
    dev.off()

    Example graph (http://imgur.com/xanls, apologies about the poor image quality)

    #373
    gshaase
    Member

    JWaddle, thank you very much once again for posting a solution, and I apologize for taking so long to get back to this topic.

    The sequence of R commands that you gave works very well.

    However, what I (and many others in the science, engineering, and data-collection communities) require is to have the actual measured values (x1, x2, x3 in your example) presented on a linear scale on the X axis (not the quantiles).

    When I add datax = TRUE to the first qqnorm() command, and/or to the call or the body of the function addQQpoints(), everything goes bad… I played with this for a long time and gave up…

    Can you please help?
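One possible way to get the layout described above, offered as a sketch under stated assumptions rather than a definitive answer: compute the plotting coordinates directly instead of going through qqnorm(), draw an empty plot with the measured values on a linear X axis, and label the Y axis with probabilities placed at qnorm(p). The helper qq_coords(), the simulated data sets, and the color choices are hypothetical and not from the thread; only base R functions are used.

```r
# Hypothetical helper: sorted data paired with their theoretical normal quantiles
qq_coords <- function(y) {
  y <- sort(y[!is.na(y)])
  list(x = y, z = qnorm(ppoints(length(y))))
}

set.seed(1)
# Illustrative data sets (assumed, not from the thread)
sets <- list(x1 = rnorm(100),
             x2 = rnorm(100, mean = 0.5),
             x3 = rnorm(100, sd = 1.5))
cols <- c("orange", "blue", "darkgreen")

# Probabilities to label on the non-linear Y axis
probs <- c(0.001, 0.01, 0.05, 0.25, 0.5, 0.75, 0.95, 0.99, 0.999)

# Empty frame: measured values on a linear X axis, z scale on Y (axis suppressed)
plot(NA, xlim = range(unlist(sets)), ylim = qnorm(c(0.001, 0.999)),
     yaxt = "n", xlab = "Measured value", ylab = "Cumulative probability",
     main = "Normal probability plot")
axis(2, at = qnorm(probs), labels = paste0(100 * probs, "%"), las = 1)

# One set of points and one reference line per data set
for (i in seq_along(sets)) {
  qq <- qq_coords(sets[[i]])
  points(qq$x, qq$z, pch = 19, col = cols[i])
  qqline(sets[[i]], datax = TRUE, col = cols[i])  # line through the quartiles
}
legend("topleft", legend = names(sets), pch = 19, col = cols, bty = "n")
```

Each normally distributed set should fall close to its own straight line; a shift in mean moves the line sideways and a larger standard deviation makes it less steep.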
