Expression data from TCGA: graphing expression data

  • #338
    baumeist
    Member

    Hi,

    I am new to R.
    I am trying to figure out how to graph expression data from the TCGA database.
    If I understand correctly the expression data I have downloaded is from a microarray using the AgilentG4502A.

    I’ve had trouble reading the level I, level II, and gene expression analysis data into R using

    > dat<-read.table("C:\\file.txt", header=T, row.names=1)

    for example:

    > dat1<-read.table("C:\\US82800149_251976011000_S01_GE2_105_Dec08.txt", header=T, row.names=1)
    > dat<-read.table("C:\\unc.edu__AgilentG4502A_07_3__TCGA-A6-2674-01A-02R-0821-07__gene_expression_analysis.txt", header=T, row.names=1)

    In all cases I get the error "more columns than column names".

    I have only been able to read in the level II data with the code:

    > dat2<-read.table("C:\\US82800149_251976011000_S01_GE2_105_Dec08.txt_lmean.out.logratio.probe.tcga_level2.data.txt",header = TRUE, as.is = TRUE, sep="\t")

    So this is what I am working with. I can see that the dimensions of this data are

    > dim(dat2)
    [1] 90798 2

    I assume that this is one patient with expression (intensity) data for a large number of genes, but I don’t know.
    When I print “dat2” to the screen, it looks like this:

    49995 A_23_P67323 -0.427
    49996 A_23_P67330 -0.3275
    49997 A_23_P67332 -0.409
    49998 A_23_P67339 3.2955
    49999 A_23_P67355 1.205

    If I try to plot the data with the following below

    > names(dat2)
    [1] "Hybridization.REF" "TCGA.A6.2674.01A.02R.0821.07"

    > x<-c("Hybridization.REF")
    > y<-("TCGA.A6.2674.01A.02R.0821.07")
    > plot(x,y,type='p',xlab='Hybridization.REF',ylab='TCGA.A6.2674.01A.02R.0821.07',main='plot')

    I get the error:

    Error in plot.window(...) : need finite ‘xlim’ values
    In addition: Warning messages:
    1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
    2: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
    3: In min(x) : no non-missing arguments to min; returning Inf
    4: In max(x) : no non-missing arguments to max; returning -Inf
    5: In min(x) : no non-missing arguments to min; returning Inf
    6: In max(x) : no non-missing arguments to max; returning -Inf
    >

    I am really not sure how to plot this data, partly because I’m not sure what the level II data represents.

    Can anyone tell me what the level II data represents and what type of plotting functions I might use?

    Thanks in advance,
    MAB
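
On the plotting error, one likely cause is that `x` and `y` in the post above hold the column names as character strings rather than the columns themselves, so `plot()` tries to coerce the strings to numbers and ends up with only NAs. The following is a minimal sketch under that assumption. It uses a small made-up data frame with the same two column names in place of the real `dat2`, and the "Log ratio" axis label is only a guess based on the file name. Because the first column holds probe IDs rather than numbers, a histogram of the values or a plot of the values against their row index may be a more useful first look than a scatter plot of one column against the other.

```r
# Made-up stand-in for dat2; the five rows are copied from the printout above.
dat2 <- data.frame(
  Hybridization.REF = c("A_23_P67323", "A_23_P67330", "A_23_P67332",
                        "A_23_P67339", "A_23_P67355"),
  TCGA.A6.2674.01A.02R.0821.07 = c(-0.427, -0.3275, -0.409, 3.2955, 1.205),
  stringsAsFactors = FALSE
)

# Pull the column out of the data frame instead of assigning its name.
# as.numeric() is a precaution in case the real column was read as character.
y <- as.numeric(dat2[["TCGA.A6.2674.01A.02R.0821.07"]])

# Draw to a temporary PDF so the sketch also runs without a display.
pdf(tempfile(fileext = ".pdf"))
hist(y, xlab = "Log ratio", main = "TCGA.A6.2674.01A.02R.0821.07")
plot(y, type = "p", xlab = "Probe index", ylab = "Log ratio", main = "plot")
invisible(dev.off())
```

With the real data, dropping the `pdf()` and `dev.off()` lines sends the plots to the screen instead.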

    #363
    baumeist
    Member

    Thanks a lot.
    I’ll check it out.
    Mark

    #361
    bryan
    Participant

    I’m not familiar with the TCGA database, but there is an extensive library on it here that may help:

    https://wiki.nci.nih.gov/display/TCGA/Sample+and+Data+Relationship+Format

    In the import that *is* working, I see that you’ve specified a tab delimiter, whereas in the ones that aren’t working you haven’t specified one. If you don’t give read.table a delimiter, it uses the default of “any white space”, which may not be appropriate. I would confirm which delimiter the files need and specify it directly.
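
One way to confirm the delimiter is sketched below, assuming the other files are tab-delimited like the level II file that did load. The file here is a small made-up stand-in written to a temporary path, not real TCGA data, and its column names are hypothetical. `count.fields()` reports how many fields each line splits into; if the counts differ from line to line under the default whitespace splitting but agree under `sep = "\t"`, the delimiter is the likely culprit. Real files may have other quirks, such as extra header lines, that this sketch does not cover.

```r
# Made-up stand-in for a tab-delimited export. One description contains
# spaces, which is what confuses the default whitespace splitting.
path <- tempfile(fileext = ".txt")
writeLines(c("Probe\tDescription\tLogRatio",
             "A_23_P67323\ttumor protein p53\t-0.427",
             "A_23_P67330\tkinase\t-0.3275"),
           path)

# Fields per line under the default (any whitespace) and under tabs.
n_default <- count.fields(path)
n_tab     <- count.fields(path, sep = "\t", quote = "")
print(n_default)
print(n_tab)

# With sep = "\t" every line has the same number of fields, so the read
# lines up with the header.
dat <- read.table(path, header = TRUE, sep = "\t", quote = "",
                  as.is = TRUE, row.names = 1)
str(dat)
```

Looking at the first few raw lines with `readLines(path, n = 5)` is another quick check before choosing `sep`.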
