Expression data from TCGA: graphing expression data
September 15, 2011 at 4:39 pm #338
baumeist
Member
Hi,
I am new to R.
I am trying to figure out how to graph expression data from the TCGA database.
If I understand correctly, the expression data I have downloaded is from a microarray using the AgilentG4502A.
I’ve had trouble reading the level I, level II, and gene expression analysis data into R using
> dat<-read.table("C:\\file.txt", header=T, row.names=1)
for example:
> dat1<-read.table("C:\\US82800149_251976011000_S01_GE2_105_Dec08.txt", header=T, row.names=1)
> dat<-read.table("C:\\unc.edu__AgilentG4502A_07_3__TCGA-A6-2674-01A-02R-0821-07__gene_expression_analysis.txt", header=T, row.names=1)
In all cases I get the error "more columns than column names".
I have only been able to read in the level II data with the code:
> dat2<-read.table("C:\\US82800149_251976011000_S01_GE2_105_Dec08.txt_lmean.out.logratio.probe.tcga_level2.data.txt",header = TRUE, as.is = TRUE, sep="\t")
So this is what I am working with. I can see that the dimensions of this data are
> dim(dat2)
[1] 90798 2
When I print “dat2” to the screen it looks like this:
49995 A_23_P67323 -0.427
49996 A_23_P67330 -0.3275
49997 A_23_P67332 -0.409
49998 A_23_P67339 3.2955
49999 A_23_P67355 1.205
I assume that this is one patient with expression (intensity) data for a large number of genes, but I don’t know.
If I try to plot the data with the following:
> names(dat2)
[1] "Hybridization.REF" "TCGA.A6.2674.01A.02R.0821.07"
> x<-c("Hybridization.REF")
> y<-("TCGA.A6.2674.01A.02R.0821.07")
> plot(x,y,type='p',xlab='Hybridization.REF',ylab='TCGA.A6.2674.01A.02R.0821.07',main='plot')
I get the error:
Error in plot.window(…) : need finite ‘xlim’ values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
5: In min(x) : no non-missing arguments to min; returning Inf
6: In max(x) : no non-missing arguments to max; returning -Inf
>
I am really not sure how to plot this data, partly because I’m not sure what the level II data represents.
Can anyone tell me what the level II data represents and what type of plotting functions I might use?
Thanks in advance,
MAB

September 15, 2011 at 5:15 pm #363
baumeist
Member
Thanks a lot.
I’ll check it out.
Mark

October 7, 2011 at 5:14 pm #361
bryan
Participant
I’m not familiar with the TCGA database, but there is an extensive library on it here that may help:
https://wiki.nci.nih.gov/display/TCGA/Sample+and+Data+Relationship+Format
In the import that *is* working, I see that you’ve specified a tab delimiter, whereas in the ones that aren’t working you haven’t identified a delimiter. If you don’t identify a delimiter for read.table, it will use the default of “any white space”, which may not be appropriate. I would confirm what delimiter is needed and specify it directly.
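As a rough sketch of the delimiter check suggested above, the snippet below builds a small tab-delimited file, counts fields per line with an explicit separator, and then reads it with `sep="\t"`. The temporary file is a made-up stand-in: it assumes a single header line and reuses the five probe rows printed in the question, so the real TCGA files may well be laid out differently. The last few lines also show one possible way around the `plot()` error in the question, on the assumption that it arises because `x` and `y` hold the column names as strings rather than the columns themselves.

```r
# Hypothetical stand-in for a level II file: tab-delimited, one header line,
# probe IDs in column 1 and one sample's values in column 2.
# The five rows are the ones printed in the question; the file itself is made up.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Hybridization REF\tTCGA-A6-2674-01A-02R-0821-07",
  "A_23_P67323\t-0.427",
  "A_23_P67330\t-0.3275",
  "A_23_P67332\t-0.409",
  "A_23_P67339\t3.2955",
  "A_23_P67355\t1.205"
), tmp)

# Count the fields on each line using an explicit tab delimiter.
# If every line reports the same count, tab is probably the right separator.
n_tab <- count.fields(tmp, sep = "\t")
print(table(n_tab))

# Read the file with the delimiter stated explicitly.
dat2 <- read.table(tmp, header = TRUE, sep = "\t", as.is = TRUE)
str(dat2)

# Pull the columns out of the data frame by position,
# rather than assigning the column names as character strings.
probe <- dat2[[1]]              # character probe IDs
ratio <- as.numeric(dat2[[2]])  # values for the single sample

# Plot the numeric values against their row index, then their distribution.
plot(ratio, type = "p", xlab = "Probe index", ylab = names(dat2)[2], main = "plot")
hist(ratio, xlab = names(dat2)[2], main = "Distribution of values")
```

Running `count.fields()` on the files that fail to import, once with `sep="\t"` and once without `sep`, may show which delimiter gives a consistent number of fields per line. Whether an index plot or a histogram is the right view depends on what the level II values actually represent, which the TCGA documentation linked above should clarify.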