
We now discuss how you can create tables from your data and calculate relative frequencies. The simple "table" command in R can be used to create one-, two- and multi-way tables from categorical data. For the next few examples we will be using the dataset with the following columns:

Ozone Solar.R Wind Temp Month Day goodtemp badozone

The columns goodtemp and badozone represent days when temperatures were greater than or equal to 80 (good) or not (low), and when the ozone was greater than or equal to 60 (high) or not (low), respectively. Read in the file and print out rows 50 to 60 of the new data set airquality.new.

Now, let us construct some simple tables. Make sure you first detach the air quality data set and attach airquality.new.

We can also construct tables with more than two sides in R. For example, what do you see when you do the following? As you add dimensions, you get more of these two-sided subtables and it becomes rather easy to lose track. This is where the ftable command is useful. This function creates "flat" tables, e.g., like this:
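As a minimal sketch of the table commands discussed here, the following rebuilds the two derived columns from R's built-in airquality data. The file for airquality.new is not reproduced here, so the data frame name `aq` and the way the two columns are derived are assumptions based on the column descriptions in the text, not the tutorial's own file.

```r
# Assumption: rebuild goodtemp and badozone from the built-in airquality data,
# following the thresholds described in the text (Temp >= 80, Ozone >= 60).
aq <- airquality
aq$goodtemp <- ifelse(aq$Temp >= 80, "good", "low")
aq$badozone <- ifelse(aq$Ozone >= 60, "high", "low")  # NA where Ozone is missing

# One-way table: number of days in each temperature category
tab1 <- table(aq$goodtemp)
print(tab1)

# Two-way table: temperature category by ozone category
# (days with missing Ozone are dropped by default)
tab2 <- table(aq$goodtemp, aq$badozone)
print(tab2)

# Relative frequencies: overall, and within each row
print(prop.table(tab2))
print(prop.table(tab2, margin = 1))

# Three-way table, then the same counts laid out as one flat table
tab3 <- table(aq$goodtemp, aq$badozone, aq$Month)
print(ftable(tab3))
```

With `margin = 1` each row of the two-way table sums to 1; using `margin = 2` instead would give proportions within each column.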

When dealing with grouped data, you will often want to have various summary statistics computed within groups; for example, a table of means and standard deviations. This can be done using the tapply() command. For example, we might want to compute the mean temperatures in each month. If there are any missing values, these can be excluded by simply adding an extra argument na.rm=T to tapply.

Compute the range and mean of Ozone levels for each month, using the tapply command.

With grouped data, it is important to be able not only to create plots for each group but also to compare the plots between groups. We have already looked at examples with histograms and boxplots. As a further example, let us consider another data set esoph in R, relating to a case-control study of esophageal cancer in France, containing records for 88 age/alcohol/tobacco combinations. The first 5 rows of the data are shown below: (Look up the R help on this data set to find out more about the variables.)

We can draw a boxplot of the number of cancer cases according to each level of alcohol consumption (alcgp). Notation of the type y ~ x can be read as "y described using x".
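The grouped summaries and the grouped boxplot described here might look like the sketch below. The commands from the original page are not reproduced in the text, so these calls are one plausible form rather than the tutorial's exact code. The variable names `temp_mean`, `solar_mean`, `solar_sd` and `bp` are illustrative, and Solar.R is used for the missing-value case so that the Ozone exercise is left to the reader.

```r
# Mean temperature within each month (Temp has no missing values)
temp_mean <- tapply(airquality$Temp, airquality$Month, mean)
print(temp_mean)

# Solar.R contains missing values, so pass na.rm = TRUE through to mean() and sd()
solar_mean <- tapply(airquality$Solar.R, airquality$Month, mean, na.rm = TRUE)
solar_sd   <- tapply(airquality$Solar.R, airquality$Month, sd, na.rm = TRUE)
print(cbind(mean = solar_mean, sd = solar_sd))

# Quick look at the esoph data
print(head(esoph, 5))

# Boxplot of cancer cases (ncases) described using alcohol group (alcgp)
bp <- boxplot(ncases ~ alcgp, data = esoph,
              xlab = "Alcohol consumption group",
              ylab = "Number of cancer cases")
```

Writing `na.rm = TRUE` in full is a little safer than `na.rm = T`, because `T` is an ordinary variable that can be reassigned.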
The R command for drawing a scatterplot of two variables is a simple command of the form "plot(x,y)."

> plot(airquality$Temp, airquality$Ozone) # How do Ozone and temperature measurements relate?

With more than two variables, the pairs() command draws a scatterplot matrix. Write the following command in R and describe what you see in terms of relationships between the variables.

The default plotting symbols in R are not always pretty! You can actually change the plotting symbols or colors to something nicer.

> plot(airquality$Temp, airquality$Ozone, col="red", pch=19)

This repeats the scatterplot, this time with red filled circles that are nicer to look at. "col" refers to the color of the symbols plotted. The full range of point plotting symbols used in R is given by "pch" values in the range 0 to 25; see the help on "points" to see what each of these represents.
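As a rough illustration of the scatterplot matrix and the plotting-symbol options, the sketch below uses the built-in airquality data. The choice of the four columns passed to pairs() is an assumption, since the original command is not shown in the text. The name `vars` and the axis labels are likewise illustrative.

```r
# Assumption: use the four measurement columns for the scatterplot matrix
vars <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]
pairs(vars)

# The same Temp vs Ozone scatterplot with a different symbol and color,
# plus axis labels
plot(airquality$Temp, airquality$Ozone,
     col = "blue", pch = 17,
     xlab = "Temperature", ylab = "Ozone")

# Draw each plotting symbol once, at its own pch number on the x axis
plot(0:25, rep(1, 26), pch = 0:25,
     xlab = "pch value", ylab = "", yaxt = "n")
```

Rows with missing Ozone or Solar.R values are simply left out of the affected panels, so no extra handling is needed for the plots themselves.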

One very commonly used tool in exploratory analysis of multivariate data is the scatterplot. We will look at this in more detail later when we discuss regression and correlation.
