COVID-19 data

JHU exercise data

Disclaimer: This exercise is meant to help people understand data slicing operations in R. I will not be providing solutions to JHU assignments. However, working through these exercises should bring you closer to finding them.

The data is provided as a URL pointing to a zip file. There is more than one way to extract data from a zip file.

Data extraction - Method 1

  1. Download zip file into working directory.
  2. Decompress the zip file into working directory.
  3. Check data file format.
  4. Load decompressed file into a dataframe.

Disadvantage: The working directory ends up with files we no longer need once the data is loaded: the zip file and the extracted data file (inside the "data" directory).

In [1]:
# Download the dataset from the URL to the working directory and name it "data.zip"
download.file("https://d396qusza40orc.cloudfront.net/rprog/data/quiz1_data.zip", "data.zip")
# Unzip the zip file into a directory called "data"
unzip("data.zip", exdir = "data")
# List the files in the "data" directory to check the data file format
list.files("data")
'hw1_data.csv'

The data is available in a CSV (comma-separated values) file. We will use read.csv to read the data into a dataframe.

In [2]:
# Read the data into a dataframe called "df_data1"
df_data1 <- read.csv("data/hw1_data.csv")
# View sample data.
head(df_data1)
Ozone Solar.R Wind Temp Month Day
41 190 7.4 67 5 1
36 118 8.0 72 5 2
12 149 12.6 74 5 3
18 313 11.5 62 5 4
NA NA 14.3 56 5 5
28 NA 14.9 66 5 6

Data extraction - Method 2

Earlier, we ended up with two unnecessary files and a directory in our working directory. Let's try another way of extracting the data that does not write anything to our working directory.

  1. Download zip file into a temporary file (outside the working directory).
  2. Check data file format.
  3. Load data file into a dataframe.
In [3]:
# Get a path for a temporary file (in R's session temp directory, not the working directory)
file <- tempfile()
# Download the zip file to that temporary path
download.file("https://d396qusza40orc.cloudfront.net/rprog/data/quiz1_data.zip", file)
# Check data file format
unzip(file,list=TRUE)
Name Length Date
hw1_data.csv 2902 2012-12-26 14:25:00
In [4]:
# Load the data file from inside the zip into a dataframe
df_data <- read.csv(unz(file, "hw1_data.csv"))
# Delete the temporary file
unlink(file)
# View sample data
head(df_data)
Ozone Solar.R Wind Temp Month Day
41 190 7.4 67 5 1
36 118 8.0 72 5 2
12 149 12.6 74 5 3
18 313 11.5 62 5 4
NA NA 14.3 56 5 5
28 NA 14.9 66 5 6
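
A note on what tempfile() does: it only returns a unique path inside the session's temporary directory, and the file itself does not exist until something writes to it. The sketch below illustrates this without a network connection. It assumes the built-in airquality dataset as a stand-in for the downloaded file; tmp and df_tmp are illustrative names, not part of the exercise.

```r
# tempfile() returns a path; nothing is created on disk yet
tmp <- tempfile(fileext = ".csv")
exists_before <- file.exists(tmp)

# Write a small CSV to that path (stand-in for a downloaded file)
write.csv(head(airquality), tmp, row.names = FALSE)
exists_after_write <- file.exists(tmp)

# Read it back into a dataframe, then delete the temporary file
df_tmp <- read.csv(tmp)
unlink(tmp)
exists_after_unlink <- file.exists(tmp)

# Existence of the file before writing, after writing, after unlink()
c(exists_before, exists_after_write, exists_after_unlink)
```

Even without unlink(), files in the session's temporary directory are normally removed when the R session ends.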

Checking if both dataframes are identical

In [5]:
identical(df_data,df_data1)
TRUE
In [6]:
# Dimensions of the dataframe
dim(df_data)
  1. 153
  2. 6

df_data has 153 rows and 6 columns.
Let's write the same in data science language:
The dataframe contains 153 records of 6 variables or features.

In [7]:
# View the first 3 records of the dataframe to get an idea about the data in it.
head(df_data, n=3)
Ozone Solar.R Wind Temp Month Day
41 190 7.4 67 5 1
36 118 8.0 72 5 2
12 149 12.6 74 5 3
In [8]:
# View the last four records.
tail(df_data,n=4)
    Ozone Solar.R Wind Temp Month Day
150    NA     145 13.2   77     9  27
151    14     191 14.3   75     9  28
152    18     131  8.0   76     9  29
153    20     223 11.5   68     9  30

We can also find the number of records by taking the length of any one of the variables (for example, Ozone):

In [9]:
length(df_data$Ozone)
153

Data slicing (called subsetting in R) is done with the [ operator. For a dataframe the standard syntax is [rows, columns]. To slice by rows only, leave the column part empty: [rows, ]. Do not forget the trailing comma: without it, R treats a single index on a dataframe as a column selection (or, on a vector, as an element selection).

In [15]:
# Ozone level of 29th row
df_data[29,"Ozone"]
45
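
To make the effect of the comma concrete, here is a small sketch of the main forms of [ on a dataframe. It assumes the built-in airquality dataset as a stand-in for df_data (it has the same column names); the variable names are illustrative.

```r
aq <- airquality                         # built-in dataset used as a stand-in

row29   <- aq[29, ]                      # row 29, all columns: a one-row dataframe
ozone_v <- aq[, "Ozone"]                 # comma, one column: an integer vector
ozone_d <- aq["Ozone"]                   # no comma: a dataframe with one column
block   <- aq[1:3, c("Ozone", "Temp")]   # rows 1 to 3 of two columns

class(row29)
class(ozone_v)
class(ozone_d)
dim(block)
```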

Because the variables of a dataframe are named, we can also use the $ operator with the variable name to get the same result as above.

In [17]:
# Ozone level of 29th row
df_data$Ozone[29]
45

How did this work? The $ operator extracts a variable of a dataframe as a vector. Here df_data$Ozone is an integer vector, and [29] picks its 29th element.

In [18]:
class(df_data$Ozone)
'integer'

Missing values: is.na() returns a logical vector that is TRUE where a value is NA and FALSE otherwise. In arithmetic, R treats TRUE as 1 and FALSE as 0, so sum() over that vector counts the missing values.

Number of missing values in variable Solar.R:

In [21]:
# Number of missing values in variable Solar.R
sum(is.na(df_data$Solar.R))
7
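
The same idea on a tiny vector, as a minimal sketch (x is a made-up example vector):

```r
x <- c(7, NA, 3, NA, 5)

na_flags  <- is.na(x)        # FALSE TRUE FALSE TRUE FALSE
n_missing <- sum(na_flags)   # TRUE counts as 1, so this is the number of NAs
p_missing <- mean(na_flags)  # proportion of values that are missing

n_missing
p_missing
```

Here n_missing is 2 and p_missing is 0.4, since two of the five values are NA.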

Mean of the variable Solar.R excluding missing values:

In [23]:
# Mean of the variable Solar.R excluding missing values
mean(df_data$Solar.R, na.rm=TRUE)
185.931506849315

Mean of Solar.R where Ozone levels are lower than 25 and Temp is below 88:

In [25]:
# Mean of Solar.R where Ozone levels are lower than 25 and Temp is below 88.
mean(df_data$Solar.R[df_data$Ozone<25&df_data$Temp<88],na.rm=TRUE)
145.163265306122
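
One detail worth knowing about the expression above: when the condition itself is NA (for example, because Ozone is missing in that row), logical indexing returns an NA element rather than dropping it. na.rm=TRUE then removes those along with any genuinely missing Solar.R values. A minimal sketch with made-up vectors:

```r
x    <- c(10, 20, 30, 40)
keep <- c(TRUE, NA, FALSE, TRUE)   # NA stands for "condition unknown"

picked_raw   <- x[keep]            # the NA condition produces an NA element
picked_which <- x[which(keep)]     # which() keeps only positions that are TRUE

picked_raw
picked_which
mean(picked_raw, na.rm = TRUE)     # same result as mean(picked_which)
```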

Average Ozone level in the month of May:

In [27]:
# Average Ozone level in the month of May
mean(df_data$Ozone[df_data$Month==5], na.rm=TRUE)
23.6153846153846

Maximum Solar Radiation in June:

In [29]:
# Maximum Solar Radiation in June
# More info on usage of max() function: http://www.endmemo.com/program/R/max.php
max(df_data$Solar.R[df_data$Month==6],na.rm=TRUE)
332
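
When the same summary is needed for every month rather than one, tapply() avoids repeating the filter. This is a sketch that assumes the built-in airquality dataset as a stand-in for df_data; ozone_by_month is an illustrative name.

```r
# Mean Ozone per month, ignoring missing values
ozone_by_month <- tapply(airquality$Ozone, airquality$Month, mean, na.rm = TRUE)
ozone_by_month

# The entry named "5" matches the filter-based calculation for May
may_direct <- mean(airquality$Ozone[airquality$Month == 5], na.rm = TRUE)
all.equal(as.numeric(ozone_by_month["5"]), may_direct)
```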

COVID data by JHU CSSE

Source: https://github.com/CSSEGISandData/COVID-19
Data URL: http://link.datascience.eu.org/p003d1
Data last updated: 2020-04-13

In [33]:
# Loading data to dataframe
df_JHU<-read.csv("http://link.datascience.eu.org/p003d1")
head(df_JHU, n=2)
FIPS   Admin2     Province_State  Country_Region  Last_Update          Lat       Long_      Confirmed  Deaths  Recovered  Active  Combined_Key
45001  Abbeville  South Carolina  US              2020-04-13 23:07:54  34.22333  -82.46171  9          0       0          9       Abbeville, South Carolina, US
22001  Acadia     Louisiana       US              2020-04-13 23:07:54  30.29506  -92.41420  101        5       0          96      Acadia, Louisiana, US

Pruning the dataframe by dropping the variables we do not need:
We only need Country, State, Confirmed, Deaths, Recovered and Active.

In [34]:
df_JHU<-df_JHU[,c("Country_Region","Province_State","Confirmed","Deaths","Recovered","Active")]
head(df_JHU,n=2)
Country_Region  Province_State  Confirmed  Deaths  Recovered  Active
US              South Carolina  9          0       0          9
US              Louisiana       101        5       0          96

Confirmed cases, deaths and death rates in US and India vs Global:

In [39]:
# Confirmed cases
c_US<-sum(df_JHU[df_JHU$Country_Region=="US","Confirmed"])
c_IN<-sum(df_JHU[df_JHU$Country_Region=="India","Confirmed"])
c_GL<-sum(df_JHU["Confirmed"])
c<-c(c_US,c_IN,c_GL)
# Deaths
d_US<-sum(df_JHU[df_JHU$Country_Region=="US","Deaths"])
d_IN<-sum(df_JHU[df_JHU$Country_Region=="India","Deaths"])
d_GL<-sum(df_JHU["Deaths"])
d<-c(d_US,d_IN,d_GL)
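# Death rates (deaths as a percentage of confirmed cases)
dr<-d/c*100
# Combine into a table. Note: cbind() with a character column returns a character matrix, not a dataframe
df_dr<-cbind(Country=c("US","India","Global"),Confirmed=c,Deaths=d,Death_Rate=dr)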
df_dr
Country Confirmed Deaths Death_Rate
US 580619 23529 4.0523992497662
India 10453 358 3.42485410886827
Global 1917320 119483 6.23177143095571
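
The long decimals above appear because cbind() with a character column turns every value into text. The sketch below shows one way to keep the columns numeric, using data.frame() together with aggregate() to total by country. The rows in toy are made up for illustration and are not real counts; only the column names follow the pruned df_JHU.

```r
# Made-up rows in the shape of the pruned df_JHU (not real counts)
toy <- data.frame(
  Country_Region = c("US", "US", "India", "Italy"),
  Confirmed      = c(100, 50, 40, 10),
  Deaths         = c(5, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Total confirmed cases and deaths per country
totals <- aggregate(cbind(Confirmed, Deaths) ~ Country_Region, data = toy, FUN = sum)

# Deaths as a percentage of confirmed cases; the column stays numeric
totals$Death_Rate <- totals$Deaths / totals$Confirmed * 100
totals

# For comparison: cbind() with a character column gives a character matrix
m <- cbind(Country = c("US", "India"), Confirmed = c(150, 40))
is.character(m)
```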

The numbers are close to the reports from https://www.worldometers.info/coronavirus/

In [40]:
df_kaggle<-read.csv("http://link.datascience.eu.org/p003d2")
head(df_kaggle,n=2)
SNo  ObservationDate  Province.State  Country.Region  Last.Update      Confirmed  Deaths  Recovered
1    01/22/2020       Anhui           Mainland China  1/22/2020 17:00  1          0       0
2    01/22/2020       Beijing         Mainland China  1/22/2020 17:00  14         0       0

Pruning the dataframe by dropping the variables we do not need:
We only need Country, State, Confirmed, Deaths, Recovered.

In [41]:
df_kaggle<-df_kaggle[,c("Country.Region","Province.State","Confirmed","Deaths","Recovered")]
head(df_kaggle,n=2)
Country.Region  Province.State  Confirmed  Deaths  Recovered
Mainland China  Anhui           1          0       0
Mainland China  Beijing         14         0       0

Confirmed cases, deaths and death rates in US and India vs Global:

In [42]:
# Confirmed cases
c_US<-sum(df_kaggle[df_kaggle$Country.Region=="US","Confirmed"])
c_IN<-sum(df_kaggle[df_kaggle$Country.Region=="India","Confirmed"])
c_GL<-sum(df_kaggle["Confirmed"])
c<-c(c_US,c_IN,c_GL)
# Deaths
d_US<-sum(df_kaggle[df_kaggle$Country.Region=="US","Deaths"])
d_IN<-sum(df_kaggle[df_kaggle$Country.Region=="India","Deaths"])
d_GL<-sum(df_kaggle["Deaths"])
d<-c(d_US,d_IN,d_GL)
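# Death rates (deaths as a percentage of confirmed cases)
dr<-d/c*100
# Combine into a table. Note: cbind() with a character column returns a character matrix, not a dataframe
df_dr<-cbind(Country=c("US","India","Global"),Confirmed=c,Deaths=d,Death_Rate=dr)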
df_dr
Country Confirmed Deaths Death_Rate
US 6278333 193275 3.07844454889538
India 82548 2526 3.06003779619131
Global 29221460 1511343 5.1720311031687

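The numbers are far higher than the reports from https://www.worldometers.info/coronavirus/. A likely cause, judging by the ObservationDate column, is that this file holds one row per region per date with running totals, so summing every row would count the same cases many times. The source data needs to be checked.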
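
If the counts in such a file are cumulative per observation date (an assumption that should be verified against the source), one way to avoid double counting is to keep only the rows for the latest date before summing. The sketch below uses made-up rows; only the column names and the date format follow the Kaggle file shown above.

```r
# Made-up cumulative counts for two regions on two dates (not real data)
toy <- data.frame(
  ObservationDate = c("01/22/2020", "01/22/2020", "01/23/2020", "01/23/2020"),
  Country.Region  = c("A", "B", "A", "B"),
  Confirmed       = c(1, 14, 3, 20),
  stringsAsFactors = FALSE
)

# Parse the month/day/year strings into Date objects
toy$ObservationDate <- as.Date(toy$ObservationDate, format = "%m/%d/%Y")

# Summing every row adds up running totals from different dates
naive_total <- sum(toy$Confirmed)

# Keep only the most recent date, then sum
latest       <- toy[toy$ObservationDate == max(toy$ObservationDate), ]
latest_total <- sum(latest$Confirmed)

c(naive = naive_total, latest = latest_total)
```

Note that this needs the ObservationDate column, so it would have to be applied before that column is dropped.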

COVID by EU Open Data Portal

Source: https://opendata.ecdc.europa.eu/covid19/casedistribution/csv
Data URL: http://link.datascience.eu.org/p003d3
Data last updated: 2020-04-14

In [43]:
df_EU<-read.csv("http://link.datascience.eu.org/p003d3")
head(df_EU,n=2)
dateRep day month year cases deaths countriesAndTerritories geoId countryterritoryCode popData2018
14/04/2020 14 4 2020 58 3 Afghanistan AF AFG 37172386
13/04/2020 13 4 2020 52 0 Afghanistan AF AFG 37172386

Pruning the dataframe by dropping the variables we do not need:
We only need Country, cases, deaths, population.

In [44]:
df_EU<-df_EU[,c("countriesAndTerritories","cases","deaths","popData2018")]
head(df_EU,n=2)
countriesAndTerritories cases deaths popData2018
Afghanistan 58 3 37172386
Afghanistan 52 0 37172386

Confirmed cases, deaths, death rates and cases per million population in US and India vs Global:

In [49]:
# Confirmed cases
c_US<-sum(df_EU[df_EU$countriesAndTerritories=="United_States_of_America","cases"])
c_IN<-sum(df_EU[df_EU$countriesAndTerritories=="India","cases"])
c_GL<-sum(df_EU["cases"])
c<-c(c_US,c_IN,c_GL)
# deaths
d_US<-sum(df_EU[df_EU$countriesAndTerritories=="United_States_of_America","deaths"])
d_IN<-sum(df_EU[df_EU$countriesAndTerritories=="India","deaths"])
d_GL<-sum(df_EU["deaths"])
d<-c(d_US,d_IN,d_GL)
# Death rates
dr<-d/c*100
# Population
p_US<-df_EU[df_EU$countriesAndTerritories=="United_States_of_America","popData2018"][1]
p_IN<-df_EU[df_EU$countriesAndTerritories=="India","popData2018"][1]
p_GL<-7631091112     # World pop in 2018 per https://www.populationpyramid.net/world/2018/
p<-c(p_US,p_IN,p_GL)
# Cases per million population
cpm<-c/p*1000000
# Combine into a table. Note: cbind() with a character column returns a character matrix, not a dataframe
df_dr<-cbind(Country=c("United_States_of_America","India","Global"),cases=c,deaths=d,Death_Rate=dr, Cases_per_Mil=cpm)
df_dr
Country cases deaths Death_Rate Cases_per_Mil
United_States_of_America 582594 23649 4.05925910668493 1780.72124378981
India 10363 339 3.27125349802181 7.66144258651698
Global 1873265 118854 6.34475100960088 245.478002097795
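
As a quick check of the cases-per-million arithmetic, the global figure can be reproduced from the two numbers used above (total cases 1873265 and the 2018 world population 7631091112). cases_per_million is a hypothetical helper name introduced only for this sketch.

```r
# Cases per million people: cases divided by population, scaled by one million
cases_per_million <- function(cases, population) {
  cases / population * 1e6
}

global_cpm <- cases_per_million(1873265, 7631091112)
round(global_cpm, 3)
```

The result rounds to 245.478, in line with the Cases_per_Mil value in the Global row.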

The numbers are close to the reports from https://www.worldometers.info/coronavirus/

Last updated 2020-04-15 11:52:46.054857 IST

Base Graphics

swirl()

| Welcome to swirl! Please sign in. If you've been here before, use the same name as you
| did then. If you are new, call yourself something unique.

What shall I call you? Krishnakanth Allika

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

1: Basic Building Blocks 2: Workspace and Files 3: Sequences of Numbers
4: Vectors 5: Missing Values 6: Subsetting Vectors
7: Matrices and Data Frames 8: Logic 9: Functions
10: lapply and sapply 11: vapply and tapply 12: Looking at Data
13: Simulation 14: Dates and Times 15: Base Graphics

Selection: 15

| | 0%

| One of the greatest strengths of R, relative to other programming languages, is the
| ease with which we can create publication-quality graphics. In this lesson, you'll
| learn about base graphics in R.

...

|== | 2%
| We do not cover the more advanced portions of graphics in R in this lesson. These
| include lattice, ggplot2 and ggvis.

...

|=== | 4%
| There is a school of thought that this approach is backwards, that we should teach
| ggplot2 first. See http://varianceexplained.org/r/teach_ggplot2_to_beginners/ for an
| outline of this view.

...

|===== | 7%
| Load the included data frame cars with data(cars).

data(cars)

| Your dedication is inspiring!

|======= | 9%
| To fix ideas, we will work with simple data frames. Our main goal is to introduce
| various plotting functions and their arguments. All the output would look more
| interesting with larger, more complex data sets.

...

|========= | 11%
| Pull up the help page for cars.

?cars

| All that hard work is paying off!

|========== | 13%
| As you can see in the help page, the cars data set has only two variables: speed and
| stopping distance. Note that the data is from the 1920s.

...

|============ | 15%
| Run head() on the cars data.

head(cars)
speed dist
1 4 2
2 4 10
3 7 4
4 7 22
5 8 16
6 9 10

| You got it right!

|============== | 17%
| Before plotting, it is always a good idea to get a sense of the data. Key R commands
| for doing so include, dim(), names(), head(), tail() and summary().

...

|================ | 20%
| Run the plot() command on the cars data frame.

plot(cars)

| You are amazing!

|================= | 22%
| As always, R tries very hard to give you something sensible given the information that
| you have provided to it. First, R notes that the data frame you have given it has just
| two columns, so it assumes that you want to plot one column versus the other.

...

|=================== | 24%
| Second, since we do not provide labels for either axis, R uses the names of the
| columns. Third, it creates axis tick marks at nice round numbers and labels them
| accordingly. Fourth, it uses the other defaults supplied in plot().

...

|===================== | 26%
| We will now spend some time exploring plot, but many of the topics covered here will
| apply to most other R graphics functions. Note that 'plot' is short for scatterplot.

...

|======================= | 28%
| Look up the help page for plot().

?plot

| All that hard work is paying off!

|======================== | 30%
| The help page for plot() highlights the different arguments that the function can take.
| The two most important are x and y, the variables that will be plotted. For the next
| set of questions, include the argument names in your answers. That is, do not type
| plot(cars$speed, cars$dist), although that will work. Instead use plot(x = cars$speed,
| y = cars$dist).

...

|========================== | 33%
| Use plot() command to show speed on the x-axis and dist on the y-axis from the cars
| data frame. Use the form of the plot command in which vectors are explicitly passed in
| as arguments for x and y.

plot(x=cars$speed,y=cars$dist)

| You got it right!

|============================ | 35%
| Note that this produces a slightly different answer than plot(cars). In this case, R is
| not sure what you want to use as the labels on the axes, so it just uses the arguments
| which you pass in, data frame name and dollar signs included.

...

|============================== | 37%
| Note that there are other ways to call the plot command, i.e., using the "formula"
| interface. For example, we get a similar plot to the above with plot(dist ~ speed,
| cars). However, we will wait till later in the lesson before using the formula
| interface.

...

|=============================== | 39%
| Use plot() command to show dist on the x-axis and speed on the y-axis from the cars
| data frame. This is the opposite of what we did above.

plot(x=cars$dist,y=cars$speed)

| Nice work!

|================================= | 41%
| It probably makes more sense for speed to go on the x-axis since stopping distance is a
| function of speed more than the other way around. So, for the rest of the questions in
| this portion of the lesson, always assign the arguments accordingly.

...

|=================================== | 43%
| In fact, you can assume that the answers to the next few questions are all of the form
| plot(x = cars$speed, y = cars$dist, ...) but with various arguments used in place of
| the ...

...

|===================================== | 46%
| Recreate the plot with the label of the x-axis set to "Speed".

plot(x=cars$speed,y=cars$dist,xlab="Speed")

| Perseverance, that's the answer.

|====================================== | 48%
| Recreate the plot with the label of the y-axis set to "Stopping Distance".

plot(x=cars$speed,y=cars$dist,xlab="Speed",ylab="Stopping Distance")

| One more time. You can do it! Or, type info() for more options.

| Type plot(x = cars$speed, y = cars$dist, ylab = "Stopping Distance") to create the
| plot.

plot(x=cars$speed,y=cars$dist,ylab="Stopping Distance")

| You are quite good my friend!

|======================================== | 50%
| Recreate the plot with "Speed" and "Stopping Distance" as axis labels.

plot(x=cars$speed,y=cars$dist,xlab="Speed",ylab="Stopping Distance")

| Excellent work!

|========================================== | 52%
| The reason that plot(cars) worked at the beginning of the lesson was that R was smart
| enough to know that the first element (i.e., the first column) in cars should be
| assigned to the x argument and the second element to the y argument. To save on typing,
| the next set of answers will all be of the form, plot(cars, ...) with various arguments
| added.

...

|=========================================== | 54%
| For each question, we will only want one additional argument at a time. Of course, you
| can pass in more than one argument when doing a real project.

...

|============================================= | 57%
| Plot cars with a main title of "My Plot". Note that the argument for the main title is
| "main" not "title".

plot(cars,main="My Plot")

| Excellent work!

|=============================================== | 59%
| Plot cars with a sub title of "My Plot Subtitle".

plot(cars,sub="My Plot Subtitle")

| You are amazing!

|================================================= | 61%
| The plot help page (?plot) only covers a small number of the many arguments that can be
| passed in to plot() and to other graphical functions. To begin to explore the many
| other options, look at ?par. Let's look at some of the more commonly used ones.
| Continue using plot(cars, ...) as the base answer to these questions.

...

|================================================== | 63%
| Plot cars so that the plotted points are colored red. (Use col = 2 to achieve this
| effect.)

?par
plot(cars,col=2)

| You are quite good my friend!

|==================================================== | 65%
| Plot cars while limiting the x-axis to 10 through 15. (Use xlim = c(10, 15) to achieve
| this effect.)

plot(cars,xlim=c(10,15))

| Nice work!

|====================================================== | 67%
| You can also change the shape of the symbols in the plot. The help page for points
| (?points) provides the details.

...

|======================================================== | 70%
| Plot cars using triangles. (Use pch = 2 to achieve this effect.)

plot(cars,pch=2)

| All that hard work is paying off!

|========================================================= | 72%
| Arguments like "col" and "pch" may not seem very intuitive. And that is because they
| aren't! So, many/most people use more modern packages, like ggplot2, for creating their
| graphics in R.

...
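
The lesson asks for one argument at a time, but the arguments seen so far can be combined in a single call. A minimal sketch; the title and subtitle strings are illustrative choices, not part of the lesson.

```r
data(cars)

# Several graphical arguments in one plot() call
plot(cars,
     main = "Stopping distance vs. speed",   # main title
     sub  = "cars data set",                 # subtitle below the x-axis label
     xlab = "Speed",
     ylab = "Stopping Distance",
     col  = 2,                               # red points
     pch  = 2,                               # triangles
     xlim = c(0, 25))                        # x-axis range
```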

|=========================================================== | 74%
| It is, however, useful to have an introduction to base graphics because many of the
| idioms in lattice and ggplot2 are modeled on them.

...

|============================================================= | 76%
| Let's now look at some other functions in base graphics that may be useful, starting
| with boxplots.

...

|=============================================================== | 78%
| Load the mtcars data frame.

data(mtcars)

| You are quite good my friend!

|================================================================ | 80%
| Anytime that you load up a new data frame, you should explore it before using it. In
| the middle of a swirl lesson, just type play(). This temporarily suspends the lesson
| (without losing the work you have already done) and allows you to issue commands like
| dim(mtcars) and head(mtcars). Once you are done examining the data, just type nxt() and
| the lesson will pick up where it left off.

...

|================================================================== | 83%
| Look up the help page for boxplot().

?boxplot

| Excellent job!

|==================================================================== | 85%
| Instead of adding data columns directly as input arguments, as we did with plot(), it
| is often handy to pass in the entire data frame. This is what the "data" argument in
| boxplot() allows.

...

|====================================================================== | 87%
| boxplot(), like many R functions, also takes a "formula" argument, generally an
| expression with a tilde ("~") which indicates the relationship between the input
| variables. This allows you to enter something like mpg ~ cyl to plot the relationship
| between cyl (number of cylinders) on the x-axis and mpg (miles per gallon) on the
| y-axis.

...

|======================================================================= | 89%
| Use boxplot() with formula = mpg ~ cyl and data = mtcars to create a box plot.

boxplot(formula=mpg~cyl,data=mtcars)

| All that practice is paying off!

|========================================================================= | 91%
| The plot shows that mpg is much lower for cars with more cylinders. Note that we can
| use the same set of arguments that we explored with plot() above to add axis labels,
| titles and so on.

...
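
As a sketch of that last point, the same labelling arguments used with plot() also work with boxplot(). The label text is an illustrative choice; bp is an illustrative name for the summary that boxplot() returns invisibly.

```r
data(mtcars)

# Box plot of mpg by number of cylinders, with a title and axis labels
bp <- boxplot(mpg ~ cyl, data = mtcars,
              main = "Fuel economy by cylinder count",
              xlab = "Number of cylinders",
              ylab = "Miles per gallon")

bp$names   # the groups that were plotted
bp$n       # number of cars in each group
```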

|=========================================================================== | 93%
| When looking at a single variable, histograms are a useful tool. hist() is the
| associated R function. Like plot(), hist() is best used by just passing in a single
| vector.

...

|============================================================================= | 96%
| Use hist() with the vector mtcars$mpg to create a histogram.

hist(mtcars$mpg)

| You are doing so well!

|============================================================================== | 98%
| In this lesson, you learned how to work with base graphics in R. The best place to go
| from here is to study the ggplot2 package. If you want to explore other elements of
| base graphics, then this web page (http://www.ling.upenn.edu/~joseff/rstudy/week4.html)
| provides a useful overview.

...

|================================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?

1: No
2: Yes

Selection: 2
What is your email address? xxxxxx@xxxxxxxxxxxx
What is your assignment token? xXxXxxXXxXxxXXXx
Grade submission succeeded!

| You are amazing!

| You've reached the end of this lesson! Returning to the main menu...

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 0

| Leaving swirl now. Type swirl() to resume.

ls()
[1] "cars" "mtcars"
rm(list=ls())

Last updated 2020-04-20 23:16:24.036922 IST

Dates and Times

swirl()

| Welcome to swirl! Please sign in. If you've been here before, use the same name as you
| did then. If you are new, call yourself something unique.

What shall I call you? Krishnakanth Allika

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

1: Basic Building Blocks 2: Workspace and Files 3: Sequences of Numbers
4: Vectors 5: Missing Values 6: Subsetting Vectors
7: Matrices and Data Frames 8: Logic 9: Functions
10: lapply and sapply 11: vapply and tapply 12: Looking at Data
13: Simulation 14: Dates and Times 15: Base Graphics

Selection: 14

| | 0%

| R has a special way of representing dates and times, which can be helpful if you're
| working with data that show how something changes over time (i.e. time-series data) or
| if your data contain some other temporal information, like dates of birth.

...

|== | 3%
| Dates are represented by the 'Date' class and times are represented by the 'POSIXct'
| and 'POSIXlt' classes. Internally, dates are stored as the number of days since
| 1970-01-01 and times are stored as either the number of seconds since 1970-01-01 (for
| 'POSIXct') or a list of seconds, minutes, hours, etc. (for 'POSIXlt').

...

|==== | 6%
| Let's start by using d1 <- Sys.Date() to get the current date and store it in the
| variable d1. (That's the letter 'd' and the number 1.)

d1<-Sys.Date()

| That's the answer I was looking for.

|======= | 8%
| Use the class() function to confirm d1 is a Date object.

class(d1)
[1] "Date"

| Keep working like that and you'll get there!

|========= | 11%
| We can use the unclass() function to see what d1 looks like internally. Try it out.

unclass(d1)
[1] 18367

| Keep up the great work!

|=========== | 14%
| That's the exact number of days since 1970-01-01!

...

|============= | 17%
| However, if you print d1 to the console, you'll get today's date -- YEAR-MONTH-DAY.
| Give it a try.

d1
[1] "2020-04-15"

| You got it!

|================ | 19%
| What if we need to reference a date prior to 1970-01-01? Create a variable d2
| containing as.Date("1969-01-01").

d2<-as.Date("1969-01-01")

| You nailed it! Good job!

|================== | 22%
| Now use unclass() again to see what d2 looks like internally.

unclass(d2)
[1] -365

| That's a job well done!

|==================== | 25%
| As you may have anticipated, you get a negative number. In this case, it's -365, since
| 1969-01-01 is exactly one calendar year (i.e. 365 days) BEFORE 1970-01-01.

...
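
The day counts seen above can be checked with fixed dates instead of Sys.Date(), so the result does not depend on when the code is run. A minimal sketch:

```r
# Days since 1970-01-01 for a few fixed dates
epoch_days  <- as.numeric(as.Date("1970-01-01"))   # the origin itself
before_days <- as.numeric(as.Date("1969-01-01"))   # one year earlier
lesson_days <- as.numeric(as.Date("2020-04-15"))   # the date printed for d1

c(epoch_days, before_days, lesson_days)

# Going the other way: from a day count back to a Date
as.Date(18367, origin = "1970-01-01")
```

The three counts are 0, -365 and 18367, matching the unclass() output in the transcript.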

|====================== | 28%
| Now, let's take a look at how R stores times. You can access the current date and time
| using the Sys.time() function with no arguments. Do this and store the result in a
| variable called t1.

t1<-Sys.time()

| You're the best!

|======================== | 31%
| View the contents of t1.

t1
[1] "2020-04-15 17:21:46 IST"

| You are doing so well!

|=========================== | 33%
| And check the class() of t1.

class(t1)
[1] "POSIXct" "POSIXt"

| All that hard work is paying off!

|============================= | 36%
| As mentioned earlier, POSIXct is just one of two ways that R represents time
| information. (You can ignore the second value above, POSIXt, which just functions as a
| common language between POSIXct and POSIXlt.) Use unclass() to see what t1 looks like
| internally -- the (large) number of seconds since the beginning of 1970.

unclass(t1)
[1] 1586951506

| You got it right!

|=============================== | 39%
| By default, Sys.time() returns an object of class POSIXct, but we can coerce the result
| to POSIXlt with as.POSIXlt(Sys.time()). Give it a try and store the result in t2.

t2<-as.POSIXlt(Sys.time())

| All that hard work is paying off!

|================================= | 42%
| Check the class of t2.

class(t2)
[1] "POSIXlt" "POSIXt"

| All that hard work is paying off!

|==================================== | 44%
| Now view its contents.

t2
[1] "2020-04-15 17:23:09 IST"

| That's the answer I was looking for.

|====================================== | 47%
| The printed format of t2 is identical to that of t1. Now unclass() t2 to see how it is
| different internally.

unclass(t2)
$sec
[1] 9.964687

$min
[1] 23

$hour
[1] 17

$mday
[1] 15

$mon
[1] 3

$year
[1] 120

$wday
[1] 3

$yday
[1] 105

$isdst
[1] 0

$zone
[1] "IST"

$gmtoff
[1] 19800

attr(,"tzone")
[1] "" "IST" "+0630"

| Keep up the great work!

|======================================== | 50%
| t2, like all POSIXlt objects, is just a list of values that make up the date and time.
| Use str(unclass(t2)) to have a more compact view.

str(unclass(t2))
List of 11
 $ sec   : num 9.96
 $ min   : int 23
 $ hour  : int 17
 $ mday  : int 15
 $ mon   : int 3
 $ year  : int 120
 $ wday  : int 3
 $ yday  : int 105
 $ isdst : int 0
 $ zone  : chr "IST"
 $ gmtoff: int 19800
 - attr(*, "tzone")= chr [1:3] "" "IST" "+0630"

| You are quite good my friend!

|========================================== | 53%
| If, for example, we want just the minutes from the time stored in t2, we can access
| them with t2$min. Give it a try.

t2$min
[1] 23

| Nice work!
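
Some POSIXlt components are offsets rather than calendar values, which explains the 120 and 3 in the output above: year counts from 1900 and mon counts from 0. A sketch with a fixed time (the UTC time zone is chosen here only to make the example reproducible):

```r
lt <- as.POSIXlt("2020-04-15 17:23:09", tz = "UTC")

lt$year + 1900   # calendar year
lt$mon + 1       # calendar month, since mon runs from 0 to 11
lt$mday          # day of the month
lt$wday          # day of the week, 0 = Sunday
lt$yday          # day of the year, counted from 0
```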

|============================================ | 56%
| Now that we have explored all three types of date and time objects, let's look at a few
| functions that extract useful information from any of these objects -- weekdays(),
| months(), and quarters().

...

|=============================================== | 58%
| The weekdays() function will return the day of week from any date or time object. Try
| it out on d1, which is the Date object that contains today's date.

weekdays(d1)
[1] "Wednesday"

| Perseverance, that's the answer.

|================================================= | 61%
| The months() function also works on any date or time object. Try it on t1, which is the
| POSIXct object that contains the current time (well, it was the current time when you
| created it).

months(t1)
[1] "April"

| Excellent job!

|=================================================== | 64%
| The quarters() function returns the quarter of the year (Q1-Q4) from any date or time
| object. Try it on t2, which is the POSIXlt object that contains the time at which you
| created it.

quarters(t2)
[1] "Q2"

| Your dedication is inspiring!

|===================================================== | 67%
| Often, the dates and times in a dataset will be in a format that R does not recognize.
| The strptime() function can be helpful in this situation.

...

|======================================================== | 69%
| strptime() converts character vectors to POSIXlt. In that sense, it is similar to
| as.POSIXlt(), except that the input doesn't have to be in a particular format
| (YYYY-MM-DD).

...

|========================================================== | 72%
| To see how it works, store the following character string in a variable called t3:
| "October 17, 1986 08:24" (with the quotes).

t3<-"October 17, 1986 08:24"

| Keep working like that and you'll get there!

|============================================================ | 75%
| Now, use strptime(t3, "%B %d, %Y %H:%M") to help R convert our date/time object to a
| format that it understands. Assign the result to a new variable called t4. (You should
| pull up the documentation for strptime() if you'd like to know more about how it
| works.)

strptime(t3, "%B %d, %Y %H:%M")
[1] "1986-10-17 08:24:00 IST"

| Not quite, but you're learning! Try again. Or, type info() for more options.

| t4 <- strptime(t3, "%B %d, %Y %H:%M") will convert our date/time object to a format
| that R understands.

t4<-strptime(t3, "%B %d, %Y %H:%M")

| You got it right!

|============================================================== | 78%
| Print the contents of t4.

t4
[1] "1986-10-17 08:24:00 IST"

| You're the best!

|================================================================ | 81%
| That's the format we've come to expect. Now, let's check its class().

class(t4)
[1] "POSIXlt" "POSIXt"

| Great job!
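
The format string has to describe the input exactly. The sketch below parses a day/month/year string and shows that a mismatched format gives NA rather than an error. The input strings are made up, and tz = "UTC" is set only for reproducibility. Numeric months (%m) are used because %B matches month names in the current locale.

```r
# Day/month/year input with a matching format string
t5 <- strptime("17/10/1986 08:24", format = "%d/%m/%Y %H:%M", tz = "UTC")
format(t5, "%Y-%m-%d %H:%M")

# A format that does not match the input yields NA
bad <- strptime("1986-10-17", format = "%d/%m/%Y", tz = "UTC")
is.na(bad)
```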

|=================================================================== | 83%
| Finally, there are a number of operations that you can perform on dates and times,
| including arithmetic operations (+ and -) and comparisons (<, ==, etc.)

...

|===================================================================== | 86%
| The variable t1 contains the time at which you created it (recall you used Sys.time()).
| Confirm that some time has passed since you created t1 by using the 'greater than'
| operator to compare it to the current time: Sys.time() > t1

Sys.time()>t1
[1] TRUE

| That's correct!

|======================================================================= | 89%
| So we know that some time has passed, but how much? Try subtracting t1 from the current
| time using Sys.time() - t1. Don't forget the parentheses at the end of Sys.time(),
| since it is a function.

Sys.time()-t1
Time difference of 12.21747 mins

| You are amazing!

|========================================================================= | 92%
| The same line of thinking applies to addition and the other comparison operators. If
| you want more control over the units when finding the above difference in times, you
| can use difftime(), which allows you to specify a 'units' parameter.
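
The comparison, subtraction, and difftime() steps can be tried on fixed timestamps so that the result is reproducible. This is only a sketch; the two timestamps and the names start, end, and elapsed are made up for illustration.

```r
# Hypothetical fixed timestamps in UTC (not taken from the lesson).
start <- as.POSIXct("2020-04-20 22:00:00", tz = "UTC")
end   <- as.POSIXct("2020-04-20 23:30:00", tz = "UTC")

end > start    # TRUE
end - start    # R picks the units automatically

# difftime() lets us choose the units explicitly.
elapsed <- difftime(end, start, units = "mins")
as.numeric(elapsed)   # 90

# Adding a number to a POSIXct value adds that many seconds.
start + 60 * 60 == as.POSIXct("2020-04-20 23:00:00", tz = "UTC")   # TRUE
```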

...

|============================================================================ | 94%
| Use difftime(Sys.time(), t1, units = 'days') to find the amount of time in DAYS that
| has passed since you created t1.

difftime(Sys.time(), t1, units = 'days')
Time difference of 0.01950932 days

| That's the answer I was looking for.

|============================================================================== | 97%
| In this lesson, you learned how to work with dates and times in R. While it is
| important to understand the basics, if you find yourself working with dates and times
| often, you may want to check out the lubridate package by Hadley Wickham.

...

|================================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?

1: No
2: Yes

Selection: 2
What is your email address? xxxxxx@xxxxxxxxxxxx
What is your assignment token? xXxXxxXXxXxxXXXx
Grade submission succeeded!

| Perseverance, that's the answer.

| You've reached the end of this lesson! Returning to the main menu...

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 0

| Leaving swirl now. Type swirl() to resume.

ls()
[1] "d1" "d2" "dl" "t1" "t2" "t3" "t4"
rm(list=ls())

Last updated 2020-04-20 23:37:33.781884 IST

Simulation

swirl()

| Welcome to swirl! Please sign in. If you've been here before, use the same name as you
| did then. If you are new, call yourself something unique.

What shall I call you? Krishnakanth Allika

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

1: Basic Building Blocks 2: Workspace and Files 3: Sequences of Numbers
4: Vectors 5: Missing Values 6: Subsetting Vectors
7: Matrices and Data Frames 8: Logic 9: Functions
10: lapply and sapply 11: vapply and tapply 12: Looking at Data
13: Simulation 14: Dates and Times 15: Base Graphics

Selection: 13

| | 0%

| One of the great advantages of using a statistical programming language like R is its
| vast collection of tools for simulating random numbers.

...

|== | 3%
| This lesson assumes familiarity with a few common probability distributions, but these
| topics will only be discussed with respect to random number generation. Even if you
| have no prior experience with these concepts, you should be able to complete the lesson
| and understand the main ideas.

...

|===== | 6%
| The first function we'll use to generate random numbers is sample(). Use ?sample to
| pull up the documentation.

?sample

| Nice work!

|======= | 9%
| Let's simulate rolling four six-sided dice: sample(1:6, 4, replace = TRUE).

sample(1:6,4,replace=TRUE)
[1] 3 5 5 3

| Keep up the great work!

|========== | 12%
| Now repeat the command to see how your result differs. (The probability of rolling the
| exact same result is (1/6)^4 = 0.00077, which is pretty small!)

sample(1:6,4,replace=TRUE)
[1] 2 6 6 3

| Keep working like that and you'll get there!

|============ | 15%
| sample(1:6, 4, replace = TRUE) instructs R to randomly select four numbers between 1
| and 6, WITH replacement. Sampling with replacement simply means that each number is
| "replaced" after it is selected, so that the same number can show up more than once.
| This is what we want here, since what you roll on one die shouldn't affect what you
| roll on any of the others.

...

|=============== | 18%
| Now sample 10 numbers between 1 and 20, WITHOUT replacement. To sample without
| replacement, simply leave off the 'replace' argument.

sample(1:20,10)
[1] 11 1 12 10 9 20 17 8 7 5

| Keep working like that and you'll get there!

|================= | 21%
| Since the last command sampled without replacement, no number appears more than once in
| the output.
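
A quick way to check the with/without replacement behaviour is sketched below. The seed and the variable names dice and draw are arbitrary choices for illustration.

```r
set.seed(1)   # arbitrary seed, only so that the run is repeatable

dice <- sample(1:6, 4, replace = TRUE)   # repeated values are allowed
draw <- sample(1:20, 10)                 # replace defaults to FALSE

all(dice %in% 1:6)     # TRUE
anyDuplicated(draw)    # 0, meaning no value appears twice
```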

...

|=================== | 24%
| LETTERS is a predefined variable in R containing a vector of all 26 letters of the
| English alphabet. Take a look at it now.

LETTERS
[1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U"
[22] "V" "W" "X" "Y" "Z"

| You got it!

|====================== | 27%
| The sample() function can also be used to permute, or rearrange, the elements of a
| vector. For example, try sample(LETTERS) to permute all 26 letters of the English
| alphabet.

sample(LETTERS)
[1] "A" "G" "K" "S" "B" "C" "U" "D" "P" "I" "Z" "V" "M" "R" "E" "Q" "Y" "N" "F" "T" "L"
[22] "J" "X" "O" "H" "W"

| You are amazing!

|======================== | 30%
| This is identical to taking a sample of size 26 from LETTERS, without replacement. When
| the 'size' argument to sample() is not specified, R takes a sample equal in size to the
| vector from which you are sampling.
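
The claim that sample(LETTERS) is just a reordering can be verified with a short sketch (the name shuffled is hypothetical):

```r
shuffled <- sample(LETTERS)    # size defaults to length(LETTERS)

length(shuffled)               # 26
setequal(shuffled, LETTERS)    # TRUE: same letters, possibly in a different order
anyDuplicated(shuffled)        # 0: sampling was without replacement
```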

...

|=========================== | 33%
| Now, suppose we want to simulate 100 flips of an unfair two-sided coin. This particular
| coin has a 0.3 probability of landing 'tails' and a 0.7 probability of landing 'heads'.

...

|============================= | 36%
| Let the value 0 represent tails and the value 1 represent heads. Use sample() to draw a
| sample of size 100 from the vector c(0,1), with replacement. Since the coin is unfair,
| we must attach specific probabilities to the values 0 (tails) and 1 (heads) with a
| fourth argument, prob = c(0.3, 0.7). Assign the result to a new variable called flips.

flips<-sample(c(0,1),100,prob=c(0.3,0.7))
Error in sample.int(length(x), size, replace, prob) :
cannot take a sample larger than the population when 'replace = FALSE'
flips<-sample(c(0,1),100,prob=c(0.3,0.7),replace = TRUE)

| Keep working like that and you'll get there!

|================================ | 39%
| View the contents of the flips variable.

flips
[1] 1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1 0 0 0 1 0 0 1 1 1 1 0 1 1 0
[43] 1 0 1 0 1 1 0 1 1 1 1 1 0 0 1 0 1 1 0 0 0 1 1 1 1 0 0 1 0 1 1 1 1 1 1 0 0 1 0 1 1 1
[85] 0 1 1 0 1 1 1 0 1 1 1 0 1 0 0 1

| All that hard work is paying off!

|================================== | 42%
| Since we set the probability of landing heads on any given flip to be 0.7, we'd expect
| approximately 70 of our coin flips to have the value 1. Count the actual number of 1s
| contained in flips using the sum() function.

sum(flips)
[1] 67

| You are quite good my friend!
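
The unfair coin can be reproduced outside swirl with the sketch below. Note that replace = TRUE is required here because the sample size (100) is larger than the population (2), which is what the error above was pointing out. The seed and the name flips_demo are assumptions made for illustration.

```r
set.seed(2020)   # arbitrary seed for repeatability

# 0 = tails (probability 0.3), 1 = heads (probability 0.7)
flips_demo <- sample(c(0, 1), 100, replace = TRUE, prob = c(0.3, 0.7))

sum(flips_demo)    # number of heads, expected to be somewhere around 70
mean(flips_demo)   # proportion of heads, expected to be somewhere around 0.7
```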

|==================================== | 45%
| A coin flip is a binary outcome (0 or 1) and we are performing 100 independent trials
| (coin flips), so we can use rbinom() to simulate a binomial random variable. Pull up
| the documentation for rbinom() using ?rbinom.

?rbinom

| You nailed it! Good job!

|======================================= | 48%
| Each probability distribution in R has an r*** function (for "random"), a d*** function
| (for "density"), a p*** (for "probability"), and q*** (for "quantile"). We are most
| interested in the r*** functions in this lesson, but I encourage you to explore the
| others on your own.
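
For the normal distribution, the four function families look like this. This is a small sketch using only base R; the seed is arbitrary.

```r
dnorm(0)      # density of the standard normal at 0, i.e. 1 / sqrt(2 * pi)
pnorm(0)      # P(X <= 0) = 0.5
qnorm(0.5)    # the 50% quantile (median), which is 0

set.seed(7)   # arbitrary seed
rnorm(3)      # three random draws
```

pnorm() and qnorm() are inverses of each other, so qnorm(pnorm(1.5)) gives back 1.5 (up to floating point error).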

...

|========================================= | 52%
| A binomial random variable represents the number of 'successes' (heads) in a given
| number of independent 'trials' (coin flips). Therefore, we can generate a single random
| variable that represents the number of heads in 100 flips of our unfair coin using
| rbinom(1, size = 100, prob = 0.7). Note that you only specify the probability of
| 'success' (heads) and NOT the probability of 'failure' (tails). Try it now.

rbinom(1,size=100,prob=0.7)
[1] 64

| Perseverance, that's the answer.

|============================================ | 55%
| Equivalently, if we want to see all of the 0s and 1s, we can request 100 observations,
| each of size 1, with success probability of 0.7. Give it a try, assigning the result to
| a new variable called flips2.

flips2<-rbinom(100,size=100,prob=0.7)

| Not quite, but you're learning! Try again. Or, type info() for more options.

| Call rbinom() with n = 100, size = 1, and prob = 0.7 and assign the result to flips2.

flips2<-rbinom(100,size=1,prob=0.7)

| You're the best!

|============================================== | 58%
| View the contents of flips2.

flips2
[1] 1 1 1 1 1 0 1 0 1 1 0 1 1 1 0 1 1 1 0 0 1 0 1 1 1 1 1 0 0 1 0 0 1 1 1 1 1 1 1 0 1 1
[43] 1 0 1 0 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 0 0 0 1 1 1
[85] 1 0 1 1 1 0 0 1 1 1 1 1 1 1 1 1

| You are really on a roll!

|================================================ | 61%
| Now use sum() to count the number of 1s (heads) in flips2. It should be close to 70!

sum(flips2)
[1] 73

| Great job!
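
The mistake made above (size = 100 instead of size = 1) is easy to make, so here is a sketch contrasting the two calls. The seed and the names total_heads and each_flip are hypothetical.

```r
set.seed(11)   # arbitrary seed

# One observation: the number of heads in 100 flips.
total_heads <- rbinom(1, size = 100, prob = 0.7)

# 100 observations: each is a single flip (0 or 1).
each_flip <- rbinom(100, size = 1, prob = 0.7)

length(total_heads)   # 1
length(each_flip)     # 100
sum(each_flip)        # comparable to total_heads, expected to be around 70
```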

|=================================================== | 64%
| Similar to rbinom(), we can use R to simulate random numbers from many other
| probability distributions. Pull up the documentation for rnorm() now.

?rnorm

| You got it right!

|===================================================== | 67%
| The standard normal distribution has mean 0 and standard deviation 1. As you can see
| under the 'Usage' section in the documentation, the default values for the 'mean' and
| 'sd' arguments to rnorm() are 0 and 1, respectively. Thus, rnorm(10) will generate 10
| random numbers from a standard normal distribution. Give it a try.

rnorm(10)
[1] 0.3573471 0.7579807 -2.3097147 -0.1032675 -1.9347451 -0.2738356 -0.3452365
[8] 0.7985894 0.9606335 0.7152267

| Your dedication is inspiring!

|======================================================== | 70%
| Now do the same, except with a mean of 100 and a standard deviation of 25.

rnorm(10,mean=100,sd=25)
[1] 94.27487 100.14298 100.07092 93.08279 61.79063 102.18774 108.74859 103.19485
[9] 123.83857 143.24210

| Nice work!
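
With only 10 draws the sample mean can be far from 100. A larger sample makes the effect of the mean and sd arguments easier to see. The sample size of 10000 and the name x are arbitrary choices in this sketch.

```r
set.seed(3)   # arbitrary seed

x <- rnorm(10000, mean = 100, sd = 25)

mean(x)   # should be close to 100
sd(x)     # should be close to 25
```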

|========================================================== | 73%
| Finally, what if we want to simulate 100 groups of random numbers, each containing 5
| values generated from a Poisson distribution with mean 10? Let's start with one group
| of 5 numbers, then I'll show you how to repeat the operation 100 times in a convenient
| and compact way.

...

|============================================================= | 76%
| Generate 5 random values from a Poisson distribution with mean 10. Check out the
| documentation for rpois() if you need help.

?rpois
rpois(5,10)
[1] 14 12 13 8 8

| Your dedication is inspiring!

|=============================================================== | 79%
| Now use replicate(100, rpois(5, 10)) to perform this operation 100 times. Store the
| result in a new variable called my_pois.

my_pois<-replicate(100,rpois(5,10))

| You got it right!

|================================================================= | 82%
| Take a look at the contents of my_pois.

my_pois
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
[1,] 7 8 7 12 16 10 11 8 16 11 8 6 13 7 9
[2,] 8 8 7 12 13 11 15 12 14 14 8 12 8 11 10
[3,] 12 7 12 7 9 9 16 4 11 13 16 19 13 8 8
[4,] 8 7 11 4 5 10 10 11 9 12 8 10 10 8 6
[5,] 13 8 15 11 8 8 7 13 8 4 14 8 11 5 14
[,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29]
[1,] 9 7 10 6 8 9 9 12 6 9 1 13 8 10
[2,] 12 11 8 11 9 5 2 6 11 9 16 12 10 12
[3,] 13 14 6 8 7 8 9 7 16 12 15 11 5 11
[4,] 17 11 12 9 16 11 8 10 4 8 15 9 6 10
[5,] 8 10 10 8 12 15 8 8 12 7 9 8 14 11
[,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38] [,39] [,40] [,41] [,42] [,43]
[1,] 10 8 5 5 5 6 13 7 11 8 15 10 12 5
[2,] 17 9 8 11 11 11 7 15 10 9 6 10 8 8
[3,] 13 17 14 8 12 10 8 13 9 4 10 14 13 8
[4,] 6 17 12 11 13 10 16 3 9 8 4 12 11 15
[5,] 8 4 10 11 17 7 13 8 5 9 8 10 8 8
[,44] [,45] [,46] [,47] [,48] [,49] [,50] [,51] [,52] [,53] [,54] [,55] [,56] [,57]
[1,] 16 11 15 9 10 14 13 11 8 6 11 11 14 9
[2,] 11 12 10 7 7 7 2 11 8 9 12 6 10 10
[3,] 8 9 9 9 10 12 4 17 9 7 17 8 11 10
[4,] 14 10 13 11 5 5 8 9 5 14 9 15 9 16
[5,] 13 9 9 6 6 8 6 9 11 16 13 7 6 14
[,58] [,59] [,60] [,61] [,62] [,63] [,64] [,65] [,66] [,67] [,68] [,69] [,70] [,71]
[1,] 9 9 9 10 9 7 9 8 12 13 9 7 9 9
[2,] 8 8 13 15 10 7 16 5 13 8 12 10 10 10
[3,] 5 4 6 10 11 11 17 8 12 5 6 9 6 13
[4,] 10 12 9 11 9 15 10 6 9 14 7 11 14 11
[5,] 13 11 11 8 13 14 4 4 13 9 7 9 12 10
[,72] [,73] [,74] [,75] [,76] [,77] [,78] [,79] [,80] [,81] [,82] [,83] [,84] [,85]
[1,] 9 9 11 9 14 9 12 15 5 3 6 4 12 8
[2,] 16 8 6 9 10 8 9 3 12 7 11 10 10 12
[3,] 9 14 11 16 11 18 7 13 8 11 10 16 10 12
[4,] 11 11 11 11 9 14 6 5 8 14 14 8 16 14
[5,] 15 11 9 18 13 10 9 6 8 11 7 6 11 15
[,86] [,87] [,88] [,89] [,90] [,91] [,92] [,93] [,94] [,95] [,96] [,97] [,98] [,99]
[1,] 9 10 11 14 9 12 16 7 4 7 13 6 12 10
[2,] 14 15 9 12 13 11 10 5 10 15 12 10 9 7
[3,] 7 15 13 7 12 7 8 13 9 15 8 11 6 15
[4,] 10 10 10 6 11 16 12 9 6 10 10 13 10 9
[5,] 6 7 5 12 11 9 13 9 13 4 12 11 8 11
[,100]
[1,] 10
[2,] 8
[3,] 11
[4,] 10
[5,] 19

| You got it!

|==================================================================== | 85%
| replicate() created a matrix, each column of which contains 5 random numbers generated
| from a Poisson distribution with mean 10. Now we can find the mean of each column in
| my_pois using the colMeans() function. Store the result in a variable called cm.

cm<-colMeans(my_pois)

| You're the best!

|====================================================================== | 88%
| And let's take a look at the distribution of our column means by plotting a histogram
| with hist(cm).

hist(cm)

| That's the answer I was looking for.

|========================================================================= | 91%
| Looks like our column means are almost normally distributed, right? That's the Central
| Limit Theorem at work, but that's a lesson for another day!
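
A rough numerical check of this, assuming 1000 groups instead of 100 so that the estimates are more stable: the column means should centre on 10, and their standard deviation should be near sqrt(10 / 5), because a Poisson distribution with mean 10 also has variance 10. The names sims and cm_demo are hypothetical.

```r
set.seed(5)   # arbitrary seed

sims <- replicate(1000, rpois(5, 10))   # matrix with 5 rows and 1000 columns
dim(sims)                               # 5 1000

cm_demo <- colMeans(sims)
mean(cm_demo)   # close to 10
sd(cm_demo)     # close to sqrt(10 / 5), roughly 1.41
```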

...

|=========================================================================== | 94%
| All of the standard probability distributions are built into R, including exponential
| (rexp()), chi-squared (rchisq()), gamma (rgamma()), .... Well, you see the pattern.

...

|============================================================================== | 97%
| Simulation is practically a field of its own and we've only skimmed the surface of
| what's possible. I encourage you to explore these and other functions further on your
| own.

...

|================================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?

1: Yes
2: No

Selection: 1
What is your email address? xxxxxx@xxxxxxxxxxxx
What is your assignment token? xXxXxxXXxXxxXXXx
Grade submission succeeded!

| You are amazing!

| You've reached the end of this lesson! Returning to the main menu...

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 0

| Leaving swirl now. Type swirl() to resume.

ls()
[1] "cm" "flips" "flips2" "my_pois"
rm(list=ls())

Last updated 2020-04-20 22:36:53.783177 IST

Looking at Data

swirl()

| Welcome to swirl! Please sign in. If you've been here before, use the same name as you
| did then. If you are new, call yourself something unique.

What shall I call you? Krishnakanth Allika

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

1: Basic Building Blocks 2: Workspace and Files 3: Sequences of Numbers
4: Vectors 5: Missing Values 6: Subsetting Vectors
7: Matrices and Data Frames 8: Logic 9: Functions
10: lapply and sapply 11: vapply and tapply 12: Looking at Data
13: Simulation 14: Dates and Times 15: Base Graphics

Selection: 12

| | 0%

| Whenever you're working with a new dataset, the first thing you should do is look at
| it! What is the format of the data? What are the dimensions? What are the variable
| names? How are the variables stored? Are there missing data? Are there any flaws in the
| data?

...

|=== | 4%
| This lesson will teach you how to answer these questions and more using R's built-in
| functions. We'll be using a dataset constructed from the United States Department of
| Agriculture's PLANTS Database (http://plants.usda.gov/adv_search.html).

...

|====== | 8%
| I've stored the data for you in a variable called plants. Type ls() to list the
| variables in your workspace, among which should be plants.

play()

| Entering play mode. Experiment as you please, then type nxt() when you are ready to
| resume the lesson.

write.csv(plants,"plants.csv")
nxt()

| Resuming lesson...

| I've stored the data for you in a variable called plants. Type ls() to list the
| variables in your workspace, among which should be plants.

ls()
[1] "plants"

| You got it!

|========== | 12%
| Let's begin by checking the class of the plants variable with class(plants). This will
| give us a clue as to the overall structure of the data.

class(plants)
[1] "data.frame"

| Great job!

|============= | 16%
| It's very common for data to be stored in a data frame. It is the default class for
| data read into R using functions like read.csv() and read.table(), which you'll learn
| about in another lesson.

...

|================ | 20%
| Since the dataset is stored in a data frame, we know it is rectangular. In other words,
| it has two dimensions (rows and columns) and fits neatly into a table or spreadsheet.
| Use dim(plants) to see exactly how many rows and columns we're dealing with.

dim(plants)
[1] 5166 10

| Keep up the great work!

|=================== | 24%
| The first number you see (5166) is the number of rows (observations) and the second
| number (10) is the number of columns (variables).

...

|====================== | 28%
| You can also use nrow(plants) to see only the number of rows. Try it out.

nrow(plants)
[1] 5166

| You nailed it! Good job!

|========================== | 32%
| ... And ncol(plants) to see only the number of columns.

ncol(plants)
[1] 10

| You got it!
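
The plants data is only available inside swirl, so as a stand-in the sketch below uses airquality, a dataset that ships with base R, to show how dim(), nrow(), and ncol() relate.

```r
dim(airquality)    # 153 6
nrow(airquality)   # 153
ncol(airquality)   # 6

# dim() is simply the two values combined.
identical(dim(airquality), c(nrow(airquality), ncol(airquality)))   # TRUE
```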

|============================= | 36%
| If you are curious as to how much space the dataset is occupying in memory, you can use
| object.size(plants).

object.size(plants)
686080 bytes

| Perseverance, that's the answer.

|================================ | 40%
| Now that we have a sense of the shape and size of the dataset, let's get a feel for
| what's inside. names(plants) will return a character vector of column (i.e. variable)
| names. Give it a shot.

names(plants)
[1] "Scientific_Name" "Duration" "Active_Growth_Period"
[4] "Foliage_Color" "pH_Min" "pH_Max"
[7] "Precip_Min" "Precip_Max" "Shade_Tolerance"
[10] "Temp_Min_F"

| You are quite good my friend!

|=================================== | 44%
| We've applied fairly descriptive variable names to this dataset, but that won't always
| be the case. A logical next step is to peek at the actual data. However, our dataset
| contains over 5000 observations (rows), so it's impractical to view the whole thing all
| at once.

...

|====================================== | 48%
| The head() function allows you to preview the top of the dataset. Give it a try with
| only one argument.

head(plants)
               Scientific_Name          Duration Active_Growth_Period Foliage_Color
1                  Abelmoschus              <NA>                 <NA>          <NA>
2       Abelmoschus esculentus Annual, Perennial                 <NA>          <NA>
3                        Abies              <NA>                 <NA>          <NA>
4               Abies balsamea         Perennial    Spring and Summer         Green
5 Abies balsamea var. balsamea         Perennial                 <NA>          <NA>
6                     Abutilon              <NA>                 <NA>          <NA>
  pH_Min pH_Max Precip_Min Precip_Max Shade_Tolerance Temp_Min_F
1     NA     NA         NA         NA            <NA>         NA
2     NA     NA         NA         NA            <NA>         NA
3     NA     NA         NA         NA            <NA>         NA
4      4      6         13         60        Tolerant        -43
5     NA     NA         NA         NA            <NA>         NA
6     NA     NA         NA         NA            <NA>         NA

| All that hard work is paying off!

|========================================== | 52%
| Take a minute to look through and understand the output above. Each row is labeled with
| the observation number and each column with the variable name. Your screen is probably
| not wide enough to view all 10 columns side-by-side, in which case R displays as many
| columns as it can on each line before continuing on the next.

...

|============================================= | 56%
| By default, head() shows you the first six rows of the data. You can alter this
| behavior by passing as a second argument the number of rows you'd like to view. Use
| head() to preview the first 10 rows of plants.

head(plants,n=10)
                     Scientific_Name          Duration Active_Growth_Period Foliage_Color
1                        Abelmoschus              <NA>                 <NA>          <NA>
2             Abelmoschus esculentus Annual, Perennial                 <NA>          <NA>
3                              Abies              <NA>                 <NA>          <NA>
4                     Abies balsamea         Perennial    Spring and Summer         Green
5       Abies balsamea var. balsamea         Perennial                 <NA>          <NA>
6                           Abutilon              <NA>                 <NA>          <NA>
7               Abutilon theophrasti            Annual                 <NA>          <NA>
8                             Acacia              <NA>                 <NA>          <NA>
9                  Acacia constricta         Perennial    Spring and Summer         Green
10 Acacia constricta var. constricta         Perennial                 <NA>          <NA>
   pH_Min pH_Max Precip_Min Precip_Max Shade_Tolerance Temp_Min_F
1      NA     NA         NA         NA            <NA>         NA
2      NA     NA         NA         NA            <NA>         NA
3      NA     NA         NA         NA            <NA>         NA
4       4    6.0         13         60        Tolerant        -43
5      NA     NA         NA         NA            <NA>         NA
6      NA     NA         NA         NA            <NA>         NA
7      NA     NA         NA         NA            <NA>         NA
8      NA     NA         NA         NA            <NA>         NA
9       7    8.5          4         20      Intolerant        -13
10     NA     NA         NA         NA            <NA>         NA

| You are really on a roll!

|================================================ | 60%
| The same applies for using tail() to preview the end of the dataset. Use tail() to view
| the last 15 rows.

tail(plants,n=15)
                      Scientific_Name  Duration Active_Growth_Period Foliage_Color pH_Min
5152                          Zizania      <NA>                 <NA>          <NA>     NA
5153                 Zizania aquatica    Annual               Spring         Green    6.4
5154   Zizania aquatica var. aquatica    Annual                 <NA>          <NA>     NA
5155                Zizania palustris    Annual                 <NA>          <NA>     NA
5156 Zizania palustris var. palustris    Annual                 <NA>          <NA>     NA
5157                      Zizaniopsis      <NA>                 <NA>          <NA>     NA
5158             Zizaniopsis miliacea Perennial    Spring and Summer         Green    4.3
5159                            Zizia      <NA>                 <NA>          <NA>     NA
5160                     Zizia aptera Perennial                 <NA>          <NA>     NA
5161                      Zizia aurea Perennial                 <NA>          <NA>     NA
5162                 Zizia trifoliata Perennial                 <NA>          <NA>     NA
5163                          Zostera      <NA>                 <NA>          <NA>     NA
5164                   Zostera marina Perennial                 <NA>          <NA>     NA
5165                           Zoysia      <NA>                 <NA>          <NA>     NA
5166                  Zoysia japonica Perennial                 <NA>          <NA>     NA
     pH_Max Precip_Min Precip_Max Shade_Tolerance Temp_Min_F
5152     NA         NA         NA            <NA>         NA
5153    7.4         30         50      Intolerant         32
5154     NA         NA         NA            <NA>         NA
5155     NA         NA         NA            <NA>         NA
5156     NA         NA         NA            <NA>         NA
5157     NA         NA         NA            <NA>         NA
5158    9.0         35         70      Intolerant         12
5159     NA         NA         NA            <NA>         NA
5160     NA         NA         NA            <NA>         NA
5161     NA         NA         NA            <NA>         NA
5162     NA         NA         NA            <NA>         NA
5163     NA         NA         NA            <NA>         NA
5164     NA         NA         NA            <NA>         NA
5165     NA         NA         NA            <NA>         NA
5166     NA         NA         NA            <NA>         NA

| You're the best!

|=================================================== | 64%
| After previewing the top and bottom of the data, you probably noticed lots of NAs,
| which are R's placeholders for missing values. Use summary(plants) to get a better feel
| for how each variable is distributed and how much of the dataset is missing.

summary(plants)
                     Scientific_Name              Duration
 Abelmoschus                 :   1   Perennial        :3031
 Abelmoschus esculentus      :   1   Annual           : 682
 Abies                       :   1   Annual, Perennial: 179
 Abies balsamea              :   1   Annual, Biennial :  95
 Abies balsamea var. balsamea:   1   Biennial         :  57
 Abutilon                    :   1   (Other)          :  92
 (Other)                     :5160   NA's             :1030
   Active_Growth_Period          Foliage_Color     pH_Min           pH_Max
 Spring and Summer   : 447   Dark Green  :  82   Min.   :3.000   Min.   : 5.100
 Spring              : 144   Gray-Green  :  25   1st Qu.:4.500   1st Qu.: 7.000
 Spring, Summer, Fall:  95   Green       : 692   Median :5.000   Median : 7.300
 Summer              :  92   Red         :   4   Mean   :4.997   Mean   : 7.344
 Summer and Fall     :  24   White-Gray  :   9   3rd Qu.:5.500   3rd Qu.: 7.800
 (Other)             :  30   Yellow-Green:  20   Max.   :7.000   Max.   :10.000
 NA's                :4334   NA's        :4334   NA's   :4327    NA's   :4327
   Precip_Min      Precip_Max         Shade_Tolerance   Temp_Min_F
 Min.   : 4.00   Min.   : 16.00   Intermediate: 242   Min.   :-79.00
 1st Qu.:16.75   1st Qu.: 55.00   Intolerant  : 349   1st Qu.:-38.00
 Median :28.00   Median : 60.00   Tolerant    : 246   Median :-33.00
 Mean   :25.57   Mean   : 58.73   NA's        :4329   Mean   :-22.53
 3rd Qu.:32.00   3rd Qu.: 60.00                       3rd Qu.:-18.00
 Max.   :60.00   Max.   :200.00                       Max.   : 52.00
 NA's   :4338    NA's   :4338                         NA's   :4328

| Keep up the great work!

|====================================================== | 68%
| summary() provides different output for each variable, depending on its class. For
| numeric data such as Precip_Min, summary() displays the minimum, 1st quartile, median,
| mean, 3rd quartile, and maximum. These values help us understand how the data are
| distributed.

...

|========================================================== | 72%
| For categorical variables (called 'factor' variables in R), summary() displays the
| number of times each value (or 'level') occurs in the data. For example, each value of
| Scientific_Name only appears once, since it is unique to a specific plant. In contrast,
| the summary for Duration (also a factor variable) tells us that our dataset contains
| 3031 Perennial plants, 682 Annual plants, etc.
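
The class-dependent behaviour of summary() can be seen on a tiny, made-up data frame. The data frame df and its columns rain and shade are hypothetical and only loosely modelled on the plants columns.

```r
# Hypothetical toy data with one numeric and one factor column, each with one NA.
df <- data.frame(
  rain  = c(4, 13, NA, 30, 35),
  shade = factor(c("Tolerant", "Intolerant", NA, "Intolerant", "Tolerant"))
)

summary(df$rain)    # Min., quartiles, Median, Mean, Max., plus the NA count
summary(df$shade)   # one count per level, plus NA's
```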

...

|============================================================= | 76%
| You can see that R truncated the summary for Active_Growth_Period by including a
| catch-all category called 'Other'. Since it is a categorical/factor variable, we can
| see how many times each value actually occurs in the data with
| table(plants$Active_Growth_Period).

table(plants$Active_Growth_Period)

Fall, Winter and Spring                  Spring         Spring and Fall
                     15                     144                      10
      Spring and Summer    Spring, Summer, Fall                  Summer
                    447                      95                      92
        Summer and Fall              Year Round
                     24                       5

| Your dedication is inspiring!
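
The counts above add up to 832, not 5166, because table() leaves out missing values by default. A small sketch with a made-up factor (the name growth is hypothetical) shows how the useNA argument brings them back:

```r
# Hypothetical factor with two missing values.
growth <- factor(c("Spring", "Summer", NA, "Spring", NA))

table(growth)                    # NA values are dropped by default
table(growth, useNA = "ifany")   # adds a separate count for NA
```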

|================================================================ | 80%
| Each of the functions we've introduced so far has its place in helping you to better
| understand the structure of your data. However, we've left the best for last....

...

|=================================================================== | 84%
| Perhaps the most useful and concise function for understanding the structure of your
| data is str(). Give it a try now.

str(plants)
'data.frame': 5166 obs. of 10 variables:
 $ Scientific_Name     : Factor w/ 5166 levels "Abelmoschus",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Duration            : Factor w/ 8 levels "Annual","Annual, Biennial",..: NA 4 NA 7 7 NA 1 NA 7 7 ...
 $ Active_Growth_Period: Factor w/ 8 levels "Fall, Winter and Spring",..: NA NA NA 4 NA NA NA NA 4 NA ...
 $ Foliage_Color       : Factor w/ 6 levels "Dark Green","Gray-Green",..: NA NA NA 3 NA NA NA NA 3 NA ...
 $ pH_Min              : num  NA NA NA 4 NA NA NA NA 7 NA ...
 $ pH_Max              : num  NA NA NA 6 NA NA NA NA 8.5 NA ...
 $ Precip_Min          : int  NA NA NA 13 NA NA NA NA 4 NA ...
 $ Precip_Max          : int  NA NA NA 60 NA NA NA NA 20 NA ...
 $ Shade_Tolerance     : Factor w/ 3 levels "Intermediate",..: NA NA NA 3 NA NA NA NA 2 NA ...
 $ Temp_Min_F          : int  NA NA NA -43 NA NA NA NA -13 NA ...

| That's correct!

|====================================================================== | 88%
| The beauty of str() is that it combines many of the features of the other functions
| you've already seen, all in a concise and readable format. At the very top, it tells us
| that the class of plants is 'data.frame' and that it has 5166 observations and 10
| variables. It then gives us the name and class of each variable, as well as a preview
| of its contents.

...

|========================================================================== | 92%
| str() is actually a very general function that you can use on most objects in R. Any
| time you want to understand the structure of something (a dataset, function, etc.),
| str() is a good place to start.
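
To illustrate how general str() is, the sketch below calls it on a vector, a list, and a function. The example objects are arbitrary.

```r
str(1:5)                       # an integer vector
str(list(a = 1, b = "text"))   # a list with two named elements
str(mean)                      # a function: shows its arguments
```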

...

|============================================================================= | 96%
| In this lesson, you learned how to get a feel for the structure and contents of a new
| dataset using a collection of simple and useful functions. Taking the time to do this
| upfront can save you time and frustration later on in your analysis.

...

|================================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?

1: No
2: Yes

Selection: 2
What is your email address? xxxxxx@xxxxxxxxxxxx
What is your assignment token? xXxXxxXXxXxxXXXx
Grade submission succeeded!

| All that hard work is paying off!

| You've reached the end of this lesson! Returning to the main menu...

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 0

| Leaving swirl now. Type swirl() to resume.

ls()
[1] "plants"
rm(list=ls())

Last updated 2020-04-20 21:28:58.953203 IST

vapply and tapply

swirl()

| Welcome to swirl! Please sign in. If you've been here before, use the same name as you
| did then. If you are new, call yourself something unique.

What shall I call you? Krishnakanth Allika

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

1: Basic Building Blocks 2: Workspace and Files 3: Sequences of Numbers
4: Vectors 5: Missing Values 6: Subsetting Vectors
7: Matrices and Data Frames 8: Logic 9: Functions
10: lapply and sapply 11: vapply and tapply 12: Looking at Data
13: Simulation 14: Dates and Times 15: Base Graphics

Selection: 11

| | 0%

| In the last lesson, you learned about the two most fundamental members of R's *apply
| family of functions: lapply() and sapply(). Both take a list as input, apply a function
| to each element of the list, then combine and return the result. lapply() always
| returns a list, whereas sapply() attempts to simplify the result.
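
As a short refresher, the sketch below contrasts the two return types. It also shows that sapply() falls back to a list when the results have different lengths, which is why sapply(flags, unique) later in this lesson returns a list. The lists nums and uneven are made up for illustration.

```r
nums <- list(a = 1:3, b = 4:6)

lapply(nums, sum)   # always a list: $a is 6, $b is 15
sapply(nums, sum)   # simplified to a named vector: a = 6, b = 15

# Results of different lengths cannot be simplified, so sapply() returns a list.
uneven <- list(a = c(1, 1, 2), b = c(3, 3))
sapply(uneven, unique)
```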

...

|=== | 4%
| In this lesson, you'll learn how to use vapply() and tapply(), each of which serves a
| very specific purpose within the Split-Apply-Combine methodology. For consistency,
| we'll use the same dataset we used in the 'lapply and sapply' lesson.

...

|====== | 8%
| The Flags dataset from the UCI Machine Learning Repository contains details of various
| nations and their flags. More information may be found here:
| http://archive.ics.uci.edu/ml/datasets/Flags

...

|========== | 12%
| I've stored the data in a variable called flags. If it's been a while since you
| completed the 'lapply and sapply' lesson, you may want to reacquaint yourself with the
| data by using functions like dim(), head(), str(), and summary() when you return to the
| prompt (>). You can also type viewinfo() at the prompt to bring up some documentation
| for the dataset. Let's get started!

...

|============= | 16%
| As you saw in the last lesson, the unique() function returns a vector of the unique
| values contained in the object passed to it. Therefore, sapply(flags, unique) returns a
| list containing one vector of unique values for each column of the flags dataset. Try
| it again now.

sapply(flags, unique)
$name
[1] Afghanistan Albania Algeria
[4] American-Samoa Andorra Angola
[7] Anguilla Antigua-Barbuda Argentina
[10] Argentine Australia Austria
[13] Bahamas Bahrain Bangladesh
[16] Barbados Belgium Belize
[19] Benin Bermuda Bhutan
[22] Bolivia Botswana Brazil
[25] British-Virgin-Isles Brunei Bulgaria
[28] Burkina Burma Burundi
[31] Cameroon Canada Cape-Verde-Islands
[34] Cayman-Islands Central-African-Republic Chad
[37] Chile China Colombia
[40] Comorro-Islands Congo Cook-Islands
[43] Costa-Rica Cuba Cyprus
[46] Czechoslovakia Denmark Djibouti
[49] Dominica Dominican-Republic Ecuador
[52] Egypt El-Salvador Equatorial-Guinea
[55] Ethiopia Faeroes Falklands-Malvinas
[58] Fiji Finland France
[61] French-Guiana French-Polynesia Gabon
[64] Gambia Germany-DDR Germany-FRG
[67] Ghana Gibraltar Greece
[70] Greenland Grenada Guam
[73] Guatemala Guinea Guinea-Bissau
[76] Guyana Haiti Honduras
[79] Hong-Kong Hungary Iceland
[82] India Indonesia Iran
[85] Iraq Ireland Israel
[88] Italy Ivory-Coast Jamaica
[91] Japan Jordan Kampuchea
[94] Kenya Kiribati Kuwait
[97] Laos Lebanon Lesotho
[100] Liberia Libya Liechtenstein
[103] Luxembourg Malagasy Malawi
[106] Malaysia Maldive-Islands Mali
[109] Malta Marianas Mauritania
[112] Mauritius Mexico Micronesia
[115] Monaco Mongolia Montserrat
[118] Morocco Mozambique Nauru
[121] Nepal Netherlands Netherlands-Antilles
[124] New-Zealand Nicaragua Niger
[127] Nigeria Niue North-Korea
[130] North-Yemen Norway Oman
[133] Pakistan Panama Papua-New-Guinea
[136] Parguay Peru Philippines
[139] Poland Portugal Puerto-Rico
[142] Qatar Romania Rwanda
[145] San-Marino Sao-Tome Saudi-Arabia
[148] Senegal Seychelles Sierra-Leone
[151] Singapore Soloman-Islands Somalia
[154] South-Africa South-Korea South-Yemen
[157] Spain Sri-Lanka St-Helena
[160] St-Kitts-Nevis St-Lucia St-Vincent
[163] Sudan Surinam Swaziland
[166] Sweden Switzerland Syria
[169] Taiwan Tanzania Thailand
[172] Togo Tonga Trinidad-Tobago
[175] Tunisia Turkey Turks-Cocos-Islands
[178] Tuvalu UAE Uganda
[181] UK Uruguay US-Virgin-Isles
[184] USA USSR Vanuatu
[187] Vatican-City Venezuela Vietnam
[190] Western-Samoa Yugoslavia Zaire
[193] Zambia Zimbabwe
194 Levels: Afghanistan Albania Algeria American-Samoa Andorra Angola ... Zimbabwe

$landmass
[1] 5 3 4 6 1 2

$zone
[1] 1 3 2 4

$area
[1] 648 29 2388 0 1247 2777 7690 84 19 1 143 31 23 113
[15] 47 1099 600 8512 6 111 274 678 28 474 9976 4 623 1284
[29] 757 9561 1139 2 342 51 115 9 128 43 22 49 284 1001
[43] 21 1222 12 18 337 547 91 268 10 108 249 239 132 2176
[57] 109 246 36 215 112 93 103 3268 1904 1648 435 70 301 323
[71] 11 372 98 181 583 236 30 1760 3 587 118 333 1240 1031
[85] 1973 1566 447 783 140 41 1267 925 121 195 324 212 804 76
[99] 463 407 1285 300 313 92 237 26 2150 196 72 637 1221 99
[113] 288 505 66 2506 63 17 450 185 945 514 57 5 164 781
[127] 245 178 9363 22402 15 912 256 905 753 391

$population
[1] 16 3 20 0 7 28 15 8 90 10 1 6 119 9 35 4 24
[18] 2 11 1008 5 47 31 54 17 61 14 684 157 39 57 118 13 77
[35] 12 56 18 84 48 36 22 29 38 49 45 231 274 60

$language
[1] 10 6 8 1 2 4 3 5 7 9

$religion
[1] 2 6 1 0 5 3 4 7

$bars
[1] 0 2 3 1 5

$stripes
[1] 3 0 2 1 5 9 11 14 4 6 13 7

$colours
[1] 5 3 2 8 6 4 7 1

$red
[1] 1 0

$green
[1] 1 0

$blue
[1] 0 1

$gold
[1] 1 0

$white
[1] 1 0

$black
[1] 1 0

$orange
[1] 0 1

$mainhue
[1] green red blue gold white orange black brown
Levels: black blue brown gold green orange red white

$circles
[1] 0 1 4 2

$crosses
[1] 0 1 2

$saltires
[1] 0 1

$quarters
[1] 0 1 4

$sunstars
[1] 1 0 6 22 14 3 4 5 15 10 7 2 9 50

$crescent
[1] 0 1

$triangle
[1] 0 1

$icon
[1] 1 0

$animate
[1] 0 1

$text
[1] 0 1

$topleft
[1] black red green blue white orange gold
Levels: black blue gold green orange red white

$botright
[1] green red white black blue gold orange brown
Levels: black blue brown gold green orange red white

| You got it right!

|================ | 20%
| What if you had forgotten how unique() works and mistakenly thought it returns the
| number of unique values contained in the object passed to it? Then you might have
| incorrectly expected sapply(flags, unique) to return a numeric vector, since each
| element of the list returned would contain a single number and sapply() could then
| simplify the result to a vector.

...

|=================== | 24%
| When working interactively (at the prompt), this is not much of a problem, since you
| see the result immediately and will quickly recognize your mistake. However, when
| working non-interactively (e.g. writing your own functions), a misunderstanding may go
| undetected and cause incorrect results later on. Therefore, you may wish to be more
| careful and that's where vapply() is useful.

...

|====================== | 28%
| Whereas sapply() tries to 'guess' the correct format of the result, vapply() allows you
| to specify it explicitly. If the result doesn't match the format you specify, vapply()
| will throw an error, causing the operation to stop. This can prevent significant
| problems in your code that might be caused by getting unexpected return values from
| sapply().

...

|========================== | 32%
| Try vapply(flags, unique, numeric(1)), which says that you expect each element of the
| result to be a numeric vector of length 1. Since this is NOT actually the case, YOU
| WILL GET AN ERROR. Once you get the error, type ok() to continue to the next question.

vapply(flags, unique, numeric(1))
Error in vapply(flags, unique, numeric(1)) : values must be length 1,
but FUN(X[[1]]) result is length 194
ok()

| That's correct!
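
The contrast between the two functions can be reproduced without the flags data. The sketch below is a minimal illustration on a made-up data frame (the name `toy` and its values are assumptions, not part of the lesson). It shows sapply() quietly returning a list, vapply() raising an error that we capture with tryCatch(), and a version whose result really is one number per column.

```r
# Hypothetical toy data, not the flags dataset
toy <- data.frame(a = c(1, 1, 2), b = c(3, 4, 5))

# sapply() cannot simplify results of different lengths, so it returns a list
loose <- sapply(toy, unique)
is.list(loose)

# vapply() refuses the same call; capture the message instead of stopping
msg <- tryCatch(
  vapply(toy, unique, numeric(1)),
  error = function(e) conditionMessage(e)
)
print(msg)

# Counting the unique values does give exactly one integer per column
n_unique <- vapply(toy, function(col) length(unique(col)), integer(1))
print(n_unique)
```

Wrapping the call in tryCatch() is only for demonstration; inside your own functions you would normally let the error stop execution.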

|============================= | 36%
| Recall from the previous lesson that sapply(flags, class) will return a character
| vector containing the class of each column in the dataset. Try that again now to see
| the result.

sapply(flags, class)
name landmass zone area population language religion bars
"factor" "integer" "integer" "integer" "integer" "integer" "integer" "integer"
stripes colours red green blue gold white black
"integer" "integer" "integer" "integer" "integer" "integer" "integer" "integer"
orange mainhue circles crosses saltires quarters sunstars crescent
"integer" "factor" "integer" "integer" "integer" "integer" "integer" "integer"
triangle icon animate text topleft botright
"integer" "integer" "integer" "integer" "factor" "factor"

| That's a job well done!

|================================ | 40%
| If we wish to be explicit about the format of the result we expect, we can use
| vapply(flags, class, character(1)). The 'character(1)' argument tells R that we expect
| the class function to return a character vector of length 1 when applied to EACH column
| of the flags dataset. Try it now.

vapply(flags, class,character(1))
name landmass zone area population language religion bars
"factor" "integer" "integer" "integer" "integer" "integer" "integer" "integer"
stripes colours red green blue gold white black
"integer" "integer" "integer" "integer" "integer" "integer" "integer" "integer"
orange mainhue circles crosses saltires quarters sunstars crescent
"integer" "factor" "integer" "integer" "integer" "integer" "integer" "integer"
triangle icon animate text topleft botright
"integer" "integer" "integer" "integer" "factor" "factor"

| You are quite good my friend!

|=================================== | 44%
| Note that since our expectation was correct (i.e. character(1)), the vapply() result is
| identical to the sapply() result -- a character vector of column classes.
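
As a quick check of that claim, the sketch below compares the two results with identical() on a small made-up data frame (the name `toy` is an assumption for illustration).

```r
# Hypothetical toy data with one integer and one factor column
toy <- data.frame(id = 1:3,
                  label = c("x", "y", "z"),
                  stringsAsFactors = TRUE)

s <- sapply(toy, class)                 # format guessed by sapply()
v <- vapply(toy, class, character(1))   # format stated explicitly

identical(s, v)
```

One caveat worth keeping in mind: class() can return more than one string for some objects (a POSIXct date-time column, for example), in which case the character(1) template would make vapply() stop with an error.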

...

|====================================== | 48%
| You might think of vapply() as being 'safer' than sapply(), since it requires you to
| specify the format of the output in advance, instead of just allowing R to 'guess' what
| you wanted. In addition, vapply() may perform faster than sapply() for large datasets.
| However, when doing data analysis interactively (at the prompt), sapply() saves you
| some typing and will often be good enough.

...

|========================================== | 52%
| As a data analyst, you'll often wish to split your data up into groups based on the
| value of some variable, then apply a function to the members of each group. The next
| function we'll look at, tapply(), does exactly that.

...

|============================================= | 56%
| Use ?tapply to pull up the documentation.

?tapply

| All that hard work is paying off!

|================================================ | 60%
| The 'landmass' variable in our dataset takes on integer values between 1 and 6, each of
| which represents a different part of the world. Use table(flags$landmass) to see how
| many flags/countries fall into each group.

table(flags$landmass)

1 2 3 4 5 6
31 17 35 52 39 20

| You're the best!

|=================================================== | 64%
| The 'animate' variable in our dataset takes the value 1 if a country's flag contains an
| animate image (e.g. an eagle, a tree, a human hand) and 0 otherwise. Use
| table(flags$animate) to see how many flags contain an animate image.

table(flags$animate)

0 1
155 39

| Nice work!

|====================================================== | 68%
| This tells us that 39 flags contain an animate object (animate = 1) and 155 do not
| (animate = 0).

...

|========================================================== | 72%
| If you take the arithmetic mean of a bunch of 0s and 1s, you get the proportion of 1s.
| Use tapply(flags$animate, flags$landmass, mean) to apply the mean function to the
| 'animate' variable separately for each of the six landmass groups, thus giving us the
| proportion of flags containing an animate image WITHIN each landmass group.

tapply(flags$animate, flags$landmass, mean)
1 2 3 4 5 6
0.4193548 0.1764706 0.1142857 0.1346154 0.1538462 0.3000000

| You are doing so well!
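
The same idea on a handful of made-up values may make the grouping easier to follow. The vectors `animate_toy` and `landmass_toy` below are hypothetical and only mimic the shape of the real columns.

```r
# Hypothetical indicator (1 = animate image present) and grouping variable
animate_toy  <- c(1, 0, 0, 1, 1, 0)
landmass_toy <- c(1, 1, 2, 2, 2, 3)

# Mean of the 0/1 indicator within each group = proportion of 1s per group
props <- tapply(animate_toy, landmass_toy, mean)
print(props)
```

Group 1 holds the values 1 and 0, so its proportion is 1/2; group 2 holds 0, 1, 1, giving 2/3; group 3 holds a single 0.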

|============================================================= | 76%
| The first landmass group (landmass = 1) corresponds to North America and contains the
| highest proportion of flags with an animate image (0.4194).

...

|================================================================ | 80%
| Similarly, we can look at a summary of population values (in round millions) for
| countries with and without the color red on their flag with
| tapply(flags$population, flags$red, summary).

tapply(flags$population,flags$red, summary)
$0
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00 0.00 3.00 27.63 9.00 684.00

$1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0 0.0 4.0 22.1 15.0 1008.0

| Great job!
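
If only one statistic is needed, the function passed to tapply() can return just that value, and the result is then a named vector rather than a list of summaries. A minimal sketch on made-up numbers (`pop_toy` and `red_toy` are hypothetical, not taken from flags):

```r
# Hypothetical populations and a 0/1 grouping indicator
pop_toy <- c(2, 0, 10, 5, 20, 1)
red_toy <- c(0, 0, 0, 1, 1, 1)

# One median per group, indexed by the group label
med <- tapply(pop_toy, red_toy, median)
print(med)

# Pull out a single group by its label (a character string)
med[["0"]]
```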

|=================================================================== | 84%
| What is the median population (in millions) for countries without the color red on
| their flag?

1: 22.1
2: 27.6
3: 4.0
4: 3.0
5: 0.0
6: 9.0

Selection: 4

| That's correct!

|====================================================================== | 88%
| Lastly, use the same approach to look at a summary of population values for each of the
| six landmasses.

tapply(flags$population,flags$landmass, summary)
$1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00 0.00 0.00 12.29 4.50 231.00

$2
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00 1.00 6.00 15.71 15.00 119.00

$3
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00 0.00 8.00 13.86 16.00 61.00

$4
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000 1.000 5.000 8.788 9.750 56.000

$5
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00 2.00 10.00 69.18 39.00 1008.00

$6
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00 0.00 0.00 11.30 1.25 157.00

| You are doing so well!

|========================================================================== | 92%
| What is the maximum population (in millions) for the fourth landmass group (Africa)?

1: 5.00
2: 56.00
3: 157.00
4: 119.0
5: 1010.0

Selection: 2

| Nice work!

|============================================================================= | 96%
| In this lesson, you learned how to use vapply() as a safer alternative to sapply(),
| which is most helpful when writing your own functions. You also learned how to use
| tapply() to split your data into groups based on the value of some variable, then apply
| a function to each group. These functions will come in handy on your quest to become a
| better data analyst.

...

|================================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?

1: Yes
2: No

Selection: 1
What is your email address? xxxxxx@xxxxxxxxxxxx
What is your assignment token? xXxXxxXXxXxxXXXx
Grade submission succeeded!

| That's the answer I was looking for.

| You've reached the end of this lesson! Returning to the main menu...

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 0

| Leaving swirl now. Type swirl() to resume.

ls()
[1] "flags" "ok" "viewinfo"
rm(list=ls())

Last updated 2020-04-18 20:41:25.391453 IST

lapply and sapply


swirl()

| Welcome to swirl! Please sign in. If you've been here before, use the same name as you
| did then. If you are new, call yourself something unique.

What shall I call you? Krishnakanth Allika

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

1: Basic Building Blocks 2: Workspace and Files 3: Sequences of Numbers
4: Vectors 5: Missing Values 6: Subsetting Vectors
7: Matrices and Data Frames 8: Logic 9: Functions
10: lapply and sapply 11: vapply and tapply 12: Looking at Data
13: Simulation 14: Dates and Times 15: Base Graphics

Selection: 10

| | 0%

| In this lesson, you'll learn how to use lapply() and sapply(), the two most important
| members of R's *apply family of functions, also known as loop functions.

...

|== | 2%
| These powerful functions, along with their close relatives (vapply() and tapply(),
| among others) offer a concise and convenient means of implementing the
| Split-Apply-Combine strategy for data analysis.

...

|=== | 4%
| Each of the *apply functions will SPLIT up some data into smaller pieces, APPLY a
| function to each piece, then COMBINE the results. A more detailed discussion of this
| strategy is found in Hadley Wickham's Journal of Statistical Software paper titled 'The
| Split-Apply-Combine Strategy for Data Analysis'.
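
To make the three steps concrete, here is a minimal sketch that performs them one at a time by hand. The data frame `toy` and its values are assumptions made up for illustration.

```r
# Hypothetical toy data: a numeric value and a grouping variable
toy <- data.frame(value = c(1, 2, 3, 4, 5, 6),
                  group = c("a", "a", "b", "b", "b", "c"))

# SPLIT: one vector of values per group
pieces <- split(toy$value, toy$group)

# APPLY: compute the mean of each piece (lapply() returns a list)
means_list <- lapply(pieces, mean)

# COMBINE: collapse the list into a named numeric vector
means_vec <- unlist(means_list)
print(means_vec)
```

The *apply functions bundle some or all of these steps into a single call.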

...

|===== | 6%
| Throughout this lesson, we'll use the Flags dataset from the UCI Machine Learning
| Repository. This dataset contains details of various nations and their flags. More
| information may be found here: http://archive.ics.uci.edu/ml/datasets/Flags

...

|====== | 8%
| Let's jump right in so you can get a feel for how these special functions work!

...

|======== | 10%
| I've stored the dataset in a variable called flags. Type head(flags) to preview the
| first six lines (i.e. the 'head') of the dataset.

head(flags)
name landmass zone area population language religion bars stripes colours red
1 Afghanistan 5 1 648 16 10 2 0 3 5 1
2 Albania 3 1 29 3 6 6 0 0 3 1
3 Algeria 4 1 2388 20 8 2 2 0 3 1
4 American-Samoa 6 3 0 0 1 1 0 0 5 1
5 Andorra 3 1 0 0 6 0 3 0 3 1
6 Angola 4 2 1247 7 10 5 0 2 3 1
green blue gold white black orange mainhue circles crosses saltires quarters sunstars
1 1 0 1 1 1 0 green 0 0 0 0 1
2 0 0 1 0 1 0 red 0 0 0 0 1
3 1 0 0 1 0 0 green 0 0 0 0 1
4 0 1 1 1 0 1 blue 0 0 0 0 0
5 0 1 1 0 0 0 gold 0 0 0 0 0
6 0 0 1 0 1 0 red 0 0 0 0 1
crescent triangle icon animate text topleft botright
1 0 0 1 0 0 black green
2 0 0 0 1 0 red red
3 1 0 0 0 0 green white
4 0 1 1 1 0 blue red
5 0 0 0 0 0 blue red
6 0 0 1 0 0 red black

| Keep up the great work!

|========== | 12%
| You may need to scroll up to see all of the output. Now, let's check out the dimensions
| of the dataset using dim(flags).

dim(flags)
[1] 194 30

| Perseverance, that's the answer.

|=========== | 14%
| This tells us that there are 194 rows, or observations, and 30 columns, or variables.
| Each observation is a country and each variable describes some characteristic of that
| country or its flag. To open a more complete description of the dataset in a separate
| text file, type viewinfo() when you are back at the prompt (>).

...

|============= | 16%
| As with any dataset, we'd like to know in what format the variables have been stored.
| In other words, what is the 'class' of each variable? What happens if we do
| class(flags)? Try it out.

class(flags)
[1] "data.frame"

| Nice work!

|============== | 18%
| That just tells us that the entire dataset is stored as a 'data.frame', which doesn't
| answer our question. What we really need is to call the class() function on each
| individual column. While we could do this manually (i.e. one column at a time) it's
| much faster if we can automate the process. Sounds like a loop!

...

|================ | 20%
| The lapply() function takes a list as input, applies a function to each element of the
| list, then returns a list of the same length as the original one. Since a data frame is
| really just a list of vectors (you can see this with as.list(flags)), we can use
| lapply() to apply the class() function to each column of the flags dataset. Let's see
| it in action!
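
The claim that a data frame is a list of columns can be checked directly. The sketch below uses a small made-up data frame (`toy` is a hypothetical name) instead of flags.

```r
# Hypothetical two-column data frame
toy <- data.frame(id = 1:3,
                  label = c("x", "y", "z"),
                  stringsAsFactors = FALSE)

is.list(toy)    # a data frame is a list...
length(toy)     # ...whose length is the number of columns

# lapply() therefore visits one column at a time
col_classes <- lapply(toy, class)
str(col_classes)
```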

...

|================== | 22%
| Type cls_list <- lapply(flags, class) to apply the class() function to each column of
| the flags dataset and store the result in a variable called cls_list. Note that you
| just supply the name of the function you want to apply (i.e. class), without the usual
| parentheses after it.


cls_list<-lapply(flags,class)

| Excellent work!

|=================== | 24%
| Type cls_list to view the result.

cls_list
$name
[1] "factor"

$landmass
[1] "integer"

$zone
[1] "integer"

$area
[1] "integer"

$population
[1] "integer"

$language
[1] "integer"

$religion
[1] "integer"

$bars
[1] "integer"

$stripes
[1] "integer"

$colours
[1] "integer"

$red
[1] "integer"

$green
[1] "integer"

$blue
[1] "integer"

$gold
[1] "integer"

$white
[1] "integer"

$black
[1] "integer"

$orange
[1] "integer"

$mainhue
[1] "factor"

$circles
[1] "integer"

$crosses
[1] "integer"

$saltires
[1] "integer"

$quarters
[1] "integer"

$sunstars
[1] "integer"

$crescent
[1] "integer"

$triangle
[1] "integer"

$icon
[1] "integer"

$animate
[1] "integer"

$text
[1] "integer"

$topleft
[1] "factor"

$botright
[1] "factor"

| You are really on a roll!

|===================== | 26%
| The 'l' in 'lapply' stands for 'list'. Type class(cls_list) to confirm that lapply()
| returned a list.

play()

| Entering play mode. Experiment as you please, then type nxt() when you are ready to
| resume the lesson.

str(cls_list)
List of 30
 $ name      : chr "factor"
 $ landmass  : chr "integer"
 $ zone      : chr "integer"
 $ area      : chr "integer"
 $ population: chr "integer"
 $ language  : chr "integer"
 $ religion  : chr "integer"
 $ bars      : chr "integer"
 $ stripes   : chr "integer"
 $ colours   : chr "integer"
 $ red       : chr "integer"
 $ green     : chr "integer"
 $ blue      : chr "integer"
 $ gold      : chr "integer"
 $ white     : chr "integer"
 $ black     : chr "integer"
 $ orange    : chr "integer"
 $ mainhue   : chr "factor"
 $ circles   : chr "integer"
 $ crosses   : chr "integer"
 $ saltires  : chr "integer"
 $ quarters  : chr "integer"
 $ sunstars  : chr "integer"
 $ crescent  : chr "integer"
 $ triangle  : chr "integer"
 $ icon      : chr "integer"
 $ animate   : chr "integer"
 $ text      : chr "integer"
 $ topleft   : chr "factor"
 $ botright  : chr "factor"
cls_list[1,]
Error in cls_list[1, ] : incorrect number of dimensions
as.data.frame(cls_list)[1,]
name landmass zone area population language religion bars stripes colours
1 factor integer integer integer integer integer integer integer integer integer
red green blue gold white black orange mainhue circles crosses
1 integer integer integer integer integer integer integer factor integer integer
saltires quarters sunstars crescent triangle icon animate text topleft botright
1 integer integer integer integer integer integer integer integer factor factor
nxt()

| Resuming lesson...

| The 'l' in 'lapply' stands for 'list'. Type class(cls_list) to confirm that lapply()
| returned a list.

class(cls_list)
[1] "list"

| You got it!

|====================== | 28%
| As expected, we got a list of length 30 -- one element for each variable/column. The
| output would be considerably more compact if we could represent it as a vector instead
| of a list.

...

|======================== | 30%
| You may remember from a previous lesson that lists are most helpful for storing
| multiple classes of data. In this case, since every element of the list returned by
| lapply() is a character vector of length one (i.e. "integer" or "factor"), cls_list
| can be simplified to a character vector. To do this manually, type
| as.character(cls_list).

as.character(cls_list)
[1] "factor" "integer" "integer" "integer" "integer" "integer" "integer" "integer"
[9] "integer" "integer" "integer" "integer" "integer" "integer" "integer" "integer"
[17] "integer" "factor" "integer" "integer" "integer" "integer" "integer" "integer"
[25] "integer" "integer" "integer" "integer" "factor" "factor"

| You nailed it! Good job!
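
Notice that the column names are missing from the output above: as.character() keeps the values but drops the names. A small sketch on made-up data (`toy` and `cls_list_toy` are hypothetical names) shows the difference from unlist(), which keeps them.

```r
# Hypothetical toy data
toy <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))
cls_list_toy <- lapply(toy, class)

# Values only; the names are dropped
plain <- as.character(cls_list_toy)
print(plain)

# Values with the column names preserved
named <- unlist(cls_list_toy)
print(named)
```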

|========================== | 32%
| sapply() allows you to automate this process by calling lapply() behind the scenes, but
| then attempting to simplify (hence the 's' in 'sapply') the result for you. Use
| sapply() the same way you used lapply() to get the class of each column of the flags
| dataset and store the result in cls_vect. If you need help, type ?sapply to bring up
| the documentation.

cls_vect<-sappy(flags,class)
Error in sappy(flags, class) : could not find function "sappy"
cls_vect<-sapply(flags,class)

| That's a job well done!

|=========================== | 34%
| Use class(cls_vect) to confirm that sapply() simplified the result to a character
| vector.

class(cls_vect)
[1] "character"

| You're the best!

|============================= | 36%
| In general, if the result is a list where every element is of length one, then sapply()
| returns a vector. If the result is a list where every element is a vector of the same
| length (> 1), sapply() returns a matrix. If sapply() can't figure things out, then it
| just returns a list, no different from what lapply() would give you.
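
The three outcomes can be seen side by side in a minimal sketch. The lists `x` and `y` below are made up for illustration.

```r
# Hypothetical inputs
x <- list(a = 1:3, b = 4:6)
y <- list(a = c(1, 1, 2), b = c(3, 4, 5))

# Every result has length one: sapply() returns a named vector
v <- sapply(x, sum)

# Every result has the same length (> 1): sapply() returns a matrix,
# with one column per element of the input
m <- sapply(x, range)

# Results differ in length: sapply() gives up and returns a list
l <- sapply(y, unique)

print(v)
print(m)
print(l)
```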

...

|============================== | 38%
| Let's practice using lapply() and sapply() some more!

...

|================================ | 40%
| Columns 11 through 17 of our dataset are indicator variables, each representing a
| different color. The value of the indicator variable is 1 if the color is present in a
| country's flag and 0 otherwise.

...

|================================== | 42%
| Therefore, if we want to know the total number of countries (in our dataset) with, for
| example, the color orange on their flag, we can just add up all of the 1s and 0s in the
| 'orange' column. Try sum(flags$orange) to see this.

sum(flags$orange)
[1] 26

| You are doing so well!

|=================================== | 44%
| Now we want to repeat this operation for each of the colors recorded in the dataset.

...

|===================================== | 46%
| First, use flag_colors <- flags[, 11:17] to extract the columns containing the color
| data and store them in a new data frame called flag_colors. (Note the comma before
| 11:17. This subsetting command tells R that we want all rows, but only columns 11
| through 17.)

flag_colors<-flags[,11:17]

| All that hard work is paying off!
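
Column positions are easy to miscount, so it can help to know that the same subset can be taken by name. The sketch below is an illustration on a made-up data frame (`toy` and its columns are assumptions).

```r
# Hypothetical toy data with two indicator columns in positions 2 and 3
toy <- data.frame(id = 1:2,
                  red = c(1, 0),
                  green = c(0, 1),
                  note = c("a", "b"))

by_pos  <- toy[, 2:3]                 # all rows, columns 2 through 3
by_name <- toy[, c("red", "green")]   # all rows, the same columns by name

identical(by_pos, by_name)
```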

|====================================== | 48%
| Use the head() function to look at the first 6 lines of flag_colors.

head(flag_colors)
red green blue gold white black orange
1 1 1 0 1 1 1 0
2 1 0 0 1 0 1 0
3 1 1 0 0 1 0 0
4 1 0 1 1 1 0 1
5 1 0 1 1 0 0 0
6 1 0 0 1 0 1 0

| You got it right!

|======================================== | 50%
| To get a list containing the sum of each column of flag_colors, call the lapply()
| function with two arguments. The first argument is the object over which we are looping
| (i.e. flag_colors) and the second argument is the name of the function we wish to apply
| to each column (i.e. sum). Remember that the second argument is just the name of the
| function with no parentheses, etc.

lapply(flag_colors,aum)
Error in match.fun(FUN) : object 'aum' not found
lapply(flag_colors,sum)
$red
[1] 153

$green
[1] 91

$blue
[1] 99

$gold
[1] 91

$white
[1] 146

$black
[1] 52

$orange
[1] 26

| All that hard work is paying off!

|========================================== | 52%
| This tells us that of the 194 flags in our dataset, 153 contain the color red, 91
| contain green, 99 contain blue, and so on.

...

|=========================================== | 54%
| The result is a list, since lapply() always returns a list. Each element of this list
| is of length one, so the result can be simplified to a vector by calling sapply()
| instead of lapply(). Try it now.

sapply(flag_colors,sum)
red green blue gold white black orange
153 91 99 91 146 52 26

| All that hard work is paying off!

|============================================= | 56%
| Perhaps it's more informative to find the proportion of flags (out of 194) containing
| each color. Since each column is just a bunch of 1s and 0s, the arithmetic mean of each
| column will give us the proportion of 1s. (If it's not clear why, think of a simpler
| situation where you have three 1s and two 0s -- (1 + 1 + 1 + 0 + 0)/5 = 3/5 = 0.6).
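
The arithmetic can be confirmed in code. In the sketch below, `toy_colors` is a made-up data frame of indicator columns; dividing the column sums by the number of rows gives the same numbers as taking the column means.

```r
# Hypothetical indicator columns: 1 = color present, 0 = absent
toy_colors <- data.frame(red  = c(1, 1, 1, 0, 0),
                         blue = c(0, 1, 0, 0, 0))

counts <- sapply(toy_colors, sum)    # number of 1s per column
props  <- sapply(toy_colors, mean)   # proportion of 1s per column

print(counts / nrow(toy_colors))
print(props)
```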

...

|============================================== | 58%
| Use sapply() to apply the mean() function to each column of flag_colors. Remember that
| the second argument to sapply() should just specify the name of the function (i.e.
| mean) that you want to apply.

sapply(flag_colors,mean)
red green blue gold white black orange
0.7886598 0.4690722 0.5103093 0.4690722 0.7525773 0.2680412 0.1340206

| That's the answer I was looking for.

|================================================ | 60%
| In the examples we've looked at so far, sapply() has been able to simplify the result
| to a vector. That's because each element of the list returned by lapply() was a vector of
| length one. Recall that sapply() instead returns a matrix when each element of the list
| returned by lapply() is a vector of the same length (> 1).

...

|================================================== | 62%
| To illustrate this, let's extract columns 19 through 23 from the flags dataset and
| store the result in a new data frame called flag_shapes. flag_shapes <- flags[, 19:23]
| will do it.

flag_shapes<-flags[,19:23]

| Keep up the great work!

|=================================================== | 64%
| Each of these columns (i.e. variables) represents the number of times a particular
| shape or design appears on a country's flag. We are interested in the minimum and
| maximum number of times each shape or design appears.

...

|===================================================== | 66%
| The range() function returns the minimum and maximum of its first argument, which
| should be a numeric vector. Use lapply() to apply the range function to each column of
| flag_shapes. Don't worry about storing the result in a new variable. By now, we know
| that lapply() always returns a list.

lapply(flag_shapes,range)
$circles
[1] 0 4

$crosses
[1] 0 2

$saltires
[1] 0 1

$quarters
[1] 0 4

$sunstars
[1] 0 50

| You are quite good my friend!

|====================================================== | 68%
| Do the same operation, but using sapply() and store the result in a variable called
| shape_mat.

shape_mat<-sapply(flag_shapes,range)

| Nice work!

|======================================================== | 70%
| View the contents of shape_mat.

shape_mat
circles crosses saltires quarters sunstars
[1,] 0 0 0 0 0
[2,] 4 2 1 4 50

| Perseverance, that's the answer.

|========================================================== | 72%
| Each column of shape_mat gives the minimum (row 1) and maximum (row 2) number of times
| its respective shape appears in different flags.
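
Because the rows of such a matrix are unlabeled, you may find it convenient to name them. A minimal sketch on made-up data (`toy_shapes` and `toy_mat` are hypothetical names):

```r
# Hypothetical shape counts
toy_shapes <- data.frame(circles = c(0, 1, 4),
                         crosses = c(0, 2, 1))

# range() returns c(min, max), so sapply() builds a 2-row matrix
toy_mat <- sapply(toy_shapes, range)

# Label the rows so they can be indexed by name
rownames(toy_mat) <- c("min", "max")
print(toy_mat)

toy_mat["max", "circles"]
```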

...

|=========================================================== | 74%
| Use the class() function to confirm that shape_mat is a matrix.

class(shape_mat)
[1] "matrix"

| Excellent job!

|============================================================= | 76%
| As we've seen, sapply() always attempts to simplify the result given by lapply(). It
| has been successful in doing so for each of the examples we've looked at so far. Let's
| look at an example where sapply() can't figure out how to simplify the result and thus
| returns a list, no different from lapply().

...

|============================================================== | 78%
| When given a vector, the unique() function returns a vector with all duplicate elements
| removed. In other words, unique() returns a vector of only the 'unique' elements. To
| see how it works, try unique(c(3, 4, 5, 5, 5, 6, 6)).

play()

| Entering play mode. Experiment as you please, then type nxt() when you are ready to
| resume the lesson.

as.data.frame(shape_mat)
circles crosses saltires quarters sunstars
1 0 0 0 0 0
2 4 2 1 4 50
nxt()

| Resuming lesson...

| When given a vector, the unique() function returns a vector with all duplicate elements
| removed. In other words, unique() returns a vector of only the 'unique' elements. To
| see how it works, try unique(c(3, 4, 5, 5, 5, 6, 6)).

unique(c(3, 4, 5, 5, 5, 6, 6))
[1] 3 4 5 6

| Your dedication is inspiring!

|================================================================ | 80%
| We want to know the unique values for each variable in the flags dataset. To accomplish
| this, use lapply() to apply the unique() function to each column in the flags dataset,
| storing the result in a variable called unique_vals.

unique_vals<-lapply(flags,unique)

| Keep working like that and you'll get there!

|================================================================== | 82%
| Print the value of unique_vals to the console.

unique_vals
$name
[1] Afghanistan Albania Algeria
[4] American-Samoa Andorra Angola
[7] Anguilla Antigua-Barbuda Argentina
[10] Argentine Australia Austria
[13] Bahamas Bahrain Bangladesh
[16] Barbados Belgium Belize
[19] Benin Bermuda Bhutan
[22] Bolivia Botswana Brazil
[25] British-Virgin-Isles Brunei Bulgaria
[28] Burkina Burma Burundi
[31] Cameroon Canada Cape-Verde-Islands
[34] Cayman-Islands Central-African-Republic Chad
[37] Chile China Colombia
[40] Comorro-Islands Congo Cook-Islands
[43] Costa-Rica Cuba Cyprus
[46] Czechoslovakia Denmark Djibouti
[49] Dominica Dominican-Republic Ecuador
[52] Egypt El-Salvador Equatorial-Guinea
[55] Ethiopia Faeroes Falklands-Malvinas
[58] Fiji Finland France
[61] French-Guiana French-Polynesia Gabon
[64] Gambia Germany-DDR Germany-FRG
[67] Ghana Gibraltar Greece
[70] Greenland Grenada Guam
[73] Guatemala Guinea Guinea-Bissau
[76] Guyana Haiti Honduras
[79] Hong-Kong Hungary Iceland
[82] India Indonesia Iran
[85] Iraq Ireland Israel
[88] Italy Ivory-Coast Jamaica
[91] Japan Jordan Kampuchea
[94] Kenya Kiribati Kuwait
[97] Laos Lebanon Lesotho
[100] Liberia Libya Liechtenstein
[103] Luxembourg Malagasy Malawi
[106] Malaysia Maldive-Islands Mali
[109] Malta Marianas Mauritania
[112] Mauritius Mexico Micronesia
[115] Monaco Mongolia Montserrat
[118] Morocco Mozambique Nauru
[121] Nepal Netherlands Netherlands-Antilles
[124] New-Zealand Nicaragua Niger
[127] Nigeria Niue North-Korea
[130] North-Yemen Norway Oman
[133] Pakistan Panama Papua-New-Guinea
[136] Parguay Peru Philippines
[139] Poland Portugal Puerto-Rico
[142] Qatar Romania Rwanda
[145] San-Marino Sao-Tome Saudi-Arabia
[148] Senegal Seychelles Sierra-Leone
[151] Singapore Soloman-Islands Somalia
[154] South-Africa South-Korea South-Yemen
[157] Spain Sri-Lanka St-Helena
[160] St-Kitts-Nevis St-Lucia St-Vincent
[163] Sudan Surinam Swaziland
[166] Sweden Switzerland Syria
[169] Taiwan Tanzania Thailand
[172] Togo Tonga Trinidad-Tobago
[175] Tunisia Turkey Turks-Cocos-Islands
[178] Tuvalu UAE Uganda
[181] UK Uruguay US-Virgin-Isles
[184] USA USSR Vanuatu
[187] Vatican-City Venezuela Vietnam
[190] Western-Samoa Yugoslavia Zaire
[193] Zambia Zimbabwe
194 Levels: Afghanistan Albania Algeria American-Samoa Andorra Angola ... Zimbabwe

$landmass
[1] 5 3 4 6 1 2

$zone
[1] 1 3 2 4

$area
[1] 648 29 2388 0 1247 2777 7690 84 19 1 143 31 23 113
[15] 47 1099 600 8512 6 111 274 678 28 474 9976 4 623 1284
[29] 757 9561 1139 2 342 51 115 9 128 43 22 49 284 1001
[43] 21 1222 12 18 337 547 91 268 10 108 249 239 132 2176
[57] 109 246 36 215 112 93 103 3268 1904 1648 435 70 301 323
[71] 11 372 98 181 583 236 30 1760 3 587 118 333 1240 1031
[85] 1973 1566 447 783 140 41 1267 925 121 195 324 212 804 76
[99] 463 407 1285 300 313 92 237 26 2150 196 72 637 1221 99
[113] 288 505 66 2506 63 17 450 185 945 514 57 5 164 781
[127] 245 178 9363 22402 15 912 256 905 753 391

$population
[1] 16 3 20 0 7 28 15 8 90 10 1 6 119 9 35 4 24
[18] 2 11 1008 5 47 31 54 17 61 14 684 157 39 57 118 13 77
[35] 12 56 18 84 48 36 22 29 38 49 45 231 274 60

$language
[1] 10 6 8 1 2 4 3 5 7 9

$religion
[1] 2 6 1 0 5 3 4 7

$bars
[1] 0 2 3 1 5

$stripes
[1] 3 0 2 1 5 9 11 14 4 6 13 7

$colours
[1] 5 3 2 8 6 4 7 1

$red
[1] 1 0

$green
[1] 1 0

$blue
[1] 0 1

$gold
[1] 1 0

$white
[1] 1 0

$black
[1] 1 0

$orange
[1] 0 1

$mainhue
[1] green red blue gold white orange black brown
Levels: black blue brown gold green orange red white

$circles
[1] 0 1 4 2

$crosses
[1] 0 1 2

$saltires
[1] 0 1

$quarters
[1] 0 1 4

$sunstars
[1] 1 0 6 22 14 3 4 5 15 10 7 2 9 50

$crescent
[1] 0 1

$triangle
[1] 0 1

$icon
[1] 1 0

$animate
[1] 0 1

$text
[1] 0 1

$topleft
[1] black red green blue white orange gold
Levels: black blue gold green orange red white

$botright
[1] green red white black blue gold orange brown
Levels: black blue brown gold green orange red white

| Your dedication is inspiring!

|=================================================================== | 84%
| Since unique_vals is a list, you can use what you've learned to determine the length of
| each element of unique_vals (i.e. the number of unique values for each variable).
| Simplify the result, if possible. Hint: Apply the length() function to each element of
| unique_vals.

sapply(unique_vals,length)
name landmass zone area population language religion bars
194 6 4 136 48 10 8 5
stripes colours red green blue gold white black
12 8 2 2 2 2 2 2
orange mainhue circles crosses saltires quarters sunstars crescent
2 8 4 3 2 3 14 2
triangle icon animate text topleft botright
2 2 2 2 7 8

| You got it right!

|===================================================================== | 86%
| The fact that the elements of the unique_vals list are all vectors of different
| lengths poses a problem for sapply(), since there's no obvious way of simplifying the
| result.

...
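
A small sketch of why this matters, using made-up vectors rather than the flags data. sapply() returns a vector when every result has length one, a matrix when every result has the same length greater than one, and falls back to a list when the lengths differ. The names same_len and diff_len are only for illustration.

```r
# Hypothetical toy lists, not part of the swirl lesson
same_len <- list(a = c(1, 2), b = c(3, 4))
diff_len <- list(a = c(1, 2), b = c(3, 4, 5))

# Every result has length 1, so sapply() simplifies to a named vector
lens <- sapply(diff_len, length)
lens

# Every result has length 2, so sapply() simplifies to a 2 x 2 matrix
as_matrix <- sapply(same_len, rev)
as_matrix

# Results have different lengths, so sapply() returns a list, like lapply()
as_list <- sapply(diff_len, rev)
as_list
```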

|====================================================================== | 88%
| Use sapply() to apply the unique() function to each column of the flags dataset to see
| that you get the same unsimplified list that you got from lapply().

sapply(flags,unique)
$name
[1] Afghanistan Albania Algeria
[4] American-Samoa Andorra Angola
[7] Anguilla Antigua-Barbuda Argentina
[10] Argentine Australia Austria
[13] Bahamas Bahrain Bangladesh
[16] Barbados Belgium Belize
[19] Benin Bermuda Bhutan
[22] Bolivia Botswana Brazil
[25] British-Virgin-Isles Brunei Bulgaria
[28] Burkina Burma Burundi
[31] Cameroon Canada Cape-Verde-Islands
[34] Cayman-Islands Central-African-Republic Chad
[37] Chile China Colombia
[40] Comorro-Islands Congo Cook-Islands
[43] Costa-Rica Cuba Cyprus
[46] Czechoslovakia Denmark Djibouti
[49] Dominica Dominican-Republic Ecuador
[52] Egypt El-Salvador Equatorial-Guinea
[55] Ethiopia Faeroes Falklands-Malvinas
[58] Fiji Finland France
[61] French-Guiana French-Polynesia Gabon
[64] Gambia Germany-DDR Germany-FRG
[67] Ghana Gibraltar Greece
[70] Greenland Grenada Guam
[73] Guatemala Guinea Guinea-Bissau
[76] Guyana Haiti Honduras
[79] Hong-Kong Hungary Iceland
[82] India Indonesia Iran
[85] Iraq Ireland Israel
[88] Italy Ivory-Coast Jamaica
[91] Japan Jordan Kampuchea
[94] Kenya Kiribati Kuwait
[97] Laos Lebanon Lesotho
[100] Liberia Libya Liechtenstein
[103] Luxembourg Malagasy Malawi
[106] Malaysia Maldive-Islands Mali
[109] Malta Marianas Mauritania
[112] Mauritius Mexico Micronesia
[115] Monaco Mongolia Montserrat
[118] Morocco Mozambique Nauru
[121] Nepal Netherlands Netherlands-Antilles
[124] New-Zealand Nicaragua Niger
[127] Nigeria Niue North-Korea
[130] North-Yemen Norway Oman
[133] Pakistan Panama Papua-New-Guinea
[136] Parguay Peru Philippines
[139] Poland Portugal Puerto-Rico
[142] Qatar Romania Rwanda
[145] San-Marino Sao-Tome Saudi-Arabia
[148] Senegal Seychelles Sierra-Leone
[151] Singapore Soloman-Islands Somalia
[154] South-Africa South-Korea South-Yemen
[157] Spain Sri-Lanka St-Helena
[160] St-Kitts-Nevis St-Lucia St-Vincent
[163] Sudan Surinam Swaziland
[166] Sweden Switzerland Syria
[169] Taiwan Tanzania Thailand
[172] Togo Tonga Trinidad-Tobago
[175] Tunisia Turkey Turks-Cocos-Islands
[178] Tuvalu UAE Uganda
[181] UK Uruguay US-Virgin-Isles
[184] USA USSR Vanuatu
[187] Vatican-City Venezuela Vietnam
[190] Western-Samoa Yugoslavia Zaire
[193] Zambia Zimbabwe
194 Levels: Afghanistan Albania Algeria American-Samoa Andorra Angola ... Zimbabwe

$landmass
[1] 5 3 4 6 1 2

$zone
[1] 1 3 2 4

$area
[1] 648 29 2388 0 1247 2777 7690 84 19 1 143 31 23 113
[15] 47 1099 600 8512 6 111 274 678 28 474 9976 4 623 1284
[29] 757 9561 1139 2 342 51 115 9 128 43 22 49 284 1001
[43] 21 1222 12 18 337 547 91 268 10 108 249 239 132 2176
[57] 109 246 36 215 112 93 103 3268 1904 1648 435 70 301 323
[71] 11 372 98 181 583 236 30 1760 3 587 118 333 1240 1031
[85] 1973 1566 447 783 140 41 1267 925 121 195 324 212 804 76
[99] 463 407 1285 300 313 92 237 26 2150 196 72 637 1221 99
[113] 288 505 66 2506 63 17 450 185 945 514 57 5 164 781
[127] 245 178 9363 22402 15 912 256 905 753 391

$population
[1] 16 3 20 0 7 28 15 8 90 10 1 6 119 9 35 4 24
[18] 2 11 1008 5 47 31 54 17 61 14 684 157 39 57 118 13 77
[35] 12 56 18 84 48 36 22 29 38 49 45 231 274 60

$language
[1] 10 6 8 1 2 4 3 5 7 9

$religion
[1] 2 6 1 0 5 3 4 7

$bars
[1] 0 2 3 1 5

$stripes
[1] 3 0 2 1 5 9 11 14 4 6 13 7

$colours
[1] 5 3 2 8 6 4 7 1

$red
[1] 1 0

$green
[1] 1 0

$blue
[1] 0 1

$gold
[1] 1 0

$white
[1] 1 0

$black
[1] 1 0

$orange
[1] 0 1

$mainhue
[1] green red blue gold white orange black brown
Levels: black blue brown gold green orange red white

$circles
[1] 0 1 4 2

$crosses
[1] 0 1 2

$saltires
[1] 0 1

$quarters
[1] 0 1 4

$sunstars
[1] 1 0 6 22 14 3 4 5 15 10 7 2 9 50

$crescent
[1] 0 1

$triangle
[1] 0 1

$icon
[1] 1 0

$animate
[1] 0 1

$text
[1] 0 1

$topleft
[1] black red green blue white orange gold
Levels: black blue gold green orange red white

$botright
[1] green red white black blue gold orange brown
Levels: black blue brown gold green orange red white

| You're the best!

|======================================================================== | 90%
| Occasionally, you may need to apply a function that is not yet defined, thus requiring
| you to write your own. Writing functions in R is beyond the scope of this lesson, but
| let's look at a quick example of how you might do so in the context of loop functions.

...

|========================================================================== | 92%
| Pretend you are interested in only the second item from each element of the unique_vals
| list that you just created. Since each element of the unique_vals list is a vector and
| we're not aware of any built-in function in R that returns the second element of a
| vector, we will construct our own function.

...

|=========================================================================== | 94%
| lapply(unique_vals, function(elem) elem[2]) will return a list containing the second
| item from each element of the unique_vals list. Note that our function takes one
| argument, elem, which is just a 'dummy variable' that takes on the value of each
| element of unique_vals, in turn.

lapply(unique_vals, function(elem) elem[2])
$name
[1] Albania
194 Levels: Afghanistan Albania Algeria American-Samoa Andorra Angola ... Zimbabwe

$landmass
[1] 3

$zone
[1] 3

$area
[1] 29

$population
[1] 3

$language
[1] 6

$religion
[1] 6

$bars
[1] 2

$stripes
[1] 0

$colours
[1] 3

$red
[1] 0

$green
[1] 0

$blue
[1] 1

$gold
[1] 0

$white
[1] 0

$black
[1] 0

$orange
[1] 1

$mainhue
[1] red
Levels: black blue brown gold green orange red white

$circles
[1] 1

$crosses
[1] 1

$saltires
[1] 1

$quarters
[1] 1

$sunstars
[1] 0

$crescent
[1] 1

$triangle
[1] 1

$icon
[1] 0

$animate
[1] 1

$text
[1] 1

$topleft
[1] red
Levels: black blue gold green orange red white

$botright
[1] red
Levels: black blue brown gold green orange red white

| Great job!

|============================================================================= | 96%
| The only difference between previous examples and this one is that we are defining and
| using our own function right in the call to lapply(). Our function has no name and
| disappears as soon as lapply() is done using it. So-called 'anonymous functions' can be
| very useful when one of R's built-in functions isn't an option.

...
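
To make the point concrete, here is a minimal sketch comparing an anonymous function with a named one. The list vals and the helper name second are made up for this example and stand in for unique_vals.

```r
# Hypothetical toy list standing in for unique_vals
vals <- list(x = c(10, 20, 30), y = c("a", "b"))

# Anonymous function defined inside the call to lapply()
anon <- lapply(vals, function(elem) elem[2])

# The same operation with a named helper
second <- function(elem) elem[2]
named <- lapply(vals, second)

# Both approaches give the same list
identical(anon, named)
```

The named version is worth the extra line only when the helper is reused elsewhere.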

|============================================================================== | 98%
| In this lesson, you learned how to use the powerful lapply() and sapply() functions to
| apply an operation over the elements of a list. In the next lesson, we'll take a look
| at some close relatives of lapply() and sapply().

...

|================================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?

1: Yes
2: No

Selection: 1
What is your email address? xxxxxx@xxxxxxxxxxxx
What is your assignment token? xXxXxxXXxXxxXXXx
Grade submission succeeded!

| That's the answer I was looking for.

| You've reached the end of this lesson! Returning to the main menu...

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 0

| Leaving swirl now. Type swirl() to resume.

ls()
[1] "cls_list" "cls_vect" "flag_colors" "flag_shapes" "flags" "shape_mat"
[7] "unique_vals" "viewinfo"
rm(list=ls())

Last updated 2020-04-18 20:39:08.401287 IST

Functions

swirl()

| Welcome to swirl! Please sign in. If you've been here before, use the same name as you
| did then. If you are new, call yourself something unique.

What shall I call you? Krishnakanth Allika

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

1: Basic Building Blocks 2: Workspace and Files 3: Sequences of Numbers
4: Vectors 5: Missing Values 6: Subsetting Vectors
7: Matrices and Data Frames 8: Logic 9: Functions
10: lapply and sapply 11: vapply and tapply 12: Looking at Data
13: Simulation 14: Dates and Times 15: Base Graphics

Selection: 9

| | 0%

| Functions are one of the fundamental building blocks of the R language. They are small
| pieces of reusable code that can be treated like any other R object.

...

|== | 2%
| If you've worked through any other part of this course, you've probably used some
| functions already. Functions are usually characterized by the name of the function
| followed by parentheses.

...

|=== | 4%
| Let's try using a few basic functions just for fun. The Sys.Date() function returns a
| Date object representing today's date, which prints like a string. Type Sys.Date()
| below and see what happens.

Sys.Date()
[1] "2020-04-15"

| Keep up the great work!
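
Strictly speaking, the value printed above is a Date object rather than a character string, even though it prints in quotes. A quick check, assuming nothing beyond base R:

```r
today <- Sys.Date()

# The class is "Date", not "character"
class(today)
is.character(today)

# format() converts the Date into an actual character string
today_string <- format(today, "%Y-%m-%d")
is.character(today_string)

# Because it is a Date, arithmetic in days works
tomorrow <- today + 1
tomorrow > today
```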

|===== | 6%
| Most functions in R return a value. Functions like Sys.Date() return a value based on
| your computer's environment, while other functions manipulate input data in order to
| compute a return value.

...

|======= | 8%
| The mean() function takes a vector of numbers as input, and returns the average of all
| of the numbers in the input vector. Inputs to functions are often called arguments.
| Providing arguments to a function is also sometimes called passing arguments to that
| function. Arguments you want to pass to a function go inside the function's
| parentheses. Try passing the argument c(2, 4, 5) to the mean() function.

mean(c(2,4,5))
[1] 3.666667

| All that practice is paying off!

|======== | 10%
| Functions usually take arguments which are variables that the function operates on. For
| example, the mean() function takes a vector as an argument, like in the case of
| mean(c(2,6,8)). The mean() function then adds up all of the numbers in the vector and
| divides that sum by the length of the vector.

...

|========== | 12%
| In the following question you will be asked to modify a script that will appear as soon
| as you move on from this question. When you have finished modifying the script, save
| your changes to the script and type submit() and the script will be evaluated. There
| will be some comments in the script that opens up, so be sure to read them!

...

|=========== | 14%
| The last R expression to be evaluated in a function will become the return value of
| that function. We want this function to take one argument, x, and return x without
| modifying it. Delete the pound sign so that x is returned without any modification.
| Make sure to save your script before you type submit().

# You're about to write your first function! Just like you would assign a value 
# to a variable with the assignment operator, you assign functions in the following
# way:
#
# function_name <- function(arg1, arg2){
#   # Manipulate arguments in some way
#   # Return a value
# }
#
# The "variable name" you assign will become the name of your function. arg1 and
# arg2 represent the arguments of your function. You can manipulate the arguments
# you specify within the function. After sourcing the function, you can use the 
# function by typing:
# 
# function_name(value1, value2)
#
# Below we will create a function called boring_function. This function takes
# the argument `x` as input, and returns the value of x without modifying it.
# Delete the pound sign in front of the x to make the function work! Be sure to 
# save this script and type submit() in the console after you make your changes.

boring_function <- function(x) {
  x
}

submit()

| Sourcing your script...

| You got it right!

|============= | 16%
| Now that you've created your first function let's test it! Type: boring_function('My
| first function!'). If your function works, it should just return the string: 'My first
| function!'

boring_function('My first function!')
[1] "My first function!"

| You're the best!

|=============== | 18%
| Congratulations on writing your first function. By writing functions, you can gain
| serious insight into how R works. As John Chambers, creator of the S language that R
| is based on, once said:
|
| To understand computations in R, two slogans are helpful: 1. Everything that exists is
| an object. 2. Everything that happens is a function call.

...
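
The second slogan can be checked directly in the console. The sketch below calls two familiar operators in their prefix (function call) form. The variable name add is made up for this example.

```r
# Infix operators are ordinary functions that can be called in prefix form
`+`(2, 3)               # same as 2 + 3

# Subsetting is a function call too
`[`(c(10, 20, 30), 2)   # same as c(10, 20, 30)[2]

# Functions are objects: they have a class and can be assigned to variables
add <- `+`
class(add)
add(2, 3)
```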

|================ | 20%
| If you want to see the source code for any function, just type the function name
| without any arguments or parentheses. Let's try this out with the function you just
| created. Type: boring_function to view its source code.

boring_function
function(x) {
x
}

<bytecode: 0x00000000190a1b98>

| Keep working like that and you'll get there!

|================== | 22%
| Time to make a more useful function! We're going to replicate the functionality of the
| mean() function by creating a function called: my_mean(). Remember that to calculate
| the average of all of the numbers in a vector you find the sum of all the numbers in
| the vector, and then divide that sum by the number of numbers in the vector.

...

|==================== | 24%
| Make sure to save your script before you type submit().

# You're free to implement the function my_mean however you want, as long as it
# returns the average of all of the numbers in `my_vector`.
#
# Hint #1: sum() returns the sum of a vector.
#   Ex: sum(c(1, 2, 3)) evaluates to 6
#
# Hint #2: length() returns the size of a vector.
#   Ex: length(c(1, 2, 3)) evaluates to 3
#
# Hint #3: The mean of all the numbers in a vector is equal to the sum of all of
#          the numbers in the vector divided by the size of the vector.
#
# Note for those of you feeling super clever: Please do not use the mean()
# function while writing this function. We're trying to teach you something 
# here!
#
# Be sure to save this script and type submit() in the console after you make 
# your changes.

my_mean <- function(my_vector) {
  # Write your code here!
  # Remember: the last expression evaluated will be returned! 
  sum(my_vector)/length(my_vector)
}

submit()

| Sourcing your script...

| You got it right!

|===================== | 27%
| Now test out your my_mean() function by finding the mean of the vector c(4, 5, 10).

my_mean(c(4,5,10))
[1] 6.333333

| You are doing so well!

|======================= | 29%
| Next, let's try writing a function with default arguments. You can set default values
| for a function's arguments, and this can be useful if you think someone who uses your
| function will set a certain argument to the same value most of the time.

...

|======================== | 31%
| Make sure to save your script before you type submit().

# Let me show you an example of a function I'm going to make up called
# increment(). Most of the time I want to use this function to increase the
# value of a number by one. This function will take two arguments: "number" and
# "by" where "number" is the digit I want to increment and "by" is the amount I
# want to increment "number" by. I've written the function below. 
#
# increment <- function(number, by = 1){
#     number + by
# }
#
# If you take a look in between the parentheses you can see that I've set
# "by" equal to 1. This means that the "by" argument will have the default
# value of 1.
#
# I can now use the increment function without providing a value for "by": 
# increment(5) will evaluate to 6. 
#
# However if I want to provide a value for the "by" argument I still can! The
# expression: increment(5, 2) will evaluate to 7. 
# 
# You're going to write a function called "remainder." remainder() will take
# two arguments: "num" and "divisor" where "num" is divided by "divisor" and
# the remainder is returned. Imagine that you usually want to know the remainder
# when you divide by 2, so set the default value of "divisor" to 2. Please be
# sure that "num" is the first argument and "divisor" is the second argument.
#
# Hint #1: You can use the modulus operator %% to find the remainder.
#   Ex: 7 %% 4 evaluates to 3. 
#
# Remember to set appropriate default values! Be sure to save this 
# script and type submit() in the console after you write the function.

remainder <- function(num, divisor=2) {
  # Write your code here!
  # Remember: the last expression evaluated will be returned! 
  num%%divisor
}

submit()

| Sourcing your script...

| Keep working like that and you'll get there!

|========================== | 33%
| Let's do some testing of the remainder function. Run remainder(5) and see what happens.

remainder(5)
[1] 1

| You nailed it! Good job!

|============================ | 35%
| Let's take a moment to examine what just happened. You provided one argument to the
| function, and R matched that argument to 'num' since 'num' is the first argument. The
| default value for 'divisor' is 2, so the function used the default value you provided.

...

|============================= | 37%
| Now let's test the remainder function by providing two arguments. Type: remainder(11,
| 5) and let's see what happens.

remainder(11,5)
[1] 1

| You got it!

|=============================== | 39%
| Once again, the arguments have been matched appropriately.

...

|================================= | 41%
| You can also explicitly specify arguments in a function. When you explicitly designate
| argument values by name, the ordering of the arguments becomes unimportant. You can try
| this out by typing: remainder(divisor = 11, num = 5).

remainder(divisor = 11, num = 5)
[1] 5

| You're the best!

|================================== | 43%
| As you can see, there is a significant difference between remainder(11, 5) and
| remainder(divisor = 11, num = 5)!

...

|==================================== | 45%
| R can also partially match arguments. Try typing remainder(4, div = 2) to see this
| feature in action.

remainder(4, div = 2)
[1] 0

| Excellent work!

|====================================== | 47%
| A word of warning: in general you want to make your code as easy to understand as
| possible. Switching around the orders of arguments by specifying their names or only
| using partial argument names can be confusing, so use these features with caution!

...
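
The matching rules from the last few steps, side by side. R matches exact names first, then partial names, then position. The function is the same remainder() written earlier in this lesson.

```r
# Same definition as in the lesson
remainder <- function(num, divisor = 2) {
  num %% divisor
}

remainder(11, 5)                   # positional: num = 11, divisor = 5
remainder(divisor = 5, num = 11)   # named: order no longer matters
remainder(11, div = 5)             # partial: `div` matches `divisor`
remainder(11)                      # default: divisor = 2
```

All four calls return 1. Writing the full argument name is the easiest form to read later.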

|======================================= | 49%
| With all of this talk about arguments, you may be wondering if there is a way you can
| see a function's arguments (besides looking at the documentation). Thankfully, you can
| use the args() function! Type: args(remainder) to examine the arguments for the
| remainder function.

args(remainder)
function (num, divisor = 2)
NULL

| You are quite good my friend!
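
If you need the arguments as data rather than printed text, base R also has formals(). It is not part of this lesson, so treat this as a side note.

```r
# Same definition as in the lesson
remainder <- function(num, divisor = 2) {
  num %% divisor
}

# formals() returns the arguments of a function as a named list
arg_names <- names(formals(remainder))
arg_names

# Default values can be read from that list
default_divisor <- formals(remainder)$divisor
default_divisor
```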

|========================================= | 51%
| You may not realize it but I just tricked you into doing something pretty interesting!
| args() is a function, remainder() is a function, yet remainder was an argument for
| args(). Yes it's true: you can pass functions as arguments! This is a very powerful
| concept. Let's write a script to see how it works.

...

|========================================== | 53%
| Make sure to save your script before you type submit().

# You can pass functions as arguments to other functions just like you can pass
# data to functions. Let's say you define the following functions:
#
# add_two_numbers <- function(num1, num2){
#    num1 + num2
# }
#
# multiply_two_numbers <- function(num1, num2){
#   num1 * num2
# }
#
# some_function <- function(func){
#    func(2, 4)
# }
#
# As you can see we use the argument name "func" like a function inside of 
# "some_function()." By passing functions as arguments 
# some_function(add_two_numbers) will evaluate to 6, while
# some_function(multiply_two_numbers) will evaluate to 8.
# 
# Finish the function definition below so that if a function is passed into the
# "func" argument and some data (like a vector) is passed into the dat argument
# the evaluate() function will return the result of dat being passed as an
# argument to func.
#
# Hints: This exercise is a little tricky so I'll provide a few examples of how
# evaluate() should act:
#    1. evaluate(sum, c(2, 4, 6)) should evaluate to 12
#    2. evaluate(median, c(7, 40, 9)) should evaluate to 9
#    3. evaluate(floor, 11.1) should evaluate to 11

evaluate <- function(func, dat){
  # Write your code here!
  # Remember: the last expression evaluated will be returned! 
  func(dat)
}

submit()

| Sourcing your script...

| You are really on a roll!

|============================================ | 55%
| Let's take your new evaluate() function for a spin! Use evaluate to find the standard
| deviation of the vector c(1.4, 3.6, 7.9, 8.8).

evaluate(sd,c(1.4, 3.6, 7.9, 8.8))
[1] 3.514138

| All that practice is paying off!

|============================================== | 57%
| The idea of passing functions as arguments to other functions is an important and
| fundamental concept in programming.

...
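
One practical consequence: functions can be kept in a list and handed to another function one at a time. A minimal sketch, reusing evaluate() from above. The list name summaries and its element names are made up.

```r
# Same evaluate() as in the lesson
evaluate <- function(func, dat) {
  func(dat)
}

# Functions stored in a named list
summaries <- list(total = sum, average = mean, largest = max)

# Pass each stored function to evaluate() with the same data
results <- sapply(summaries, function(f) evaluate(f, c(2, 4, 6)))
results
```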

|=============================================== | 59%
| You may be surprised to learn that you can pass a function as an argument without first
| defining the passed function. Functions that are not named are appropriately known as
| anonymous functions.

...

|================================================= | 61%
| Let's use the evaluate function to explore how anonymous functions work. For the first
| argument of the evaluate function we're going to write a tiny function that fits on one
| line. In the second argument we'll pass some data to the tiny anonymous function in the
| first argument.

...

|=================================================== | 63%
| Type the following command and then we'll discuss how it works:
| evaluate(function(x){x+1}, 6)

evaluate(function(x){x+1}, 6)
[1] 7

| Keep up the great work!

|==================================================== | 65%
| The first argument is a tiny anonymous function that takes one argument x and returns
| x+1. We passed the number 6 into this function so the entire expression evaluates to
| 7.

...

|====================================================== | 67%
| Try using evaluate() along with an anonymous function to return the first element of
| the vector c(8, 4, 0). Your anonymous function should only take one argument which
| should be a variable x.

evaluate(function(x){x[1]},c(8, 4, 0))
[1] 8

| You are really on a roll!

|======================================================== | 69%
| Now try using evaluate() along with an anonymous function to return the last element of
| the vector c(8, 4, 0). Your anonymous function should only take one argument which
| should be a variable x.

evaluate(function(x){tail(x,n=1)},c(8, 4, 0))
[1] 0

| All that hard work is paying off!
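
There is more than one way to get the last element. The answer above uses tail(). Indexing with length(x) is an alternative that gives the same result for this vector.

```r
# Same evaluate() as in the lesson
evaluate <- function(func, dat) {
  func(dat)
}

v <- c(8, 4, 0)

# tail() as used above
with_tail <- evaluate(function(x) { tail(x, n = 1) }, v)

# Indexing with the length of the vector
with_index <- evaluate(function(x) { x[length(x)] }, v)

identical(with_tail, with_index)
```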

|========================================================= | 71%
| For the rest of the course we're going to use the paste() function frequently. Type
| ?paste so we can take a look at the documentation for the paste function.

?paste

paste()

| Nice work!

|=========================================================== | 73%
| As you can see the first argument of paste() is ... which is referred to as an
| ellipsis or simply dot-dot-dot. The ellipsis allows an indefinite number of arguments
| to be passed into a function. In the case of paste() any number of strings can be
| passed as arguments and paste() will return all of the strings combined into one
| string.

...
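
A short sketch of how paste() treats its arguments. Everything passed through the ellipsis is joined using sep, while collapse joins the elements of a vector result into a single string.

```r
# Any number of strings, joined by sep (a single space by default)
paste("a", "b", "c")

# sep changes the separator placed between the arguments
paste("a", "b", "c", sep = "-")

# collapse joins the elements of a vector into one string
paste(c("a", "b", "c"), collapse = "+")

# Vectors are pasted element by element, recycling the shorter one
paste("item", 1:3)
```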

|============================================================ | 76%
| Just to see how paste() works, type paste("Programming", "is", "fun!")

paste("Programming", "is", "fun!")
[1] "Programming is fun!"

| You are quite good my friend!

|============================================================== | 78%
| Time to write our own modified version of paste().

...

|================================================================ | 80%
| Make sure to save your script before you type submit().

# The ellipses can be used to pass on arguments to other functions that are
# used within the function you're writing. Usually a function that has the
# ellipses as an argument has the ellipses as the last argument. The usage of
# such a function would look like:
#
# ellipses_func(arg1, arg2 = TRUE, ...)
#
# In the above example arg1 has no default value, so a value must be provided
# for arg1. arg2 has a default value, and other arguments can come after arg2
# depending on how they're defined in the ellipses_func() documentation.
# Interestingly the usage for the paste function is as follows:
#
# paste (..., sep = " ", collapse = NULL)
#
# Notice that the ellipsis is the first argument, and all other arguments after
# the ellipsis have default values. Arguments that come after an ellipsis can only
# be matched by their full name, so in practice they are almost always given
# default values. Take a look at the simon_says function below:
#
# simon_says <- function(...){
#   paste("Simon says:", ...)
# }
#
# The simon_says function works just like the paste function, except that
# every string is prefixed with "Simon says:"
#
# Telegrams used to be peppered with the words START and STOP in order to
# demarcate the beginning and end of sentences. Write a function below called 
# telegram that formats sentences for telegrams.
# For example the expression `telegram("Good", "morning")` should evaluate to:
# "START Good morning STOP"

telegram <- function(...){
  paste("START",...,"STOP")
}

submit()

| Sourcing your script...

| All that hard work is paying off!

|================================================================= | 82%
| Now let's test out your telegram function. Use your new telegram function passing in
| whatever arguments you wish!

telegram("Happy","birthday")
[1] "START Happy birthday STOP"

| You are doing so well!
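
One detail about arguments that come after the ellipsis: they can only be matched by their full name, never by position, which is why they usually carry default values. A sketch with a made-up function join_with() and a made-up argument glue:

```r
# Hypothetical function: `glue` comes after the ellipsis and has no default
join_with <- function(..., glue) {
  paste(..., sep = glue)
}

# Naming the argument works
named_call <- join_with("a", "b", glue = "-")
named_call

# Without the name, "-" is swallowed by the ellipsis and glue is missing,
# so the call fails. tryCatch() turns the error into a marker string.
unnamed_call <- tryCatch(
  join_with("a", "b", "-"),
  error = function(e) "error"
)
unnamed_call
```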

|=================================================================== | 84%
| Make sure to save your script before you type submit().

# Let's explore how to "unpack" arguments from an ellipses when you use the
# ellipses as an argument in a function. Below I have an example function that
# is supposed to add two explicitly named arguments called alpha and beta.
# 
# add_alpha_and_beta <- function(...){
#   # First we must capture the ellipsis inside of a list
#   # and then assign the list to a variable. Let's name this
#   # variable `args`.
#
#   args <- list(...)
#
#   # We're now going to assume that there are two named arguments within args
#   # with the names `alpha` and `beta.` We can extract named arguments from
#   # the args list by using the name of the argument and double brackets. The
#   # `args` variable is just a regular list after all!
#   
#   alpha <- args[["alpha"]]
#   beta  <- args[["beta"]]
#
#   # Then we return the sum of alpha and beta.
#
#   alpha + beta 
# }
#
# Have you ever played Mad Libs before? The function below will construct a
# sentence from parts of speech that you provide as arguments. We'll write most
# of the function, but you'll need to unpack the appropriate arguments from the
# ellipses.

mad_libs <- function(...){
  # Do your argument unpacking here!
  args<-list(...)
  place<-args[["place"]]
  adjective<-args[["adjective"]]
  noun<-args[["noun"]]
  # Don't modify any code below this comment.
  # Notice the variables you'll need to create in order for the code below to
  # be functional!
  paste("News from", place, "today where", adjective, "students took to the streets in protest of the new", noun, "being installed on campus.")
}

submit()

| Sourcing your script...

| You are amazing!

|===================================================================== | 86%
| Time to use your mad_libs function. Make sure to name the place, adjective, and noun
| arguments in order for your function to work.

mad_libs(place="India",adjective="many",noun="'Get back to your classrooms and learn something' billboard")
[1] "News from India today where many students took to the streets in protest of the new 'Get back to your classrooms and learn something' billboard being installed on campus."

| You are doing so well!
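
The naming requirement follows from how list(...) works: the names used in the call become the names of the list elements. A small sketch with a made-up helper show_args():

```r
# Hypothetical helper that only reports the names it received
show_args <- function(...) {
  args <- list(...)
  names(args)
}
received <- show_args(place = "India", noun = "billboard")
received

# Asking a list for a name that was not supplied gives NULL
args <- list(place = "India")
missing_noun <- args[["noun"]]
is.null(missing_noun)
```

So if an argument is misspelled or left unnamed, the lookup inside mad_libs() finds nothing under the expected name.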

|====================================================================== | 88%
| We're coming to the end of this lesson, but there's still one more idea you should be
| made aware of.

...

|======================================================================== | 90%
| You're familiar with adding, subtracting, multiplying, and dividing numbers in R. To do
| this you use the +, -, *, and / symbols. These symbols are called binary operators
| because they take two inputs, an input from the left and an input from the right.

...

|========================================================================= | 92%
| In R you can define your own binary operators. In the next script I'll show you how.

...

|=========================================================================== | 94%
| Make sure to save your script before you type submit().

# The syntax for creating new binary operators in R is unlike anything else in
# R, but it allows you to define a new syntax for your function. I would only
# recommend making your own binary operator if you plan on using it often!
#
# User-defined binary operators have the following syntax:
#      %[whatever]% 
# where [whatever] represents any valid variable name.
# 
# Let's say I wanted to define a binary operator that multiplied two numbers and
# then added one to the product. An implementation of that operator is below:
#
# "%mult_add_one%" <- function(left, right){ # Notice the quotation marks!
#   left * right + 1
# }
#
# I could then use this binary operator like `4 %mult_add_one% 5` which would
# evaluate to 21.
#
# Write your own binary operator below from absolute scratch! Your binary
# operator must be called %p% so that the expression:
#
#       "Good" %p% "job!"
#
# will evaluate to: "Good job!"

"%p%" <- function(x,y){ # Remember to add arguments!
  paste(x,y)
}

submit()

| Sourcing your script...

| Excellent job!

|============================================================================= | 96%
| You made your own binary operator! Let's test it out. Paste together the strings: 'I',
| 'love', 'R!' using your new binary operator.

"I" %p% "love" %p% "R!"
[1] "I love R!"

| You are doing so well!
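
Chaining works because user-defined %...% operators are evaluated from left to right. The sketch below writes the same expression as nested function calls to show the order.

```r
# Same operator as in the lesson
"%p%" <- function(x, y) {
  paste(x, y)
}

# Chained use is evaluated from left to right
chained <- "I" %p% "love" %p% "R!"

# The same expression written as nested prefix calls
nested <- `%p%`(`%p%`("I", "love"), "R!")

identical(chained, nested)
```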

|============================================================================== | 98%
| We've come to the end of our lesson! Go out there and write some great functions!

...

|================================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?

1: Yes
2: No

Selection: 1
What is your email address? xxxxxx@xxxxxxxxxxxx
What is your assignment token? xXxXxxXXxXxxXXXx
Grade submission succeeded!

| Keep working like that and you'll get there!

| You've reached the end of this lesson! Returning to the main menu...

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 0

| Leaving swirl now. Type swirl() to resume.

ls()
[1] "%p%" "boring_function" "evaluate" "mad_libs"
[5] "my_mean" "remainder" "telegram"
rm(list=ls())

Last updated 2020-04-15 14:35:14.297421 IST

Logic

R version 3.6.3 (2020-02-29) -- "Holding the Windsock"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

library("swirl")

| Hi! Type swirl() when you are ready to begin.

swirl()

| Welcome to swirl! Please sign in. If you've been here before, use the same name as you
| did then. If you are new, call yourself something unique.

What shall I call you? Krishnakanth Allika

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

 1: Basic Building Blocks      2: Workspace and Files        3: Sequences of Numbers
 4: Vectors                    5: Missing Values             6: Subsetting Vectors
 7: Matrices and Data Frames   8: Logic                      9: Functions
10: lapply and sapply         11: vapply and tapply         12: Looking at Data
13: Simulation                14: Dates and Times           15: Base Graphics

Selection: 8

| | 0%

| This lesson is meant to be a short introduction to logical operations in R.

...

|== | 2%
| There are two logical values in R, also called boolean values. They are TRUE and FALSE.
| In R you can construct logical expressions which will evaluate to either TRUE or FALSE.

...

|=== | 4%
| Many of the questions in this lesson will involve evaluating logical expressions. It
| may be useful to open up a second R terminal where you can experiment with some of
| these expressions.

...

|===== | 6%
| Creating logical expressions requires logical operators. You're probably familiar with
| arithmetic operators like +, -, *, and /. The first logical operator we are
| going to discuss is the equality operator, represented by two equals signs ==. Use
| the equality operator below to find out if TRUE is equal to TRUE.

TRUE==TRUE
[1] TRUE

| That's correct!

|====== | 8%
| Just like arithmetic, logical expressions can be grouped by parentheses so that the
| entire expression (TRUE == TRUE) == TRUE evaluates to TRUE.

...

|======== | 10%
| To test out this property, try evaluating (FALSE == TRUE) == FALSE .

(FALSE==TRUE)==TRUE
[1] FALSE

| Not quite right, but keep trying. Or, type info() for more options.

| Try typing: (FALSE == TRUE) == FALSE

(FALSE==TRUE)==FALSE
[1] TRUE

| All that hard work is paying off!

|========= | 12%
| The equality operator can also be used to compare numbers. Use == to see if 6 is
| equal to 7.

6==7
[1] FALSE

| Keep up the great work!

|=========== | 13%
| The previous expression evaluates to FALSE because 6 is less than 7. Thankfully, there
| are inequality operators that allow us to test if a value is less than or greater than
| another value.

...

|============ | 15%
| The less than operator < tests whether the number on the left side of the operator
| (called the left operand) is less than the number on the right side of the operator
| (called the right operand). Write an expression to test whether 6 is less than 7.

6<7
[1] TRUE

| That's correct!

|============== | 17%
| There is also a less-than-or-equal-to operator <= which tests whether the left
| operand is less than or equal to the right operand. Write an expression to test whether
| 10 is less than or equal to 10.

10<=10
[1] TRUE

| You got it!

|=============== | 19%
| Keep in mind that there are the corresponding greater than > and
| greater-than-or-equal-to >= operators.

...

|================= | 21%
| Which of the following evaluates to FALSE?

1: 7 == 7
2: 0 > -36
3: 6 < 8
4: 9 >= 10

Selection: 4

| That's correct!

|================== | 23%
| Which of the following evaluates to TRUE?

1: -6 > -7
2: 7 == 9
3: 9 >= 10
4: 57 < 8

Selection: 1

| That's correct!

|==================== | 25%
| The next operator we will discuss is the 'not equals' operator represented by !=. Not
| equals tests whether two values are unequal, so TRUE != FALSE evaluates to TRUE. Like
| the equality operator, != can also be used with numbers. Try writing an expression to
| see if 5 is not equal to 7.

5!=7
[1] TRUE

| You got it!

|====================== | 27%
| In order to negate boolean expressions you can use the NOT operator. An exclamation
| point ! will cause !TRUE (say: not true) to evaluate to FALSE and !FALSE (say: not
| false) to evaluate to TRUE. Try using the NOT operator and the equals operator to find
| the opposite of whether 5 is equal to 7.

!(5==7)
[1] TRUE

| You are really on a roll!

|======================= | 29%
| Let's take a moment to review. The equals operator == tests whether two boolean
| values or numbers are equal, the not equals operator != tests whether two boolean
| values or numbers are unequal, and the NOT operator ! negates logical expressions so
| that TRUE expressions become FALSE and FALSE expressions become TRUE.

...

|========================= | 31%
| Which of the following evaluates to FALSE?

1: !(0 >= -1)
2: !FALSE
3: 9 < 10
4: 7 != 8

Selection: 1

| That's a job well done!

|========================== | 33%
| What do you think the following expression will evaluate to?: (TRUE != FALSE) == !(6 ==
| 7)

1: %>%
2: FALSE
3: Can there be objective truth when programming?
4: TRUE

Selection: 4

| You are amazing!

|============================ | 35%
| At some point you may need to examine relationships between multiple logical
| expressions. This is where the AND operator and the OR operator come in.

...

|============================= | 37%
| Let's look at how the AND operator works. There are two AND operators in R, & and
| &&. Both operators work similarly, if the right and left operands of AND are both
| TRUE the entire expression is TRUE, otherwise it is FALSE. For example, TRUE & TRUE
| evaluates to TRUE. Try typing FALSE & FALSE to see how it is evaluated.

FALSE&FALSE
[1] FALSE

| Your dedication is inspiring!

|=============================== | 38%
| You can use the & operator to evaluate AND across a vector. The && version of AND
| only evaluates the first member of a vector. Let's test both for practice. Type the
| expression TRUE & c(TRUE, FALSE, FALSE).

TRUE&c(TRUE,FALSE,FALSE)
[1] TRUE FALSE FALSE

| That's a job well done!

|================================ | 40%
| What happens in this case is that the left operand TRUE is recycled across every
| element in the vector of the right operand. This is equivalent to the statement c(TRUE,
| TRUE, TRUE) & c(TRUE, FALSE, FALSE).

...

|================================== | 42%
| Now we'll type the same expression except we'll use the && operator. Type the
| expression TRUE && c(TRUE, FALSE, FALSE).

TRUE&&c(TRUE,FALSE,FALSE)
[1] TRUE

| That's correct!

|=================================== | 44%
| In this case, the left operand is only evaluated with the first member of the right
| operand (the vector). The rest of the elements in the vector aren't evaluated at all in
| this expression.
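
A note for readers on newer R versions: this transcript was recorded on R 3.6.3. To the best of my knowledge, R 4.3.0 and later raise an error when `&&` or `||` is given an operand longer than one element, instead of silently using the first element. The sketch below runs on either behaviour; `try_and` is a hypothetical helper name used only to capture the error.

```r
v <- c(TRUE, FALSE, FALSE)

# Hypothetical helper: returns the result on older R, or the string "error"
# on versions where a length-3 operand to && is rejected.
try_and <- function() {
  tryCatch(TRUE && v, error = function(e) "error")
}
old_style <- suppressWarnings(try_and())
print(old_style)

# Version-independent ways to say what we actually mean:
first_only <- TRUE && v[1]   # compare against the first element only
every_one  <- all(TRUE & v)  # require every element to be TRUE

print(first_only)
print(every_one)
```

Writing `v[1]` or `all(...)` explicitly makes the intent clear and avoids depending on version-specific behaviour.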

...

|===================================== | 46%
| The OR operator follows a similar set of rules. The | version of OR evaluates OR
| across an entire vector, while the || version of OR only evaluates the first member
| of a vector.

...

|====================================== | 48%
| An expression using the OR operator will evaluate to TRUE if the left operand or the
| right operand is TRUE. If both are TRUE, the expression will evaluate to TRUE, however
| if neither are TRUE, then the expression will be FALSE.

...

|======================================== | 50%
| Let's test out the vectorized version of the OR operator. Type the expression TRUE |
| c(TRUE, FALSE, FALSE).

TRUE|c(TRUE, FALSE, FALSE)
[1] TRUE TRUE TRUE

| You're the best!

|========================================== | 52%
| Now let's try out the non-vectorized version of the OR operator. Type the expression
| TRUE || c(TRUE, FALSE, FALSE).

TRUE||c(TRUE, FALSE, FALSE)
[1] TRUE

| You are really on a roll!

|=========================================== | 54%
| Logical operators can be chained together just like arithmetic operators. The
| expressions: 6 != 10 && FALSE && 1 >= 2 or TRUE || 5 < 9.3 || FALSE are perfectly
| normal to see.

...

|============================================= | 56%
| As you may recall, arithmetic has an order of operations and so do logical expressions.
| All AND operators are evaluated before OR operators. Let's look at an example of an
| ambiguous case. Type: 5 > 8 || 6 != 8 && 4 > 3.9

5 > 8 || 6 != 8 && 4 > 3.9
[1] TRUE

| You are quite good my friend!

|============================================== | 58%
| Let's walk through the order of operations in the above case. First the left and right
| operands of the AND operator are evaluated. 6 is not equal to 8, 4 is greater than 3.9,
| therefore both operands are TRUE so the resulting expression TRUE && TRUE evaluates
| to TRUE. Then the left operand of the OR operator is evaluated: 5 is not greater than 8
| so the entire expression is reduced to FALSE || TRUE. Since the right operand of this
| expression is TRUE the entire expression evaluates to TRUE.
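
The order of operations described above can be made visible with parentheses. This is a small sketch; the variable names are mine and not part of the lesson.

```r
# AND binds tighter than OR, so these two expressions are the same.
implicit <- 5 > 8 || 6 != 8 && 4 > 3.9
explicit <- 5 > 8 || (6 != 8 && 4 > 3.9)

# Forcing the OR to be evaluated first can change the answer.
default_grouping <- TRUE || FALSE && FALSE    # TRUE || (FALSE && FALSE)
forced_grouping  <- (TRUE || FALSE) && FALSE  # (TRUE) && FALSE

print(c(implicit = implicit, explicit = explicit))
print(c(default_grouping = default_grouping, forced_grouping = forced_grouping))
```

When an expression mixes AND and OR, adding the parentheses yourself costs nothing and saves the reader from having to remember the precedence rule.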

...

|================================================ | 60%
| Which one of the following expressions evaluates to TRUE?

1: TRUE && FALSE || 9 >= 4 && 3 < 6
2: FALSE || TRUE && FALSE
3: 99.99 > 100 || 45 < 7.3 || 4 != 4.0
4: TRUE && 62 < 62 && 44 >= 44

Selection: 1

| Excellent work!

|================================================= | 62%
| Which one of the following expressions evaluates to FALSE?

1: FALSE && 6 >= 6 || 7 >= 8 || 50 <= 49.5
2: FALSE || TRUE && 6 != 4 || 9 > 4
3: 6 >= -9 && !(6 > 7) && !(!TRUE)
4: !(8 > 4) || 5 == 5.0 && 7.8 >= 7.79

Selection: 1

| You got it!

|=================================================== | 63%
| Now that you're familiar with R's logical operators you can take advantage of a few
| functions that R provides for dealing with logical expressions.

...

|==================================================== | 65%
| The function isTRUE() takes one argument. If that argument evaluates to TRUE, the
| function will return TRUE. Otherwise, the function will return FALSE. Try using this
| function by typing: isTRUE(6 > 4)

isTRUE(6>4)
[1] TRUE

| You got it right!

|====================================================== | 67%
| Which of the following evaluates to TRUE?

1: isTRUE(NA)
2: isTRUE(!TRUE)
3: !isTRUE(4 < 3)
4: isTRUE(3)
5: !isTRUE(8 != 5)

Selection: 5

| You're close...I can feel it! Try it again.

| isTRUE() will only return TRUE if the statement passed to it as an argument is TRUE.

1: isTRUE(NA)
2: !isTRUE(4 < 3)
3: isTRUE(3)
4: !isTRUE(8 != 5)
5: isTRUE(!TRUE)

Selection: 2

| You are amazing!
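
I picked the wrong option on the first attempt, so here is a quick check of each choice. The sketch simply evaluates the five menu expressions, plus one extra case (a length-two vector) that I added for illustration.

```r
results <- c(
  "isTRUE(NA)"           = isTRUE(NA),          # NA is not TRUE
  "isTRUE(!TRUE)"        = isTRUE(!TRUE),       # !TRUE is FALSE
  "!isTRUE(4 < 3)"       = !isTRUE(4 < 3),      # 4 < 3 is FALSE, negated
  "isTRUE(3)"            = isTRUE(3),           # a number is not the logical TRUE
  "!isTRUE(8 != 5)"      = !isTRUE(8 != 5),     # 8 != 5 is TRUE, negated
  "isTRUE(c(TRUE,TRUE))" = isTRUE(c(TRUE, TRUE)) # extra case: not a single TRUE
)
print(results)
```

Only `!isTRUE(4 < 3)` comes out TRUE, which matches the accepted answer.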

|======================================================= | 69%
| The function identical() will return TRUE if the two R objects passed to it as
| arguments are identical. Try out the identical() function by typing: identical('twins',
| 'twins')

identical('twins','twins')
[1] TRUE

| Excellent job!

|========================================================= | 71%
| Which of the following evaluates to TRUE?

1: identical(5 > 4, 3 < 3.1)
2: !identical(7, 7)
3: identical(4, 3.1)
4: identical('hello', 'Hello')

Selection: 1

| You are amazing!
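
One thing the lesson does not spell out is that identical() is stricter than ==. The sketch below is my own illustration: it compares an integer with a double, and two floating point values that print the same but differ slightly in memory.

```r
# == compares values; identical() also compares the storage type.
same_value  <- 1L == 1            # integer 1 and double 1 are equal in value
same_object <- identical(1L, 1)   # but they are not the same kind of object

# Floating point arithmetic: 0.1 + 0.2 is not stored exactly as 0.3.
float_exact <- identical(0.1 + 0.2, 0.3)
float_close <- isTRUE(all.equal(0.1 + 0.2, 0.3))  # comparison with tolerance

print(c(same_value = same_value, same_object = same_object))
print(c(float_exact = float_exact, float_close = float_close))
```

For numeric results, all.equal() wrapped in isTRUE() is usually the safer test; identical() is better suited to checking that two objects are exactly the same.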

|========================================================== | 73%
| You should also be aware of the xor() function, which takes two arguments. The xor()
| function stands for exclusive OR. If one argument evaluates to TRUE and one argument
| evaluates to FALSE, then this function will return TRUE, otherwise it will return
| FALSE. Try out the xor() function by typing: xor(5 == 6, !FALSE)

xor(5 == 6, !FALSE)
[1] TRUE

| Your dedication is inspiring!

|============================================================ | 75%
| 5 == 6 evaluates to FALSE, !FALSE evaluates to TRUE, so xor(FALSE, TRUE) evaluates to
| TRUE. On the other hand if the first argument was changed to 5 == 5 and the second
| argument was unchanged then both arguments would have been TRUE, so xor(TRUE, TRUE)
| would have evaluated to FALSE.
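
The explanation above covers two of the four possible input combinations. Since xor() works element by element on vectors, the full truth table can be sketched in a few lines; the names `left`, `right` and `truth_table` are my own.

```r
# All four combinations of two logical inputs.
left  <- c(TRUE, TRUE, FALSE, FALSE)
right <- c(TRUE, FALSE, TRUE, FALSE)

# xor() is TRUE only when exactly one of the two inputs is TRUE.
truth_table <- data.frame(left, right, xor = xor(left, right))
print(truth_table)
```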

...

|============================================================== | 77%
| Which of the following evaluates to FALSE?

1: xor(!!TRUE, !!FALSE)
2: xor(4 >= 9, 8 != 8.0)
3: xor(identical(xor, 'xor'), 7 == 7.0)
4: xor(!isTRUE(TRUE), 6 > -1)

Selection: 8!=8.0
Enter an item from the menu, or 0 to exit
Selection: 2

| That's the answer I was looking for.

|=============================================================== | 79%
| For the next few questions, we're going to need to create a vector of integers called
| ints. Create this vector by typing: ints <- sample(10)

ints<-sample(10)

| You got it!

|================================================================= | 81%
| Now simply display the contents of ints.

ints
[1] 7 1 3 4 8 2 10 6 9 5

| You are quite good my friend!
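
Since sample(10) is random, your ints will almost certainly differ from mine. If a repeatable result is needed, the usual approach is to call set.seed() first. A minimal sketch, where the seed value 42 is an arbitrary choice of mine:

```r
set.seed(42)              # arbitrary seed, chosen only for this example
first_draw <- sample(10)

set.seed(42)              # resetting the same seed repeats the same draw
second_draw <- sample(10)

print(identical(first_draw, second_draw))

# Whatever the order, sample(10) is always a permutation of 1 to 10.
print(identical(sort(first_draw), 1:10))
```

The exact permutation produced for a given seed can differ between R versions, so I am not showing the drawn values here.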

|================================================================== | 83%
| The vector ints is a random sampling of integers from 1 to 10 without replacement.
| Let's say we wanted to ask some logical questions about contents of ints. If we type
| ints > 5, we will get a logical vector corresponding to whether each element of ints is
| greater than 5. Try typing: ints > 5

ints>5
[1] TRUE FALSE FALSE FALSE TRUE FALSE TRUE TRUE TRUE FALSE

| Perseverance, that's the answer.

|==================================================================== | 85%
| We can use the resulting logical vector to ask other questions about ints. The which()
| function takes a logical vector as an argument and returns the indices of the vector
| that are TRUE. For example which(c(TRUE, FALSE, TRUE)) would return the vector c(1, 3).

...

|===================================================================== | 87%
| Use the which() function to find the indices of ints that are greater than 7.

which(int>7)
Error in which(int > 7) : object 'int' not found
which(ints>7)
[1] 5 7 9

| Great job!

|======================================================================= | 88%
| Which of the following commands would produce the indices of the elements in ints that
| are less than or equal to 2?

1: ints < 2
2: ints <= 2
3: which(ints <= 2)
4: which(ints < 2)

Selection: 3

| All that practice is paying off!
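
To make the difference between a logical vector and the indices returned by which() concrete, the sketch below rebuilds the same ints shown in my session by hand (instead of sampling) and then uses the indices to pull out the matching values.

```r
# The values printed earlier in this session, typed in manually.
ints <- c(7, 1, 3, 4, 8, 2, 10, 6, 9, 5)

big_idx    <- which(ints > 7)   # positions of the elements greater than 7
big_values <- ints[big_idx]     # the elements themselves
small_idx  <- which(ints <= 2)  # positions of the elements that are at most 2

print(big_idx)     # 5 7 9, as in the transcript
print(big_values)  # 8 10 9
print(small_idx)   # 2 6
```

Subsetting with the logical vector directly, as in `ints[ints > 7]`, gives the same values; which() is useful when the positions themselves are what you need.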

|======================================================================== | 90%
| Like the which() function, the functions any() and all() take logical vectors as their
| argument. The any() function will return TRUE if one or more of the elements in the
| logical vector is TRUE. The all() function will return TRUE if every element in the
| logical vector is TRUE.

...

|========================================================================== | 92%
| Use the any() function to see if any of the elements of ints are less than zero.

any(ints<0)
[1] FALSE

| Nice work!

|=========================================================================== | 94%
| Use the all() function to see if all of the elements of ints are greater than zero.

all(ints>0)
[1] TRUE

| You're the best!

|============================================================================= | 96%
| Which of the following evaluates to TRUE?

1: any(ints == 2.5)
2: all(c(TRUE, FALSE, TRUE))
3: all(ints == 10)
4: any(ints == 10)

Selection: 4

| Keep up the great work!
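
The lesson only uses any() and all() on complete vectors. Two edge cases that are worth knowing about, sketched here as my own addition, are missing values and empty vectors.

```r
# With missing values the answer is NA unless the known elements already decide it.
any_decided   <- any(c(NA, TRUE))    # TRUE: one TRUE is enough
any_undecided <- any(c(NA, FALSE))   # NA: the missing value could be TRUE
all_decided   <- all(c(NA, FALSE))   # FALSE: one FALSE is enough
all_undecided <- all(c(NA, TRUE))    # NA: the missing value could be FALSE

# na.rm = TRUE drops the missing values before testing.
any_without_na <- any(c(NA, FALSE), na.rm = TRUE)

# On an empty logical vector, all() is TRUE and any() is FALSE.
all_empty <- all(logical(0))
any_empty <- any(logical(0))

print(c(any_decided, any_undecided, all_decided, all_undecided))
print(c(any_without_na, all_empty, any_empty))
```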

|============================================================================== | 98%
| That's all for this introduction to logic in R. If you really want to see what you can
| do with logic, check out the control flow lesson!

...

|================================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?

1: No
2: Yes

Selection: 2
What is your email address? xxxxxx@xxxxxxxxxxxx
What is your assignment token? xXxXxxXXxXxxXXXx
Grade submission succeeded!

| That's a job well done!

| You've reached the end of this lesson! Returning to the main menu...

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 0

| Leaving swirl now. Type swirl() to resume.

ls()
[1] "ints"
rm(list=ls())

Last updated 2020-04-15 14:05:54.787418 IST

Matrices and Data Frames

swirl()

| Welcome to swirl! Please sign in. If you've been here before, use the same name as you
| did then. If you are new, call yourself something unique.

What shall I call you? Krishnakanth Allika

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 1

| Please choose a lesson, or type 0 to return to course menu.

1: Basic Building Blocks 2: Workspace and Files 3: Sequences of Numbers
4: Vectors 5: Missing Values 6: Subsetting Vectors
7: Matrices and Data Frames 8: Logic 9: Functions
10: lapply and sapply 11: vapply and tapply 12: Looking at Data
13: Simulation 14: Dates and Times 15: Base Graphics

Selection: 7

| | 0%

| In this lesson, we'll cover matrices and data frames. Both represent 'rectangular' data
| types, meaning that they are used to store tabular data, with rows and columns.

...

|== | 3%
| The main difference, as you'll see, is that matrices can only contain a single class of
| data, while data frames can consist of many different classes of data.

...

|==== | 6%
| Let's create a vector containing the numbers 1 through 20 using the : operator. Store
| the result in a variable called my_vector.

my_vector<-1:20

| That's a job well done!

|======= | 8%
| View the contents of the vector you just created.

my_vector
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

| You got it right!

|========= | 11%
| The dim() function tells us the 'dimensions' of an object. What happens if we do
| dim(my_vector)? Give it a try.

dim(my_vector)
NULL

| Keep up the great work!

|=========== | 14%
| Clearly, that's not very helpful! Since my_vector is a vector, it doesn't have a dim
| attribute (so it's just NULL), but we can find its length using the length() function.
| Try that now.

length(my_vector)
[1] 20

| Excellent job!

|============= | 17%
| Ah! That's what we wanted. But, what happens if we give my_vector a dim attribute?
| Let's give it a try. Type dim(my_vector) <- c(4, 5).

dim(my_vector)<-c(4,5)

| That's the answer I was looking for.

|================ | 19%
| It's okay if that last command seemed a little strange to you. It should! The dim()
| function allows you to get OR set the dim attribute for an R object. In this case, we
| assigned the value c(4, 5) to the dim attribute of my_vector.

...

|================== | 22%
| Use dim(my_vector) to confirm that we've set the dim attribute correctly.

dim(my_vector)
[1] 4 5

| Great job!

|==================== | 25%
| Another way to see this is by calling the attributes() function on my_vector. Try it
| now.

attributes(my_vector)
$dim
[1] 4 5

| Great job!

|====================== | 28%
| Just like in math class, when dealing with a 2-dimensional object (think rectangular
| table), the first number is the number of rows and the second is the number of columns.
| Therefore, we just gave my_vector 4 rows and 5 columns.

...

|======================== | 31%
| But, wait! That doesn't sound like a vector any more. Well, it's not. Now it's a
| matrix. View the contents of my_vector now to see what it looks like.

my_vector
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20

| You got it right!

|=========================== | 33%
| Now, let's confirm it's actually a matrix by using the class() function. Type
| class(my_vector) to see what I mean.

class(my_vector)
[1] "matrix"

| You are quite good my friend!
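
The output above is from R 3.6.3. As far as I know, from R 4.0.0 onwards class() on a matrix returns two values, "matrix" and "array", so a check written as `class(x) == "matrix"` can behave differently there. The sketch below shows two checks that should work on both old and new versions.

```r
# Build the matrix the same way as in the lesson.
m <- 1:20
dim(m) <- c(4, 5)

# is.matrix() does not depend on how many class names are reported.
check_function <- is.matrix(m)

# Testing membership also works whether class(m) has one or two elements.
check_membership <- "matrix" %in% class(m)

print(check_function)
print(check_membership)
```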

|============================= | 36%
| Sure enough, my_vector is now a matrix. We should store it in a new variable that helps
| us remember what it is. Store the value of my_vector in a new variable called
| my_matrix.

my_matrix<-my_vector

| That's correct!

|=============================== | 39%
| The example that we've used so far was meant to illustrate the point that a matrix is
| simply an atomic vector with a dimension attribute. A more direct method of creating
| the same matrix uses the matrix() function.

...

|================================= | 42%
| Bring up the help file for the matrix() function now using the ? function.

?matrix

| That's correct!

|==================================== | 44%
| Now, look at the documentation for the matrix function and see if you can figure out
| how to create a matrix containing the same numbers (1-20) and dimensions (4 rows, 5
| columns) by calling the matrix() function. Store the result in a variable called
| my_matrix2.

my_matrix2<-matrix(data=1:20,nrow = 4,ncol = 5)

| You are quite good my friend!

|====================================== | 47%
| Finally, let's confirm that my_matrix and my_matrix2 are actually identical. The
| identical() function will tell us if its first two arguments are the same. Try it out.

identical(my_matrix,my_matrix2)
[1] TRUE

| You are amazing!
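
Both approaches fill the matrix column by column, which is why 1 to 4 ends up in the first column. The matrix() function also has a byrow argument that fills row by row instead. A small sketch comparing the two; the names `by_col` and `by_row` are mine.

```r
# Default: fill down the columns (this matches my_matrix and my_matrix2).
by_col <- matrix(1:20, nrow = 4, ncol = 5)

# byrow = TRUE: fill across the rows instead.
by_row <- matrix(1:20, nrow = 4, ncol = 5, byrow = TRUE)

print(by_col[2, 3])  # 10: second row, third column when filled by column
print(by_row[2, 3])  # 8: the same position when filled by row
print(by_row[1, ])   # first row is 1 2 3 4 5
```

Same numbers and same dimensions, but the two matrices are not identical, so it pays to check which layout the data actually needs.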

|======================================== | 50%
| Now, imagine that the numbers in our table represent some measurements from a clinical
| experiment, where each row represents one patient and each column represents one
| variable for which measurements were taken.

...

|========================================== | 53%
| We may want to label the rows, so that we know which numbers belong to each patient in
| the experiment. One way to do this is to add a column to the matrix, which contains the
| names of all four people.

...

|============================================ | 56%
| Let's start by creating a character vector containing the names of our patients --
| Bill, Gina, Kelly, and Sean. Remember that double quotes tell R that something is a
| character string. Store the result in a variable called patients.

patients<-c("Bill","Gina","Kelly","Sean")

| That's correct!

|=============================================== | 58%
| Now we'll use the cbind() function to 'combine columns'. Don't worry about storing the
| result in a new variable. Just call cbind() with two arguments -- the patients vector
| and my_matrix.

cbind(patients,my_matrix)
     patients                       
[1,] "Bill"   "1" "5" "9"  "13" "17"
[2,] "Gina"   "2" "6" "10" "14" "18"
[3,] "Kelly"  "3" "7" "11" "15" "19"
[4,] "Sean"   "4" "8" "12" "16" "20"

| All that practice is paying off!

|================================================= | 61%
| Something is fishy about our result! It appears that combining the character vector
| with our matrix of numbers caused everything to be enclosed in double quotes. This
| means we're left with a matrix of character strings, which is no good.

...

|=================================================== | 64%
| If you remember back to the beginning of this lesson, I told you that matrices can only
| contain ONE class of data. Therefore, when we tried to combine a character vector with
| a numeric matrix, R was forced to 'coerce' the numbers to characters, hence the double
| quotes.

...

|===================================================== | 67%
| This is called 'implicit coercion', because we didn't ask for it. It just happened. But
| why didn't R just convert the names of our patients to numbers? I'll let you ponder
| that question on your own.
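
My own take on that question, offered as a sketch and not as the lesson's official answer: every number can be written as a character string, but not every string can be turned back into a number, so converting towards character loses nothing.

```r
# Mixing a number and a name in one vector coerces everything to character.
mixed <- c(1, "Bill")
print(class(mixed))
print(mixed)

# Going the other way fails for the name: it becomes NA (R also warns,
# which is suppressed here to keep the output clean).
back_to_numbers <- suppressWarnings(as.numeric(mixed))
print(back_to_numbers)
```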

...

|======================================================== | 69%
| So, we're still left with the question of how to include the names of our patients in
| the table without destroying the integrity of our numeric data. Try the following --
| my_data <- data.frame(patients, my_matrix)

my_data<-data.frame(patients,my_matrix)

| Your dedication is inspiring!

|========================================================== | 72%
| Now view the contents of my_data to see what we've come up with.

my_data
  patients X1 X2 X3 X4 X5
1     Bill  1  5  9 13 17
2     Gina  2  6 10 14 18
3    Kelly  3  7 11 15 19
4     Sean  4  8 12 16 20

| You are doing so well!

|============================================================ | 75%
| It looks like the data.frame() function allowed us to store our character vector of
| names right alongside our matrix of numbers. That's exactly what we were hoping for!

...

|============================================================== | 78%
| Behind the scenes, the data.frame() function takes any number of arguments and returns
| a single object of class data.frame that is composed of the original objects.

...

|================================================================ | 81%
| Let's confirm this by calling the class() function on our newly created data frame.

class(my_data)
[1] "data.frame"

| Excellent work!
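
To see that the data frame really keeps a separate class per column, the sketch below rebuilds my_data and asks each column for its class. I pass `stringsAsFactors = FALSE` explicitly, which is my assumption about what is wanted here: on R 3.6.3 the default would turn the names into a factor, while newer versions keep them as character.

```r
patients  <- c("Bill", "Gina", "Kelly", "Sean")
my_matrix <- matrix(1:20, nrow = 4, ncol = 5)

# Keep the patient names as plain character strings on any R version.
my_data <- data.frame(patients, my_matrix, stringsAsFactors = FALSE)

# One class per column: character for the names, integer for the measurements.
col_classes <- sapply(my_data, class)
print(col_classes)
```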

|=================================================================== | 83%
| It's also possible to assign names to the individual rows and columns of a data frame,
| which presents another possible way of determining which row of values in our table
| belongs to each patient.

...

|===================================================================== | 86%
| However, since we've already solved that problem, let's solve a different problem by
| assigning names to the columns of our data frame so that we know what type of
| measurement each column represents.

...

|======================================================================= | 89%
| Since we have six columns (including patient names), we'll need to first create a
| vector containing one element for each column. Create a character vector called cnames
| that contains the following values (in order) -- "patient", "age", "weight", "bp",
| "rating", "test".

cnames<-c("patient", "age", "weight", "bp", "rating", "test")

| Excellent job!

|========================================================================= | 92%
| Now, use the colnames() function to set the colnames attribute for our data frame.
| This is similar to the way we used the dim() function earlier in this lesson.

colnames(my_data)<-cnames

| Excellent work!

|============================================================================ | 94%
| Let's see if that got the job done. Print the contents of my_data.

my_data
  patient age weight bp rating test
1    Bill   1      5  9     13   17
2    Gina   2      6 10     14   18
3   Kelly   3      7 11     15   19
4    Sean   4      8 12     16   20

| Excellent work!
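
The lesson mentions that rows can be named too but does not show it. Here is a minimal sketch of that alternative, using rownames() in the same way colnames() was used above. It builds a fresh data frame from the numbers only, so the patient names live in the row labels instead of a column.

```r
# Numeric measurements only; the patient names become row names.
my_data <- data.frame(matrix(1:20, nrow = 4, ncol = 5))
colnames(my_data) <- c("age", "weight", "bp", "rating", "test")
rownames(my_data) <- c("Bill", "Gina", "Kelly", "Sean")

print(my_data)

# Rows and columns can now both be looked up by name.
gina_bp <- my_data["Gina", "bp"]
print(gina_bp)
```

Whether names belong in a column or in the row labels is a design choice; a column is usually easier to filter and merge on, which may be why the lesson went that way.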

|============================================================================== | 97%
| In this lesson, you learned the basics of working with two very important and common
| data structures -- matrices and data frames. There's much more to learn and we'll be
| covering more advanced topics, particularly with respect to data frames, in future
| lessons.

...

|================================================================================| 100%
| Would you like to receive credit for completing this course on Coursera.org?

1: No
2: Yes

Selection: 2
What is your email address? xxxxxx@xxxxxxxxxxxx
What is your assignment token? xXxXxxXXxXxxXXXx
Grade submission succeeded!

| You got it right!

| You've reached the end of this lesson! Returning to the main menu...

| Please choose a course, or type 0 to exit swirl.

1: R Programming
2: Take me to the swirl course repository!

Selection: 0

| Leaving swirl now. Type swirl() to resume.

ls()
[1] "cnames" "my_data" "my_matrix" "my_matrix2" "my_vector" "patients"
rm(list=ls())

Last updated 2020-04-14 10:18:05.910590 IST