p.7
Sampling Techniques and Bias

What is a convenience sample?

A convenience sample is a sample in which individuals who are easily accessible are more likely to be included.

p.12
Relationships Among Variables

What is a confounding variable?

A confounding variable is an external variable that affects both the independent and dependent variables, potentially leading to a false assumption of causality between them. In the example, temperature is a confounding variable that affects both ice cream sales and the number of shark attacks.

p.20
Introduction to Data Analysis

What is R?

R is a programming language and free software environment used for statistical computing and graphics.

p.25
Relationships Among Variables

What is the correlation between two variables?

Correlation is a statistical measure that expresses the extent to which two variables are linearly related. It ranges from -1 to 1, where 1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship.
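As a small illustration of these three reference values, the sketch below uses base R's cor() on toy vectors that are invented for this example (they do not come from the course material).

```r
# Toy vectors, invented for illustration only
x <- c(1, 2, 3, 4, 5)
y_pos <- 2 * x + 1       # exact increasing linear function of x
y_neg <- -3 * x + 10     # exact decreasing linear function of x
y_sym <- (x - 3)^2       # clear dependence on x, but symmetric and non-linear

r_pos <- cor(x, y_pos)   # perfect positive linear relationship
r_neg <- cor(x, y_neg)   # perfect negative linear relationship
r_sym <- cor(x, y_sym)   # no linear relationship, although y_sym depends on x
```

The last case is a reminder that a correlation of 0 rules out only a linear relationship, not every kind of dependence.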

p.38
Using ggplot2 for Data Visualization

What is the purpose of the 'caption' argument in the 'labs' function in ggplot2?

The 'caption' argument in the 'labs' function in ggplot2 is used to add a caption to the plot, typically to provide information about the data source.

p.36
Using ggplot2 for Data Visualization

What does the 'labs' function do in ggplot2?

The 'labs' function in ggplot2 is used to add labels to the plot, including the title, subtitle, and axis labels.

p.5
Introduction to Data Analysis

What is an explanatory variable?

An explanatory variable is a variable that is suspected to causally affect another variable, which is labeled as the response variable.

p.32
Using ggplot2 for Data Visualization

What function is used to create the mapping from dataset variables to the plot’s aesthetics in ggplot2?

The function aes() is used to create the mapping from dataset variables to the plot’s aesthetics in ggplot2.

p.39
Using ggplot2 for Data Visualization

What is the purpose of using the 'scale_colour_viridis_d()' function in ggplot2?

The 'scale_colour_viridis_d()' function in ggplot2 applies a discrete viridis colour scale, which is designed to remain distinguishable for viewers with common forms of colour blindness.
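A minimal sketch of where this scale sits in a plot, assuming ggplot2 is installed. The data frame `toy` is a hypothetical stand-in for the penguin data with invented values.

```r
library(ggplot2)

# Hypothetical stand-in data; all values are invented for illustration
toy <- data.frame(
  species = rep(c("Adelie", "Chinstrap", "Gentoo"), each = 3),
  bill_length_mm = c(38, 40, 39, 47, 50, 48, 45, 49, 46),
  bill_depth_mm = c(18, 19, 18, 18, 19, 17, 14, 16, 14)
)

# colour is mapped to a discrete variable, so the discrete viridis scale applies
p <- ggplot(toy, aes(x = bill_depth_mm, y = bill_length_mm, colour = species)) +
  geom_point() +
  scale_colour_viridis_d()
```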

p.46
Using ggplot2 for Data Visualization

geom_point

A function in ggplot2 used to create scatter plots. It adds points to a plot, with options to adjust size and transparency (alpha).

p.17
Principles of Experimental Design

What are treatment variables?

Treatment variables are conditions we can impose on the experimental units.

p.34
Using ggplot2 for Data Visualization

What function is used to add a title to a plot in ggplot2?

The `labs()` function is used to add a title to a plot in ggplot2. For example, `labs(title = "Bill depth and length")`.
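The call quoted above can be dropped into a complete plot as follows. This is only a sketch: the data frame `toy`, its values, and the subtitle, axis and caption texts are assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical stand-in data; all values are invented for illustration
toy <- data.frame(
  bill_length_mm = c(38, 40, 47, 50, 45, 49),
  bill_depth_mm = c(18, 19, 18, 19, 14, 16)
)

p <- ggplot(toy, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
  labs(
    title = "Bill depth and length",       # title from the card above
    subtitle = "Toy data for illustration",
    x = "Bill depth (mm)",
    y = "Bill length (mm)",
    caption = "Source: invented values"    # caption usually names the data source
  )
```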

p.31
Using ggplot2 for Data Visualization

What function is used to initialize a plot in ggplot2?

The function used to initialize a plot in ggplot2 is ggplot().

p.3
Types of Variables in Data

What is a numerical variable?

A numerical variable can take a wide range of numerical values, and it is sensible to add, subtract, or take averages with those values.

p.4
Relationships Among Variables

What are associated or dependent variables?

When two variables show some connection with one another, they are called associated or dependent variables.

p.21
Types of Variables in Data

What is an observation in a dataset?

Each row in a dataset is an observation.

p.35
Using ggplot2 for Data Visualization

What is the function of the 'labs' function in ggplot2?

The 'labs' function in ggplot2 is used to add labels to the plot, such as the title, subtitle, and axis labels.

p.45
Using ggplot2 for Data Visualization

What is 'mapping' in ggplot2?

'Mapping' in ggplot2 refers to the process of linking data variables to visual properties (aesthetics) of the plot, such as x and y coordinates, size, color, and alpha. This is done using the aes() function.
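A short sketch of such a mapping, assuming ggplot2 is installed. The data frame `toy` and its values are invented. Note the difference between mapping a variable inside aes() and setting a fixed value (here alpha) outside it.

```r
library(ggplot2)

# Hypothetical stand-in data; all values are invented for illustration
toy <- data.frame(
  species = rep(c("Adelie", "Gentoo"), each = 3),
  bill_length_mm = c(38, 40, 39, 45, 49, 46),
  bill_depth_mm = c(18, 19, 18, 14, 16, 14),
  body_mass_g = c(3700, 3800, 3600, 5000, 5400, 5100)
)

p <- ggplot(
  data = toy,
  # variables are linked to aesthetics inside aes()
  mapping = aes(
    x = bill_depth_mm,
    y = bill_length_mm,
    colour = species,
    size = body_mass_g
  )
) +
  geom_point(alpha = 0.6)  # fixed transparency, set outside aes()
```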

p.28
Using ggplot2 for Data Visualization

What does the `geom_point()` function do in ggplot2?

`geom_point()` is a function in ggplot2 that adds a layer of points to a plot, creating a scatter plot. Each point represents an observation in the dataset.

p.49
Using ggplot2 for Data Visualization

What is 'facet_grid' used for in ggplot2?

'facet_grid' is used in ggplot2 to create a grid of plots based on the values of two categorical variables, allowing for the comparison of data across these variables.

p.48
Using ggplot2 for Data Visualization

What is faceting by species and sex in ggplot2?

Faceting by species and sex in ggplot2 involves creating a grid of plots where each plot represents a subset of the data divided by the levels of two categorical variables, in this case, species and sex. This allows for the comparison of relationships within each subset.
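One way such a grid might be written is sketched below with facet_grid(), putting species in rows and sex in columns. The data frame `toy` is invented for illustration and the exact call used in the course material may differ.

```r
library(ggplot2)

# Hypothetical stand-in data; all values are invented for illustration
toy <- data.frame(
  species = rep(c("Adelie", "Chinstrap", "Gentoo"), each = 4),
  sex = rep(c("female", "male"), times = 6),
  bill_length_mm = c(38, 40, 39, 41, 47, 50, 48, 51, 45, 49, 46, 50),
  bill_depth_mm = c(18, 19, 18, 20, 18, 19, 17, 20, 14, 16, 14, 15)
)

# rows ~ columns: one panel per combination of species and sex
p <- ggplot(toy, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
  facet_grid(species ~ sex)
```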

p.15
Sampling Techniques and Bias

What is multistage sampling?

Multistage sampling involves taking a simple random sample of clusters and then taking a simple random sample within each sampled cluster. It can be more economical than other sampling techniques and is most useful when there is large case-to-case variability within a cluster, but the clusters themselves do not look very different from one another.
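The two stages can be sketched in base R as follows. The population, the cluster sizes and the sample sizes are all hypothetical choices for this example.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical population: 200 cases in 20 clusters of 10 cases each
pop <- data.frame(id = 1:200, cluster = rep(1:20, each = 10))

# Stage 1: simple random sample of 5 clusters
chosen_clusters <- sample(unique(pop$cluster), size = 5)

# Stage 2: simple random sample of 3 cases within each sampled cluster
picked <- unlist(lapply(chosen_clusters, function(cl) {
  sample(pop$id[pop$cluster == cl], size = 3)
}))

multi_sample <- pop[pop$id %in% picked, ]
```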

p.10
Observational Studies vs. Experiments

What is an observational study?

An observational study is a type of research in which data are collected without directly interfering with how the data arise; researchers merely observe. Such a study can establish only an association between the explanatory and the response variables, not a causal connection.

p.29
Using ggplot2 for Data Visualization

What is the main function in ggplot2 that initializes the plot?

ggplot() is the main function in ggplot2. It initializes the plot. The different layers of the plots are then added consecutively.

p.23
Introduction to Data Analysis

What is Exploratory Data Analysis (EDA)?

Exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics. Often, EDA is visual, but it might also involve calculating summary statistics and performing data transformation.

p.40
Data Visualization Techniques

What are commonly used aesthetics in a graphic?

Commonly used aesthetics of a graphic are colour, shape, size, or alpha (transparency).

p.6
Introduction to Data Analysis

What is anecdotal evidence?

Anecdotal evidence is evidence based on a limited sample size that might not be representative of the population, often relying on personal stories or isolated examples.

p.37
Using ggplot2 for Data Visualization

What does the 'colour' aesthetic in ggplot2 represent in the provided code?

In the provided ggplot2 code, the 'colour' aesthetic represents the 'species' of the penguins, which differentiates the points on the scatter plot by species.

p.44
Using ggplot2 for Data Visualization

What does the alpha aesthetic do in ggplot2?

The alpha aesthetic introduces different levels of transparency.

p.11
Relationships Among Variables

What are confounding variables?

Extraneous variables that affect both the explanatory and the response variable and that make it seem like there is a relationship between the two.

p.13
Sampling Techniques and Bias

What is simple random sampling?

Simple random sampling is a technique where cases are randomly selected from the population without any implied connection between the selected points. Each case in the population has an equal chance of being included in the final sample.
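In base R this corresponds to a single call to sample(), sketched below with a hypothetical population of 1000 case IDs.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

population <- 1:1000  # hypothetical case IDs

# sample() draws without replacement by default, and every case
# has the same chance of ending up in the sample
srs <- sample(population, size = 50)
```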

p.16
Principles of Experimental Design

What is the principle of 'Block' in experimental design?

If there are variables that are known or suspected to affect the response variable, first group subjects into blocks based on these variables, and then randomize cases within each block to treatment groups.
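A minimal base R sketch of this procedure, assuming a hypothetical blocking variable `risk` with two levels and two treatment groups of equal size within each block.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical subjects, blocked on a variable suspected to affect the response
subjects <- data.frame(id = 1:12, risk = rep(c("low", "high"), each = 6))
subjects$group <- NA_character_

for (b in unique(subjects$risk)) {
  rows <- which(subjects$risk == b)
  # Within the block, shuffle an equal number of treatment and control labels
  labels <- rep(c("treatment", "control"), length.out = length(rows))
  subjects$group[rows] <- sample(labels)
}
```

Because randomization happens separately inside each block, both treatment groups end up with the same number of low-risk and high-risk subjects.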

p.33
Using ggplot2 for Data Visualization

What does the 'geom_point()' function do in ggplot2?

The 'geom_point()' function in ggplot2 adds a layer of points to a plot, which is useful for creating scatter plots.

p.43
Using ggplot2 for Data Visualization

What type of variable can be used for the shape aesthetic in ggplot2?

The values of shape can only be specified by a discrete variable. Using a continuous variable will lead to an error.
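The difference can be sketched as follows, assuming ggplot2 is installed. The data frame `toy` is invented for illustration. The error for the continuous variable is raised when the plot is built (for example when it is printed), not when it is defined, so the sketch builds it inside tryCatch().

```r
library(ggplot2)

# Hypothetical stand-in data; all values are invented for illustration
toy <- data.frame(
  island = rep(c("Biscoe", "Dream", "Torgersen"), each = 2),
  bill_length_mm = c(38, 40, 47, 50, 45, 49),
  bill_depth_mm = c(18, 19, 18, 19, 14, 16),
  body_mass_g = c(3700, 3800, 3600, 5000, 5400, 5100)
)

# Discrete variable mapped to shape: builds without problems
p_ok <- ggplot(toy, aes(bill_depth_mm, bill_length_mm, shape = island)) +
  geom_point()
built_ok <- ggplot_build(p_ok)

# Continuous variable mapped to shape: building the plot fails
p_bad <- ggplot(toy, aes(bill_depth_mm, bill_length_mm, shape = body_mass_g)) +
  geom_point()
bad_result <- tryCatch(ggplot_build(p_bad), error = function(e) "error")
```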

p.1
Statistical Significance and Random Variation

What is random variation?

Random variation refers to the natural fluctuations that occur in any data-generating process. For example, when flipping a coin 100 times, while the chance of landing heads in any given flip is 50%, we probably won’t observe exactly 50 heads. This type of fluctuation is part of almost any type of data-generating process.
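The coin example can be simulated in base R. The sketch below repeats the 100-flip experiment ten times (the number of repetitions is an arbitrary choice) and also computes the exact probability of observing exactly 50 heads.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Number of heads in each of 10 repetitions of 100 fair coin flips
heads <- rbinom(n = 10, size = 100, prob = 0.5)

# Exact probability of getting exactly 50 heads in 100 flips
p_exactly_50 <- dbinom(50, size = 100, prob = 0.5)
```

The exact probability is only about 8%, so counts that differ from 50 are the norm rather than the exception.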

p.42
Using ggplot2 for Data Visualization

What is the function of the 'aes()' call in ggplot()?

The 'aes()' call, passed to the 'mapping' argument of ggplot(), specifies the aesthetic mappings, such as which variables to map to the x and y axes, and which variables to use for colour, shape, and other visual properties.

p.30
Data Visualization Techniques

What is the 'glimpse' function used for in R?

The 'glimpse' function in R is used to provide a quick overview of a data frame, displaying the number of rows and columns, and a preview of the data in each column.

p.41
Using ggplot2 for Data Visualization

shape

The shape aesthetic controls the symbol used to draw each point. In the example, in addition to specifying colour with respect to species, shape is defined based on island.

p.27
Using ggplot2 for Data Visualization

What does the 'gg' in ggplot2 stand for?

The 'gg' in ggplot2 stands for Grammar of Graphics, which is a tool that enables us to concisely describe the components of a graphic.

p.50
Using ggplot2 for Data Visualization

What does the function facet_wrap() do in ggplot2?

facet_wrap() creates faceted plots by wrapping a one-dimensional sequence of panels into a two-dimensional layout. The number of columns (or rows) in the output can be specified with the ncol (or nrow) argument.
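A minimal sketch using the ncol argument of facet_wrap(), assuming ggplot2 is installed. The data frame `toy` and the choice of two columns are made up for this example.

```r
library(ggplot2)

# Hypothetical stand-in data; all values are invented for illustration
toy <- data.frame(
  species = rep(c("Adelie", "Chinstrap", "Gentoo"), each = 3),
  bill_length_mm = c(38, 40, 39, 47, 50, 48, 45, 49, 46),
  bill_depth_mm = c(18, 19, 18, 18, 19, 17, 14, 16, 14)
)

# One panel per species, wrapped into a layout with 2 columns
p <- ggplot(toy, aes(x = bill_depth_mm, y = bill_length_mm)) +
  geom_point() +
  facet_wrap(~ species, ncol = 2)
```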

p.18
Principles of Experimental Design

What is a placebo?

A placebo is a fake treatment, often given to the control group in medical studies.

p.47
Data Visualization Techniques

What is faceting in data visualization?

Faceting means creating smaller plots that display different subsets of the data. It is useful for exploring conditional relationships and large data.

p.14
Sampling Techniques and Bias

What is stratified sampling?

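In stratified sampling, the population is divided into groups called strata, and a sample (usually a simple random sample) is then drawn within each stratum. It is especially useful when the cases in each stratum are very similar in terms of the outcome of interest.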
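A minimal base R sketch of stratified sampling, assuming the usual procedure of grouping the population into strata and drawing a simple random sample within each one. The population, strata names and sample sizes are hypothetical.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical population: 300 cases in three strata of 100 cases each
pop <- data.frame(id = 1:300, stratum = rep(c("A", "B", "C"), each = 100))

# Split the case IDs by stratum, then draw 10 cases at random in each stratum
picked <- unlist(lapply(split(pop$id, pop$stratum), sample, size = 10))

strat_sample <- pop[pop$id %in% picked, ]
```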
