EduNinja

IB Maths AI SL 4.1 Statistics and probability - SL content Question Bank

Question 1

[Maximum number: 11]

Billy is a keen walker who keeps a record of his performance. The following table shows the time, in minutes, it takes him to walk one kilometre up hills with different gradients. The gradient of each hill is constant.

Table

Question 1(a)

Question 1(a)(i)

(a)
(i)

Find the equation of the regression line of T on G.

[ 4 ]

Question 1(a)(ii)

(ii)

Describe the correlation between T and G with reference to the value of r, the Pearson's product-moment correlation coefficient.

On Sunday, Billy intends to walk up a hill with a gradient of 13 %.

[ 4 ]

Question 1(b)

(b)

Estimate the time it will take Billy to walk one kilometre up the hill.

This morning, Billy walked one kilometre up a hill, and it took 22 minutes.

[ 2 ]

Question 1(c)

(c)

Explain why it would be inappropriate to use the equation found in part (a) to estimate the gradient of this hill.

[ 1 ]
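As a hedged illustration of parts (a) and (b), the sketch below fits a least-squares line of T on G and computes Pearson's r. The data table is not reproduced on this page, so `gradients` and `times` are hypothetical placeholder values, not Billy's data, and the helper name `fit_line` is an assumption made for this example.

```python
from math import sqrt

def fit_line(x, y):
    """Return (slope, intercept, r) for the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical data, NOT the values from the question's table.
gradients = [3, 6, 9, 12, 15]   # G, in percent
times = [9, 11, 14, 15, 18]     # T, in minutes per kilometre

slope, intercept, r = fit_line(gradients, times)

# Part (b) style estimate: predicted T for a gradient of 13 %.
estimate_at_13 = slope * 13 + intercept
```

A line of T on G is fitted to predict T from a given G, so it is generally not suitable for working backwards from a time to a gradient.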

Question 1

[Maximum number: 15]

Dr Petrillo wrote a short scientific essay. He analysed the readability of his essay by counting the number of letters in each word.
Dr Petrillo constructs a box and whisker diagram for his data.

Question image

Question 1(a)

(a)

Write down

[ 3 ]

Question 1(a)(i)

(i)

the median;

Question 1(a)(ii)

(ii)

the upper quartile, $\mathrm{Q}_{3}$;

Question 1(a)(iii)

(iii)

the interquartile range, IQR.

Dr Petrillo now wants to modify his diagram to show any outliers. He considers the longer words in his data and uses the following formula:

$\text{outliers} > (1.5 \times \mathrm{IQR}) + \mathrm{Q}_{3}$

Words with at least k letters are considered outliers.

[ 3 ]

Question 1(b)

(b)

Find the value of k.

[ 2 ]
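The outlier rule above can be sketched as a short calculation. The quartiles must be read from the box and whisker diagram, which is not reproduced here, so the values below are hypothetical, and the function names `outlier_cutoff` and `smallest_outlier_length` are assumptions made for this example.

```python
import math

def outlier_cutoff(q1, q3):
    """Return the upper fence Q3 + 1.5 * IQR."""
    iqr = q3 - q1
    return q3 + 1.5 * iqr

def smallest_outlier_length(q1, q3):
    """Smallest whole number of letters strictly greater than the fence."""
    return math.floor(outlier_cutoff(q1, q3)) + 1

# Hypothetical quartiles, NOT read from the question's diagram.
example_q1, example_q3 = 3, 6

fence = outlier_cutoff(example_q1, example_q3)
k_example = smallest_outlier_length(example_q1, example_q3)
```

Word lengths are whole numbers, so "at least k letters" corresponds to the first integer above the fence.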

Question 1(c)

(c)

Dr Petrillo further considers the outliers and sees no reason to exclude them from his analysis.
The length of each word in the essay, n, and its associated frequency are given in the following table.

Table

Use the mid-interval values to calculate an estimate of the mean number of letters in each word.

Dr Petrillo conducts a $\chi^{2}$ goodness of fit test at the 1 % significance level, to test the following null hypothesis:
$\mathrm{H}_{0}$ : The frequency of the number of letters in each word in his essay is consistent with the English language.

[ 3 ]

Question 1(d)

(d)

Write down the alternative hypothesis for this test.

[ 1 ]

Question 1(e)

(e)

The observed and expected frequencies of the number of letters in each word in his essay are listed in the following table.

Table
[ 4 ]

Question 1(e)(i)

(i)

Write down the number of degrees of freedom.

Question 1(e)(ii)

(ii)

Find the $\chi^{2}$ statistic for this test.

Question 1(e)(iii)

(iii)

Find the p-value for this test.

The critical value for this test, at the 1 % significance level, is 16.812.

[ 4 ]

Question 1(f)

(f)

State whether Dr Petrillo should reject the null hypothesis. Justify your answer.

[ 2 ]
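A minimal sketch of the goodness of fit calculation and the decision rule in parts (e) and (f) follows. The observed and expected frequencies are not reproduced on this page, so the lists below are hypothetical. Only the critical value 16.812 comes from the question. The function names are assumptions made for this example.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def reject_null(statistic, critical_value):
    """Reject H0 when the test statistic exceeds the critical value."""
    return statistic > critical_value

# Hypothetical frequencies, NOT the values from the question's table.
observed = [10, 20, 30, 25, 15, 12, 8]
expected = [12, 18, 28, 26, 16, 11, 9]

# Goodness of fit with no estimated parameters: categories minus one.
degrees_of_freedom = len(observed) - 1

CRITICAL_VALUE = 16.812  # given in the question at the 1 % level

statistic = chi_squared_statistic(observed, expected)
decision = reject_null(statistic, CRITICAL_VALUE)
```

The same conclusion can be reached by comparing the p-value with 0.01: the null hypothesis is rejected only when the p-value is smaller than the significance level.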

Question 1

[Maximum number: 5]

Consider the following list of 10 data items.

16, 19, 21, 19, 18, 12, 16, 21, 21, 32

This data has a mean of 19.5.
Find the value of the

Question 1(a)

(a)

mode.

[ 1 ]

Question 1(b)

(b)

median.

[ 1 ]

Question 1(c)

(c)

standard deviation.

[ 1 ]

Question 1(d)

(d)

range.

[ 2 ]
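The four summary values can be checked with Python's standard `statistics` module, as sketched below. One assumption is labelled here: the standard deviation is taken to be the population value (the one a graphic display calculator reports as σx), so `pstdev` is used rather than `stdev`.

```python
import statistics

# The 10 data items listed in the question.
data = [16, 19, 21, 19, 18, 12, 16, 21, 21, 32]

mode = statistics.mode(data)         # most frequent value
median = statistics.median(data)     # mean of the two middle values
std_dev = statistics.pstdev(data)    # population standard deviation (assumed)
data_range = max(data) - min(data)   # largest value minus smallest value
```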

Question 1

[Maximum number: 18]

Paul has a bar graph for the total number of goals scored in each game of a soccer tournament in 2024. The bar graph is shown below; however, the frequency of 4 goals in a game is unreadable.
Paul uses this bar graph to create a frequency table.

Question image
Frequency table


Question 1(a)

(a)

Write down the value of k.

Paul knows that the mean number of goals per game scored during the tournament was 2.2.

[ 1 ]

Question 1(b)

Question 1(b)(i)

(b)
(i)

Write down an equation for the mean in terms of p.

[ 3 ]

Question 1(b)(ii)

(ii)

Determine the value of p.

Data for the number of goals per game in the 2025 soccer tournament are shown in the following box and whisker diagram.

Question image

After comparing the box and whisker diagram from the 2025 tournament with the frequency table from the 2024 tournament, Paul concludes that the distribution of goals is consistent between the two tournaments.

[ 3 ]

Question 1(c)

(c)

State two observations that support Paul's conclusion using values from the data to compare any two of:
range, symmetry, median, and interquartile range.

Paul plans to watch all the games from the 2024 tournament in a random order.
He will watch each game once.
For the first game he watches, he defines event F as:
"scoring either 0 goals or exactly 1 goal".

[ 3 ]

Question 1(d)

(d)

Write down the event(s) from the table that are equivalent to $F^{\prime}$. There may be more than one correct event.

Table
[ 2 ]

Question 1(e)

(e)

If exactly 1 goal was scored in the first game Paul watches, write down the probability that exactly 1 goal was scored in the second game he watches. Give your answer as a fraction.

[ 2 ]

Question 1(f)

(f)

Calculate the probability that 5 goals were scored in the first game that Paul watches and 0 goals were scored in the second game he watches.

[ 4 ]

Question 1

[Maximum number: 16]

A group of 1280 students were asked which electronic device they preferred. The results per age group are given in the following table.

Table

Question 1(a)

(a)

A student from the group is chosen at random. Calculate the probability that the student

[ 9 ]

Question 1(a)(i)

(i)

prefers a tablet.

Question 1(a)(ii)

(ii)

is 11-13 years old and prefers a mobile phone.

Question 1(a)(iii)

(iii)

prefers a laptop given that they are 17-18 years old.

Question 1(a)(iv)

(iv)

prefers a tablet or is 14-16 years old.

A $\chi^{2}$ test for independence was performed on the collected data at the 1 % significance level. The critical value for the test is 13.277.

[ 9 ]

Question 1(b)

(b)

State the null and alternative hypotheses.

[ 1 ]

Question 1(c)

(c)

Write down the number of degrees of freedom.

[ 1 ]

Question 1(d)

Question 1(d)(i)

(d)
(i)

Write down the $\chi^{2}$ test statistic.

Question 1(d)(ii)

(ii)

Write down the p-value.

Question 1(d)(iii)

(iii)

State the conclusion for the test in context. Give a reason for your answer.

[ 5 ]

Question 1

[Maximum number: 11]

Joel is a keen cyclist who keeps a record of his performance. The following table shows the time, in minutes, it takes him to ride one kilometre on hills with different gradients. The gradient of each hill is constant.

Table

Question 1(a)

Question 1(a)(i)

(a)
(i)

Find the equation of the regression line of T on G.

[ 4 ]

Question 1(a)(ii)

(ii)

Describe the correlation between T and G with reference to the value of r, the Pearson's product-moment correlation coefficient.

On Saturday, Joel intends to ride a hill with a gradient of 17 %.

[ 4 ]

Question 1(b)

(b)

Estimate the time it will take Joel to ride one kilometre up the hill.

This morning, Joel rode one kilometre up a hill, and it took 22 minutes.

[ 2 ]

Question 1(c)

(c)

Explain why it would be inappropriate to use the equation found in part (a) to estimate the gradient of this hill.

[ 1 ]

Question 1

[Maximum number: 25]

Xavie conducted a study to see if there is a relationship between the price of an apartment, y, and its distance, x, from the city centre of Melbourne.
They took a random sample of six typical apartments along a train line in the city. Xavie obtained the data shown in the following table.

Table

A plot of these data is seen in the following graph.

Question image

Question 1(a)

(a)

Write down the value of the Spearman's rank correlation coefficient, $r_{s}$.

[ 1 ]

Question 1(b)

Question 1(b)(i)

(b)
(i)

Find the Pearson's product-moment correlation coefficient, r.

[ 4 ]

Question 1(b)(ii)

(ii)

Use your value of r to state which two of the following would best describe the correlation between the variables.

Table

The relationship between the variables can be modelled by the regression equation y=a x+b.

[ 4 ]

Question 1(c)

Question 1(c)(i)

(c)
(i)

Write down the value of a.

[ 3 ]

Question 1(c)(ii)

(ii)

Write down the value of b.

Question 1(c)(iii)

(iii)

According to this model, state in context what the value of b represents.

[ 3 ]

Question 1(d)

(d)

Xavie uses the regression equation to estimate the price of a typical apartment located 19.6 km from the city centre.

[ 5 ]

Question 1(d)(i)

(i)

Find this estimated price.

Question 1(d)(ii)

(ii)

State two reasons that Xavie might use to justify the validity of this estimate.

To verify whether this relationship applies in a different direction from the city centre, Xavie considers two locations, A and B, both an equal distance from the city centre. They take a random sample of seven apartments from each location and record the prices (in millions of dollars) in the following tables.

Table
Table

Xavie conducts a t-test, at the 5 % level of significance, to see if the mean apartment price in location A is different to the mean apartment price in location B. They assume the population variances are the same.

For this test, Xavie takes the null hypothesis to be $\mu_{A}=\mu_{B}$.

[ 5 ]

Question 1(e)

(e)

Write down the alternative hypothesis.

[ 1 ]

Question 1(f)

(f)

Find the p-value for this test.

[ 2 ]

Question 1(g)

(g)

State the conclusion of the test. Justify your answer.

[ 2 ]

Question 1

[Maximum number: 6]

Eduardo believes that there is a linear relationship between the age of a male runner and the time it takes them to run 5000 metres.
To test this, he recorded the age, x years, and the time, t minutes, for eight males in a single 5000 m race. His results are presented in the following table and scatter diagram.

Table
Question image

Question 1(a)

(a)

For this data, find the value of the Pearson's product-moment correlation coefficient, r.

[ 2 ]

Question 1(b)

(b)

Eduardo looked in a sports science textbook. He found that the following information about r was appropriate for athletic performance.

Table

Comment on your answer to part (a), using the information that Eduardo found.

[ 1 ]

Question 1(c)

(c)

Write down the equation of the regression line of t on x, in the form t=a x+b.

A 57-year-old male also ran in the 5000 m race.

[ 1 ]

Question 1(d)

(d)

Use the equation of the regression line to estimate the time he took to complete the 5000 m race.

[ 2 ]

Question 1

[Maximum number: 7]

The prices, in dollars, of 10 different garden chairs are:

79, 139, 255, 99, 50, 209, 229, 193, 69, 49

Question 1(a)

(a)

Find the range of the prices of the 10 chairs.

[ 2 ]

Question 1(b)

(b)

Use your graphic display calculator to find

[ 3 ]

Question 1(b)(i)

(i)

the mean price of the chairs.

Question 1(b)(ii)

(ii)

the standard deviation of the price of the chairs.

In a sale, the price of each of the 10 garden chairs is reduced by $20.

[ 3 ]

Question 1(c)

(c)

Write down

[ 2 ]

Question 1(c)(i)

(i)

the new mean.

Question 1(c)(ii)

(ii)

the new standard deviation.

[ 2 ]
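The effect of the $20 reduction can be illustrated with a short sketch using the ten prices from the question. As an assumption, the standard deviation is computed as the population value (`pstdev`), matching the σx value a graphic display calculator reports.

```python
import statistics

# The 10 chair prices, in dollars, from the question.
prices = [79, 139, 255, 99, 50, 209, 229, 193, 69, 49]

price_range = max(prices) - min(prices)
mean_price = statistics.mean(prices)
sd_price = statistics.pstdev(prices)   # population SD (assumed convention)

# Every price drops by the same constant amount in the sale.
sale_prices = [p - 20 for p in prices]
new_mean = statistics.mean(sale_prices)
new_sd = statistics.pstdev(sale_prices)
```

Subtracting a constant from every value shifts the mean by that constant and leaves the spread unchanged, so `new_mean` is 20 less than `mean_price` while `new_sd` equals `sd_price`.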

Question 1

[Maximum number: 8]

The mean annual temperatures for Earth, recorded at fifty-year intervals, are shown in the table.

Table

Tami creates a linear model for this data by finding the equation of the straight line passing through the points with coordinates (1708,8.73) and (1958,9.45).
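Tami's two-point model can be sketched as follows, using only the two coordinates given above. The variable names and the helper `tami_model` are assumptions made for this example.

```python
# The two points Tami's line passes through (year, temperature).
x1, y1 = 1708, 8.73
x2, y2 = 1958, 9.45

# Gradient: change in temperature per year between the two points.
gradient = (y2 - y1) / (x2 - x1)

# Intercept from y = gradient * x + intercept at the first point.
intercept = y1 - gradient * x1

def tami_model(year):
    """Temperature predicted by the straight line through the two points."""
    return gradient * year + intercept
```

This line depends on only two of the recorded values, unlike a regression line, which uses every point in the table.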

Question 1(e)

Question 1(e)(i)

(a)
(i)

Find the equation of the regression line y on x.

[ 3 ]

Question 1(e)(ii)

(ii)

Find the value of r, the Pearson's product-moment correlation coefficient.

[ 3 ]

Question 1(f)

(b)

Use Thandizo's model to estimate the mean annual temperature in the year 2000.

Thandizo uses his regression line to predict the year when the mean annual temperature will first exceed $15^{\circ}\mathrm{C}$.

[ 2 ]