Statistics & Probability Essentials — Data, Averages, Chance & Distributions

Statistics and probability help us understand data, make predictions, and measure uncertainty. 
These ideas appear everywhere, from science and business to medicine and machine learning.

This thread introduces the fundamentals in a simple, beginner-friendly way.

-----------------------------------------------------------------------

1. Types of Data

Qualitative (categorical): 
• colours 
• names 
• types 

Quantitative (numerical): 
Separated into:

Discrete data (countable values, usually whole numbers) 
Continuous data (any value within a range)

Examples: 
• number of goals scored → discrete 
• height, weight, temperature → continuous 

-----------------------------------------------------------------------

2. Averages (Measures of Central Tendency)

Mean: 
Add all values, divide by number of values.

Median: 
Middle value when data is ordered.

Mode: 
Most frequent value.

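Range: 
Difference between max and min. Strictly, this measures spread rather than central tendency, but it is usually calculated alongside the averages.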

Example: 
Data: 4, 7, 9, 9, 12 
• mean = 41 / 5 = 8.2 
• median = 9 
• mode = 9 
• range = 12 − 4 = 8
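
If you want to check the example above on a computer, here is a minimal sketch using Python's built-in statistics module. The variable names are just illustrative choices.

```python
import statistics

data = [4, 7, 9, 9, 12]

mean = statistics.mean(data)         # sum of values / number of values
median = statistics.median(data)     # middle value of the ordered data
mode = statistics.mode(data)         # most frequent value
value_range = max(data) - min(data)  # largest value minus smallest value

print(mean, median, mode, value_range)  # 8.2 9 9 8
```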

-----------------------------------------------------------------------

3. Frequency Tables

Example:

| Score | Frequency |
|-------|-----------|
|  1  |    3    |
|  2  |    7    |
|  3  |    5    |

Total responses = 3 + 7 + 5 = 15

Mean from table:

Code:
(1×3 + 2×7 + 3×5) / 15 = 32 / 15 ≈ 2.13
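
The same calculation as a short Python sketch. Storing the table as a dictionary (score → frequency) is only one possible layout, chosen here for readability.

```python
# score -> frequency, copied from the table above
freq = {1: 3, 2: 7, 3: 5}

total = sum(freq.values())                                  # 3 + 7 + 5 = 15
weighted_sum = sum(score * n for score, n in freq.items())  # 3 + 14 + 15 = 32
mean = weighted_sum / total

print(round(mean, 2))  # 2.13
```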

-----------------------------------------------------------------------

4. Representing Data

• bar charts 
• pie charts 
• histograms 
• line graphs 
• scatter graphs 

Scatter graphs: 
Used to show correlation.

Types of correlation:
• positive 
• negative 
• none 
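
One common way to put a number on correlation is Pearson's coefficient r (not covered in this thread): values near +1 suggest positive correlation, near −1 negative, and near 0 no linear correlation. The sketch below computes it by hand; all the data values are made up purely for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical data, invented for this example
hours = [1, 2, 3, 4, 5]
rising = [52, 55, 61, 64, 70]     # points trend upward
falling = [70, 64, 61, 55, 52]    # points trend downward
scattered = [60, 52, 70, 52, 60]  # no overall trend

print(pearson_r(hours, rising))     # close to +1 -> positive
print(pearson_r(hours, falling))    # close to -1 -> negative
print(pearson_r(hours, scattered))  # about 0 -> none
```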

-----------------------------------------------------------------------

5. Probability Basics

Probability always lies between 0 and 1.

0 = impossible 
1 = certain

Probability of an event:

Code:
P(event) = number of favourable outcomes / total number of equally likely outcomes

Example:
Rolling an even number on a fair six-sided die = 3/6 = 1/2
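
A small Python sketch of the counting rule, assuming a fair six-sided die. Fraction keeps the answer exact instead of turning it into a decimal.

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]                     # all equally likely faces
favourable = [n for n in outcomes if n % 2 == 0]  # the even faces: 2, 4, 6

p_even = Fraction(len(favourable), len(outcomes))
print(p_even)  # 1/2
```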

-----------------------------------------------------------------------

6. Mutually Exclusive Events

If events cannot happen at the same time:

Code:
P(A or B) = P(A) + P(B)

Example:
Rolling a 1 or a 6 = 1/6 + 1/6 = 2/6 = 1/3
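
To see why the "cannot happen at the same time" condition matters, here is a rough sketch (again assuming a fair six-sided die) that compares the addition rule with a direct count. The helper prob is a made-up name for this example.

```python
from fractions import Fraction

FACES = [1, 2, 3, 4, 5, 6]  # assumption: a fair six-sided die

def prob(event):
    """Probability that a single roll satisfies `event`."""
    hits = [n for n in FACES if event(n)]
    return Fraction(len(hits), len(FACES))

# Mutually exclusive: one roll cannot be both a 1 and a 6.
p_1 = prob(lambda n: n == 1)
p_6 = prob(lambda n: n == 6)
p_1_or_6 = prob(lambda n: n in (1, 6))
print(p_1 + p_6, p_1_or_6)  # 1/3 1/3

# Not mutually exclusive: 4 and 6 are both even and greater than 3.
p_even = prob(lambda n: n % 2 == 0)
p_big = prob(lambda n: n > 3)
p_even_or_big = prob(lambda n: n % 2 == 0 or n > 3)
print(p_even + p_big, p_even_or_big)  # 1 2/3
```

In the second case simply adding the two probabilities overcounts, because the faces 4 and 6 are counted twice.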

-----------------------------------------------------------------------

7. Independent Events

Events that do NOT affect each other:

Code:
P(A and B) = P(A) × P(B)

Example:
Flipping two coins: 
P(Heads then Heads) = 1/2 × 1/2 = 1/4
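
The coin example can be checked by listing every outcome. This sketch assumes two fair coins and uses itertools.product to build the four equally likely results.

```python
from fractions import Fraction
from itertools import product

# All outcomes of two flips: HH, HT, TH, TT
flips = list(product("HT", repeat=2))

p_hh = Fraction(flips.count(("H", "H")), len(flips))
print(p_hh)  # 1/4

# Same answer as multiplying the two separate probabilities
print(Fraction(1, 2) * Fraction(1, 2))  # 1/4
```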

-----------------------------------------------------------------------

8. Conditional Probability (Simple Version)

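Probability of A happening given B has already happened:

Code:
P(A|B) = P(A and B) / P(B)

Example: 
A pack has 5 cards, 3 of them red. The chance that the first card drawn is red is 3/5. If that red card is not put back, 2 of the remaining 4 cards are red, so P(second red | first red) = 2/4 = 1/2.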
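
The card example can be worked through by listing every way of drawing two cards without putting the first one back. In this sketch the two non-red cards are assumed to be black, and the card labels are invented for illustration.

```python
from fractions import Fraction
from itertools import permutations

# 3 red cards as in the example; the 2 black cards are an assumption.
cards = ["R1", "R2", "R3", "B1", "B2"]

# Every ordered way to draw two cards without replacement (20 in total).
draws = list(permutations(cards, 2))

first_red = [d for d in draws if d[0].startswith("R")]
both_red = [d for d in first_red if d[1].startswith("R")]

p_first_red = Fraction(len(first_red), len(draws))
p_second_given_first = Fraction(len(both_red), len(first_red))
print(p_first_red, p_second_given_first)  # 3/5 1/2
```

So the chance of red drops from 3/5 to 1/2 once you know a red card has already been removed.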

-----------------------------------------------------------------------

9. Distributions (Beginner Overview)

Uniform distribution: 
All outcomes are equally likely (e.g., rolling a fair die).

Normal distribution (bell curve): 
Many real-life measurements approximately follow this:
• height 
• test scores 
• measurement errors 

Characteristics:
• symmetric 
• mean = median = mode 
• about 68% of data within 1 standard deviation of the mean 
• about 95% within 2 SD 
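
The 68% and 95% figures are rounded. A quick way to see the more precise values is Python's statistics.NormalDist (available from Python 3.8), sketched here for a standard normal curve with mean 0 and SD 1.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)  # standard normal: mean 0, SD 1

within_1_sd = z.cdf(1) - z.cdf(-1)  # area between -1 SD and +1 SD
within_2_sd = z.cdf(2) - z.cdf(-2)  # area between -2 SD and +2 SD

print(round(within_1_sd, 4), round(within_2_sd, 4))  # 0.6827 0.9545
```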

-----------------------------------------------------------------------

10. Standard Deviation (Spread of Data)

Measures how much data varies from the mean.

Small SD → data close together 
Large SD → data spread out 

Simple example:

Data: 8, 9, 10 
Mean = 9 
SD ≈ 0.82 (population SD), which is small because all values are close to the mean.
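
For anyone curious how the number is actually worked out, here is a step-by-step sketch. It assumes the population version of SD (divide by n); the sample version (divide by n − 1) would give 1.0 for this data instead.

```python
import math
import statistics

data = [8, 9, 10]

mean = sum(data) / len(data)                    # 9.0
squared_diffs = [(x - mean) ** 2 for x in data]  # [1.0, 0.0, 1.0]
variance = sum(squared_diffs) / len(data)       # population variance
sd = math.sqrt(variance)

print(round(sd, 2))                       # 0.82
print(round(statistics.pstdev(data), 2))  # 0.82, same result from the library
```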

-----------------------------------------------------------------------

11. Common Mistakes

❌ Thinking probability can be over 1 
✔ It must be between 0 and 1

❌ Forgetting to divide by total outcomes 
✔ Always count the full set

❌ Confusing independent with mutually exclusive 
✔ independent = do not affect each other 
✔ mutually exclusive = cannot both happen

❌ Mixing histograms with bar charts 
✔ histograms = continuous data, bars touch 
✔ bar charts = categorical or discrete data, bars have gaps

❌ Thinking the mean is always best 
✔ median is better for skewed data

-----------------------------------------------------------------------

12. Practice Questions

1. Calculate the mean of: 5, 12, 7, 7, 9 
2. A bag has 3 blue, 5 red, 2 green. P(red)? 
3. Roll two dice. P(sum = 8)? 
4. Data: 14, 18, 22, 25, 27. Find median. 
5. Identify the type of correlation for points trending upward on a scatter graph. 
6. Two classes both have a mean test score of 60. Class A has SD = 2 and Class B has SD = 10. Which class is more consistent?

-----------------------------------------------------------------------

Summary

This post covered:
• types of data 
• averages 
• frequency tables 
• charts and correlation 
• probability rules 
• independence 
• conditional probability 
• distributions 
• standard deviation 
• practice questions 

Statistics & probability help us understand uncertainty, patterns, and real-world behaviour — essential for science, computing, economics, and research.