Statistics & Probability Essentials — Data, Averages, Chance & Distributions - Leejohnston - 11-13-2025

Statistics and probability help us understand data, make predictions, and measure uncertainty. These ideas appear everywhere, from science and business to medicine and machine learning.

This thread introduces the fundamentals in a simple, beginner-friendly way.

-----------------------------------------------------------------------

1. Types of Data

Qualitative (categorical):
• colours
• names
• types

Quantitative (numerical):
Separated into:
• Discrete data (countable values, usually whole numbers)
• Continuous data (any value in a range)

Examples:
• number of goals scored → discrete
• height, weight, temperature → continuous

-----------------------------------------------------------------------

2. Averages (Measures of Central Tendency)

Mean: Add all values, divide by the number of values.
Median: Middle value when the data is ordered.
Mode: Most frequent value.
Range: Difference between max and min (a measure of spread rather than an average, but usually taught alongside them).

Example:
Data: 4, 7, 9, 9, 12
• mean = 41 / 5 = 8.2
• median = 9
• mode = 9
• range = 12 − 4 = 8

-----------------------------------------------------------------------

3. Frequency Tables

Example:

| Score | Frequency |
|-------|-----------|
| 1     | 3         |
| 2     | 7         |
| 3     | 5         |

Total responses = 3 + 7 + 5 = 15

Mean from table:

Code:
(1×3 + 2×7 + 3×5) / 15 = 32 / 15 ≈ 2.13

-----------------------------------------------------------------------

4. Representing Data

• bar charts
• pie charts
• histograms
• line graphs
• scatter graphs

Scatter graphs: Used to show correlation between two variables.

Types of correlation:
• positive
• negative
• none

-----------------------------------------------------------------------

5. Probability Basics

Probability always lies between 0 and 1.
0 = impossible
1 = certain

Probability of an event (when all outcomes are equally likely):

Code:
P(event) = number of favourable outcomes / total outcomes

Example:
Rolling an even number on a fair six-sided die = 3/6 = 1/2

-----------------------------------------------------------------------

6. Mutually Exclusive Events

If events cannot happen at the same time:

Code:
P(A or B) = P(A) + P(B)

Example:
Rolling a 1 or a 6 = 1/6 + 1/6 = 2/6 = 1/3

-----------------------------------------------------------------------

7. Independent Events

Events that do NOT affect each other:

Code:
P(A and B) = P(A) × P(B)

Example:
Flipping two coins: P(Heads then Heads) = 1/2 × 1/2 = 1/4

-----------------------------------------------------------------------

8. Conditional Probability (Simple Version)

Probability of A happening given B has already happened:

Code:
P(A|B)

Example:
If 3 cards out of 5 are red, P(red) = 3/5 on the first draw. If a red card is drawn and not replaced, 2 of the remaining 4 cards are red, so P(second red | first red) = 2/4 = 1/2.

-----------------------------------------------------------------------

9. Distributions (Beginner Overview)

Uniform distribution:
All outcomes are equally likely (e.g., rolling a fair die).

Normal distribution (bell curve):
Real-life measurements often follow this approximately:
• height
• test scores
• measurement errors

Characteristics:
• symmetric
• mean = median = mode
• about 68% of data within 1 standard deviation (SD) of the mean
• about 95% within 2 SD

-----------------------------------------------------------------------

10. Standard Deviation (Spread of Data)

Measures how much the data varies from the mean.

Small SD → data close together
Large SD → data spread out

Simple example:
Data: 8, 9, 10
Mean = 9
SD is small because all values are close to the mean.

-----------------------------------------------------------------------

11. Common Mistakes

❌ Thinking probability can be over 1
✔ It must be between 0 and 1

❌ Forgetting to divide by total outcomes
✔ Always count the full set

❌ Confusing independent with mutually exclusive
✔ independent = do not affect each other
✔ mutually exclusive = cannot both happen

❌ Mixing histograms with bar charts
✔ histograms = continuous data
✔ bars touch

❌ Thinking the mean is always best
✔ median is better for skewed data

-----------------------------------------------------------------------

12. Practice Questions

1. Calculate the mean of: 5, 12, 7, 7, 9
2. A bag has 3 blue, 5 red, 2 green. P(red)?
3. Roll two dice. P(sum = 8)?
4. Data: 14, 18, 22, 25, 27. Find the median.
5. Identify the type of correlation for points trending upward on a scatter graph.
6. Two classes sit the same test and both have mean = 60. One class has SD = 2, the other SD = 10. Which class is more consistent?

-----------------------------------------------------------------------

Summary

This post covered:
• types of data
• averages
• frequency tables
• probability rules
• independence
• conditional probability
• distributions
• standard deviation
• practice questions

Statistics & probability help us understand uncertainty, patterns, and real-world behaviour, which makes them essential for science, computing, economics, and research.
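
-----------------------------------------------------------------------

Checking the worked examples in Python

If you want to check the arithmetic from sections 2, 3, 5, 6, 7, 8 and 10 yourself, the short Python sketch below reproduces each worked example using only the standard library. The variable names are my own choices for illustration, and section 10 never says which kind of SD it means, so the sketch assumes the population standard deviation (`pstdev`). Treat it as one possible way to verify the numbers, not the only one.

```python
from fractions import Fraction
from itertools import product
from statistics import mean, median, mode, pstdev

# Section 2: averages for the data 4, 7, 9, 9, 12
data = [4, 7, 9, 9, 12]
data_mean = mean(data)                 # 41 / 5
data_median = median(data)             # middle value of the ordered data
data_mode = mode(data)                 # most frequent value
data_range = max(data) - min(data)     # max minus min

# Section 3: mean from a frequency table (score -> frequency)
freq = {1: 3, 2: 7, 3: 5}
total = sum(freq.values())
table_mean = sum(score * count for score, count in freq.items()) / total

# Section 5: P(even) on a fair six-sided die, favourable / total
faces = range(1, 7)
p_even = Fraction(sum(1 for f in faces if f % 2 == 0), len(faces))

# Section 6: mutually exclusive events add (rolling a 1 or a 6)
p_one_or_six = Fraction(1, 6) + Fraction(1, 6)

# Section 7: independent events multiply (two fair coins)
p_two_heads = Fraction(1, 2) * Fraction(1, 2)
# Cross-check by listing every equally likely outcome: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))
p_two_heads_enum = Fraction(outcomes.count(("H", "H")), len(outcomes))

# Section 8: 3 red cards out of 5; one red is removed and not replaced
p_red_first = Fraction(3, 5)
p_red_second_given_red_first = Fraction(3 - 1, 5 - 1)

# Section 10: spread of 8, 9, 10 (assumption: population SD)
sd_small = pstdev([8, 9, 10])

print("mean:", data_mean, "median:", data_median,
      "mode:", data_mode, "range:", data_range)
print("table mean:", round(table_mean, 2))
print("P(even):", p_even, "P(1 or 6):", p_one_or_six)
print("P(HH):", p_two_heads, "P(red | red removed):",
      p_red_second_given_red_first)
print("SD of 8, 9, 10:", round(sd_small, 3))
```

Using `Fraction` keeps the probabilities as exact fractions (1/2, 1/3, 1/4) instead of rounded decimals, which makes them easier to compare with the hand calculations in the post.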