66 Cards in this Set
- Front
- Back
- 3rd side (hint)
STATISTICS
|
A technology that describes and measures aspects of nature from samples and quantifies the uncertainty of those measurements.
|
Week 1
|
|
ESTIMATION
|
Process of inferring an unknown quantity of a target population using sample data.
|
Week 1
|
|
PARAMETER
|
A quantity describing a population, whereas an estimate is a related quantity calculated from a sample.
|
Week 1
|
|
HYPOTHESIS TESTING
|
The process of determining how well a "null" hypothesis about a population fits a sample of data. ("Null" as in "no": it often takes a negative or skeptical view.)
|
Week 1
|
|
Example of Interest: Concept of Sampling Bias - Raining Cats, or "Feline High-Rise Syndrome" (injuries/stories fallen)
|
Low number of injuries at 2 stories, more at 5, and surprisingly back down to the 2-story level at 7 or more. Cats at the low end and the high end don't go to the vet, so there is sampling bias.
|
Week 1
|
|
POPULATION
|
The entire collection of individuals or units that a researcher is interested in studying.
|
Week 1
|
|
SAMPLE
|
Small set of individuals or units selected from population of interest for study.
|
Week 1
|
|
SAMPLING ERROR (one of two types of error) - Bull's-eye: shots grouped tightly = low error; dispersed = more error
|
Chance difference between an estimate and the population parameter being estimated. Affects precision rather than accuracy (accuracy is lost through bias).
(Inversely related to precision) |
Week 1
|
|
PRECISION OF AN ESTIMATE
|
The spread of estimates resulting from the sampling error
(Inversely related to sampling error) |
Week 1
|
|
BIAS (one of two types of error) - Bull's-eye: shots centered on target = accurate (unbiased); off center = inaccurate (biased)
|
Systematic discrepancy between estimates and the true population characteristic.
|
Week 1
|
|
4 STEPS TO CREATING RANDOM SAMPLE
|
1. Create a list of every unit in the population and give each unit a number between 1 and the total population size.
2. Decide on the sample size, n = ?
3. Use a random number generator to generate n integers between 1 and the total population size.
4. Sample those units whose numbers match the ones picked by the random number generator. |
Week 1
|
|
SAMPLE OF CONVENIENCE
DEFN: 3 EXAMPLES |
PG. 10
|
Week 1
|
|
GRAPHING CATEGORICAL DATA USING...
|
FREQUENCY TABLES AND BAR GRAPHS PG. 25
|
Week 1
|
|
GRAPHING NUMERICAL DATA USING...
|
HISTOGRAMS AND FREQUENCY TABLES
|
Week 1
|
|
5 CONCEPTS TO KEEP IN MIND TO DESCRIBE SHAPE OF HISTOGRAM
|
PG. 30
|
Week 1
|
|
MODE
|
PG 30
|
Week 1
|
|
INTERVAL WIDTH AND EXAMPLE
|
PG 31
|
Week 1
|
|
STURGES' RULE FOR # OF INTERVALS
|
1 + LN(N)/LN(2)
N IS THE NUMBER OF OBSERVATIONS AND LN IS THE NATURAL LOG (MOST ADD A FEW INTERVALS TO THIS NUMBER TO BE LESS CONSERVATIVE) |
Week 1
|
|
THE XTH PERCENTILE
|
Value below which X percent of the individuals lie; also called the X/100 quantile. For example, the 50th percentile is the 0.50 quantile.
|
Week 1
|
|
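A minimal sketch of the percentile card, assuming Python's standard `statistics.quantiles` with its "inclusive" interpolation method. Other textbooks and packages interpolate slightly differently, so treat the exact convention as an assumption. The helper name `percentile` and the beak-length numbers are made up for illustration.

```python
import statistics


def percentile(values, x):
    """Return the X-th percentile (the X/100 quantile) for an integer X in 1..99."""
    # With n=100, statistics.quantiles returns the 99 cut points
    # that separate the data into 100 equal-probability groups.
    cut_points = statistics.quantiles(values, n=100, method="inclusive")
    return cut_points[x - 1]


# Hypothetical measurements (not from the flashcards).
beak_lengths = [9.1, 9.8, 10.2, 10.4, 10.7, 11.0, 11.3, 11.9, 12.5]

# The 50th percentile is the 0.50 quantile, i.e. the median.
median = percentile(beak_lengths, 50)
print(median)
```

Under this convention the 50th percentile agrees with the ordinary median of the same data.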
RANDOM SAMPLE
(minimizes bias when done well, which makes it possible to measure sampling error) |
Each member of a population has an equal AND independent chance of being selected.
|
Week 1
|
|
INDEPENDENT
|
Adjective used to describe the sampling units. The selection of one unit/member of the population must NOT influence the selection of any other unit/member of the population.
|
Week 1
|
|
How to take a random sample?
5 STEPS |
1. LIST EVERY UNIT IN THE POPULATION AND NUMBER THEM 1 THROUGH THE TOTAL POPULATION SIZE
2. DECIDE N = ?
3. USE A RANDOM NUMBER GENERATOR TO GENERATE N RANDOM VALUES BETWEEN 1 AND THE TOTAL NUMBER OF UNITS IN THE POPULATION (OR GROUP WITHIN THE POPULATION)
4. SAMPLE THE UNITS WHOSE NUMBERS MATCH THE ONES THE COMPUTER GENERATED
5. |
Week 1
|
|
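The random-sampling steps can be sketched in Python roughly as follows. This is only an illustration: the function name `simple_random_sample`, the seed, and the population of cats are hypothetical, and drawing without replacement (no unit picked twice) is an assumption the cards do not state.

```python
import random


def simple_random_sample(units, n, seed=None):
    """Draw n units so that each has an equal and independent chance of selection."""
    rng = random.Random(seed)
    # Step 1: number every unit from 1 through the population size.
    numbered = {number: unit for number, unit in enumerate(units, start=1)}
    # Steps 2-3: with n decided, generate n distinct random integers
    # between 1 and the population size (assumption: no repeats).
    chosen_numbers = rng.sample(range(1, len(numbered) + 1), n)
    # Step 4: sample the units whose numbers were generated.
    return [numbered[number] for number in chosen_numbers]


# Hypothetical population of 100 units.
population = [f"cat_{i}" for i in range(1, 101)]
sample = simple_random_sample(population, n=10, seed=1)
print(sample)
```

Fixing the seed only makes the draw repeatable for checking; a real study would normally leave it unset.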
SAMPLE OF CONVENIENCE
|
Collection of individuals that are easily available to the researcher. Examples: injury rate of cats; cod fishery collapse (the population estimate, taken from the sea only, was an overestimate); Literary Digest poll (false projection of an election because the sample came from magazine lists, which left out the poor); samples that might end up being friends.
|
Week 1
|
|
Volunteer Bias
|
Systematic difference between the pool of volunteers and the population (polio vaccine example: more parents with kids without polio, so the infection rate was higher than expected).
|
Week 1
|
|
List several reasons for Volunteer Bias
|
1. healthier/more proactive 2. lower income if paid 3. sicker if the study has risk 4. telephone surveys attract older and unemployed people because they are home 5. more angry 6. less prudish about discussing sex
|
Week 1
|
|
2 Types of data and variables (Define)
|
1) Categorical: named characteristics of a population without magnitude on a numerical scale (can be scaled/grouped data with numbers). 2) Numerical: measurements are quantitative and have magnitude on a numerical scale.
|
Week 1
|
|
Categorical Data - 2 types
|
Nominal and Ordinal
|
Week 1
|
|
Numerical Data - 2 types
|
1) Discrete: comes in individual counts
2) Continuous: numerical data that can take on any real-number value within some range, with an infinite number of values possible. |
Week 1
|
|
Variables
|
Characteristics that differ among individuals
|
Week 1
|
|
Frequency Distribution
|
A distribution which describes the # of times each value of a variable occurs in a sample.
|
Week 1
|
|
Probability Distribution
|
A distribution of a variable in the whole population.
|
Week 1
|
|
Normal Distribution
|
A distribution used for a continuous variable (like beak length) which approximates the frequency distribution occurring in real life.
|
Week 1
|
|
Experimental Study
|
A study for which the researcher assigns treatment randomly to individuals
|
Week 1
|
|
Observational Study
|
A study for which nature assigns treatment or values of an explanatory variable (not the researcher!)
|
Week 1
|
|
Response Variable
|
Variable being predicted
|
Week 1
|
|
Explanatory Variable
|
Variable being used to predict the response
|
Week 1
|
|
Experimental Studies
|
Researcher assigns subjects randomly to different treatments or groups
|
Week 1
|
|
Observational Studies
|
Researcher does not assign subjects to treatments but observes individuals to understand the effect the treatment had on them.
|
Week 1
|
|
Relative Frequency Distribution
|
A distribution which describes the fraction of occurrences of each value of a variable.
|
Week 1
|
|
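As a rough illustration of the frequency and relative frequency distribution cards, here is a small Python sketch. The function names and the list of observations are hypothetical, chosen only to show the counting.

```python
from collections import Counter


def frequency_distribution(observations):
    """Number of times each value of the variable occurs in the sample."""
    return Counter(observations)


def relative_frequency_distribution(observations):
    """Fraction of occurrences of each value of the variable."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


# Hypothetical categorical observations (not from the flashcards).
species = ["finch", "finch", "wren", "finch", "robin", "wren", "finch", "heron"]

freq = frequency_distribution(species)
rel_freq = relative_frequency_distribution(species)
print(freq)
print(rel_freq)
```

The relative frequencies always sum to 1, which is what lets samples of different sizes be compared on one scale.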
Frequency Table
(example: Causes of Teenage Death) |
Text display of number of occurrences of each category in the data set.
|
Week 1
|
|
Bar Graph
(example: Causes of Teenage Death-Accidents main one) |
Graph which uses the height of rectangular bars to display the frequency distribution ( or relative frequency distribution) of a categorical variable.
|
Week 1
|
|
Rules for Bar Chart Data
|
Ordinal Categorical Data: present in natural order on the horizontal axis (e.g., severity: minimal, moder...)
Nominal: order by frequency of occurrence (descending...) |
Week 1
|
|
Two ways of representing frequency distributions
|
Histogram or Frequency Tables
|
Week 1
|
|
5 Descriptors of shape for histograms
|
1. Mode (highest peak)
2. Bimodal (frequency distribution with 2 peaks)
3. Symmetry (symmetric: the patterns to the right and left of the peak mirror each other)
4. Skew (asymmetry in the shape of a frequency distribution of a numerical variable; left or right)
5. Outliers (observations well outside the range of values of other observations in the data set) |
Week 1
|
|
Histogram Construction- 5 guidelines
|
1) Bar spacing: unlike a bar graph, there is no separation between adjacent bars (reinforces the numerical scale)
2) Cut-off value decision (does a value at a cut-off go in the lower or the higher bar?)
3) Sturges' rule
4) Use readable breakpoint numbers
5) Label n=X in the legend
|
Week 1: A) Displaying Data
1.Displaying Frequency Distributions |
|
Sturges' Rule of Thumb
|
Number of intervals = 1 + ln(n)/ln(2), where n = number of observations and ln = natural logarithm. Rounding up is traditional.
|
Week 1: A) Displaying Data
1.Displaying Frequency Distributions |
|
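The rule of thumb on the card can be checked with a few lines of Python. This is a sketch only; the function name `sturges_intervals` and the sample sizes are illustrative choices.

```python
import math


def sturges_intervals(n):
    """Number of histogram intervals suggested by Sturges' rule for n observations."""
    # 1 + ln(n)/ln(2), rounded up as is traditional.
    return math.ceil(1 + math.log(n) / math.log(2))


# Hypothetical sample sizes.
for n in (30, 100, 1000):
    print(n, sturges_intervals(n))
```

Because the count grows only with the logarithm of n, even large data sets get a modest number of intervals, which is why many people add a few more.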
Cumulative Frequency Distribution
|
Graph of all the quantiles of a numerical variable. Example: x-axis: species abundance; y-axis: cumulative relative frequency (google more examples).
+'s |
Week 1: Displaying Data
2.) Quantiles of a Frequency Distribution |
|
Cumulative Relative Frequency
|
Fraction of observations less than or equal to a given measurement (pg. 34)
|
Week 1: B) Displaying Data
2.) Quantiles of a Frequency Distribution |
|
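A short Python sketch of cumulative relative frequency as defined on the card. The function name and the abundance values are made up for illustration.

```python
def cumulative_relative_frequency(values, x):
    """Fraction of observations less than or equal to x."""
    return sum(1 for v in values if v <= x) / len(values)


# Hypothetical species-abundance measurements.
abundances = [1, 1, 2, 3, 3, 3, 5, 8, 13, 21]

# One point per distinct measurement: these are the points a
# cumulative frequency distribution graph would plot.
curve = [(x, cumulative_relative_frequency(abundances, x))
         for x in sorted(set(abundances))]
print(curve)
```

The curve never decreases and ends at 1, since every observation is less than or equal to the largest measurement.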
Contingency Tables
|
A frequency table for two or more categorical variables
(explanatory variable in columns and response variable (the predicted one) in rows) |
Week 1: Displaying Data
3) Associations bet/ Categorical Variables |
|
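To make the row/column layout concrete, here is a minimal Python sketch that builds a contingency table from (explanatory, response) pairs. The function name and all counts are hypothetical; only the treatment and malaria labels echo the example used elsewhere in this set.

```python
from collections import Counter


def contingency_table(pairs):
    """Count (explanatory, response) pairs: rows are response values, columns explanatory values."""
    counts = Counter(pairs)
    columns = sorted({explanatory for explanatory, _ in pairs})
    rows = sorted({response for _, response in pairs})
    return {row: {col: counts[(col, row)] for col in columns} for row in rows}


# Made-up counts, for illustration only.
observations = (
    [("treatment", "malaria")] * 2
    + [("treatment", "no malaria")] * 8
    + [("control", "malaria")] * 5
    + [("control", "no malaria")] * 5
)

table = contingency_table(observations)
for response, row in table.items():
    print(response, row)
```

Each cell holds the number of individuals with that combination of values, so the cells sum to the sample size.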
Grouped Bar Graph
|
Graph that uses the height of rectangular bars to display the frequency (or relative frequency) distribution of two or more categorical variables. Leave more space between groups of the explanatory variable (treatment/no treatment) than between the response-variable bars (malaria/no malaria) within a group.
|
Week 1: Displaying Data
3) Associations bet/ Categorical Variables |
|
Mosaic Plot
|
Similar to a grouped bar graph but stacked; shows only relative frequencies, not the actual counts. Note: 1) the width of each bar is proportional to the percent of n represented by the response variable 2) order the bars for ordinal data.
|
Week 1: Displaying Data
3) Associations bet/ Categorical Variables |
|
Compare grouped bar graph (GB), contingency table (CT), and mosaic plot (MP)
|
GB > CT: easier to compare bar height and area between groups (though less so when the variables have multiple categories).
|
Week 1: Displaying Data
3) Associations bet/ Categorical Variables |
|
Stacked Vertical Histograms
|
Compare histograms between groups on the same scale.
Experiment: high altitude > low oxygen > high hemoglobin (hemoglobin binds oxygen, so less oxygen means more hemoglobin). Result: only the Andes population, not the Ethiopian or Tibetan populations, showed increased hemoglobin, so no single physiological attribute universally compensates for high altitude. |
Week 1: Displaying Data
4)Comparing Numerical Variables bet/ Groups (only one variable need be numeric) 4A)Histograms |
|
Grouped Cumulative Frequency Distributions
|
Y Axis: Cumulative Relative Frequency X Axis: Measured Indicator (Response Variable?) pg. 40
|
Week 1: Displaying Data
4)Comparing Numerical Variables bet/ Groups (only one variable need be numeric) 4B)Comparing Cumulative Frequencies |
|
Scatter Plot (SP)
(shows associations, not magnitude or frequency) |
Graphical Display of two numerical variables in which each observation is represented as a point on a graph with two axes.
|
Week 1: Displaying Data
5) Displaying Relationships bet/ pair of numerical variables 5A)Scatter Plot |
|
Line Graph Described/Compared to SP
Famous Example: Cyclic Fluctuations in Lynx Numbers (Hudson Bay Company from 1752-1819) |
Dots connected by line segments to display trends in time or another ordered series. Different from a scatter plot in that only one y measurement appears for a particular x observation.
|
Week 1: Displaying Data
5) Displaying Relationships bet/ pair of numerical variables 5B)Line Graph |
|
Map
|
spatial equivalent of a line graph which displays a numerical response measurement at multiple locations on a surface
|
Week 1: Displaying Data
5) Displaying Relationships bet/ pair of numerical variables 5C)Maps |
|
5 Principles of Graphical Display
|
1. Show the data (Bees: curve and dot)
2. Represent magnitudes properly (Educational Spending bar graph: 50% visual but 20% actual)
3. Minimize clutter
4. Maximize ease of interpretation (color/shapes/labels)
5. Use clear graphical elements |
Week 1: Displaying Data
5) Principles of Effective Display |
|
Graph vs. Tables
|
Graphs show pattern/exceptions
Tables show more quantitative aspects of the data |
Week 1: Displaying Data
5) Principles of Effective Display |
|
Sample Mean
|
Average measurement of the sample
Sum of all the observations divided by number of observations |
Week 1: Describing Data
1. Arithmetic Mean & Standard Deviation |
|
Standard Deviation
|
X
|
Week 1: Describing Data
1. Arithmetic Mean & Standard Deviation |
|
Variance
|
X
|
Week 1: Describing Data
1. Arithmetic Mean & Standard Deviation |
|
Sum of Squares
|
x
|
Week 1: Describing Data
1. Arithmetic Mean & Standard Deviation |
|
Coefficient of Variation
|
X
|
Week 1: Describing Data
1. Arithmetic Mean & Standard Deviation |
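The backs of the last four cards give only a placeholder, so the following Python sketch uses the standard sample definitions as an assumption about what they intended: sum of squares as the summed squared deviations from the mean, variance as that sum divided by n - 1, standard deviation as its square root, and coefficient of variation as the standard deviation expressed as a percentage of the mean. Function names and data are hypothetical.

```python
import math


def sample_mean(ys):
    """Sum of all the observations divided by the number of observations."""
    return sum(ys) / len(ys)


def sum_of_squares(ys):
    """Sum of squared deviations from the sample mean."""
    mean = sample_mean(ys)
    return sum((y - mean) ** 2 for y in ys)


def sample_variance(ys):
    """Sum of squares divided by n - 1 (assumed sample definition)."""
    return sum_of_squares(ys) / (len(ys) - 1)


def sample_sd(ys):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(ys))


def coefficient_of_variation(ys):
    """Standard deviation as a percentage of the mean."""
    return 100 * sample_sd(ys) / sample_mean(ys)


# Made-up measurements.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_mean(data), sum_of_squares(data))
print(sample_variance(data), sample_sd(data), coefficient_of_variation(data))
```

Dividing by n - 1 rather than n is the usual choice when the data are a sample rather than the whole population.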