Monday, November 28, 2011

Standard Deviation on a Calculator

Hey y'all --

So in 5th period we looked at how to calculate standard deviation on a calculator. This video is really helpful and walks you through the process step by step.
Incidentally, you end up finding Std Dev, Variance, Q1, Med, Q3, Mean, Min, and the Max so this is REALLY helpful.


Enjoy!

Madison

Bad Surveying in The Fourth Grade


At the Friends School, where I attended preschool, elementary school, and middle school, there were annual science fairs. When I was in the 4th grade I did an experiment to find out if organic flowers smelled better than non-organic flowers. I bought some flowers that were not organic and then took some organic flowers that had been sitting on the windowsill. I blindfolded the people I surveyed, and let them smell each. My hypothesis was that the organic ones would smell better, and it was confirmed.

I remember surveying my dad first. After I gave him the non-organic ones, I introduced the organic flowers with a great deal of fanfare: “Now smell this one!” After that, my mom told me that if I introduced the flowers differently I might influence the answers of the people being surveyed. In other words, asking the question in a way that implicitly favored one of the answers introduced bias into the survey.

There were also several other problems with the survey, some of which I knew at the time and others I didn’t. One problem, which my mom also mentioned to me at the time (she said that if I included this in my final project my teachers would be impressed), was that the sample size was relatively small and, though she didn’t word it like this, that my results therefore might not be statistically significant.

Another problem with my survey is that there may have been confounding variables, since the organic flowers had been grown at home and I didn’t know much about the store-bought non-organic flowers.

Unit 3 Test Review Sheet Chapters 23, 5, and 6

Disclaimer:  This review sheet does not cover everything you should know.  It's just a helpful guide to some of the problems you might see.  To ensure complete preparation you should be reviewing your in-class and out-of-class notes, as well as your previous quizzes and tests.

Chapter 23

1. A report by MM Shaheen, a member of the Parliament of the People's Republic of Bangladesh, reported the population in 2002 to be approximately 110 million with a 1.8% annual growth rate. What is the anticipated population in 2006?

2. In the U.S. Department of Energy's (DOE) Energy Information Administration's (EIA) International Energy Outlook 2002, it was estimated that the United States had a 60-year supply of recoverable natural gas. Approximately how long will the supply last if the total demand for natural gas increases at an average rate of 1.8%?

3. Attorney Gianetti retired with $2 million in a non-interest-bearing savings account. The attorney figured that it would cost him $80,000 per year to live at his current standard of living. Assuming a constant 3.5% per year inflation rate, how long will his savings last?

4.  Name a few nonrenewable resources.

Chapter 5

5.  A Fast Food Company was interested in knowing whether their customers were satisfied with the overall service and cleanliness of the Company's franchises. In an effort to obtain this information, The Fast Food Company randomly selected 75 of the 325 customers from one of their 25 franchise stores to fill out a survey.  What is the sample in this situation?

6. On the final episode of the popular "Dancing with the Stars" show, viewers were asked to call in and vote for their favorite star. This is an example of what type of sampling?

7.  In an effort to determine why contract negotiations broke down, resulting in a devastating long-term strike, management formed a task force to randomly select and interview 50 of the 825 employees. The three-digit employee ID and Table 7.1 from your text were used to identify which of the employees were interviewed. Lines 106-109 of Table 7.1 are reproduced below. What would be the ID numbers of the first 15 employees selected?



8.  What do random samples seek to eliminate?

9.  A researcher administers a new migraine headache medication to a group of volunteers in order to observe whether the medication abated the intensity of headache. This is an example of what type of survey?

10.  Suppose 65% of all college students find studying for final exams a waste of time. The population proportion is p = 0.65. Suppose many different simple random samples of 3,000 college students were taken. What would be the mean of the sampling distribution?

11.  The CDC took a random sample of 530 people that lived near high voltage towers. Of these people, they found that 345 developed some form of cancer. Give a 95% confidence statement for the proportion p of all people who live near high voltage towers and develop cancer.

Chapter 6

Thanks to Marc K.

1. What are individuals? What are variables? How are they related?
2. What is distribution and how is it shown in a histogram?
3. What can be used to describe the overall pattern of a histogram?
4. What is an outlier and how does it differ from deviation?
5. How do you make a stemplot and how is it useful?
6. How do you find the mean of a set of data, and how does it differ from the median? Which might be a more accurate representation of the center of the data and why?
7. What are the 5 numbers of a 5 number summary?
8. How is a boxplot made? Why is it useful?
9. What do histograms show that boxplots do not?
10. What is the explanatory variable? What is the response variable?
11. What is a scatterplot? Why are these used?
12. How do you describe the overall pattern of a scatterplot?
13. What are outliers and how do they affect the line of best fit, median, mean, and quartiles?
14. What is a regression line? Why do outliers affect it?
15. What is correlation? What causes a lower correlation? A higher correlation? What is the highest possible correlation?
16. What is a least squares regression line?

Important Terms to Know Review:

Chapter 23

Nonrenewable resources
Renewable resources
Static reserve*
Exponential reserve
Population
Growth rate
Maximum sustainable yield **
Reproduction curve

Chapter 5
--Producing data--

Population
Sample
Simple random sample
Types of samples
-Bad samples
-Good samples
Margin of error
Experiments
Observational studies

Chapter 6

Histograms
5-number summary
Mean
Median
Correlation
Association
Box plot
Stem plot
Scatter plot
Standard deviation

Variance
Smoothing

Outliers and their impact on Mean, Median, and Regressions
Regression
Least Squares Line

Tuesday, November 15, 2011

Kenan Scribe - Box Plots

A box plot by definition is a graphical way of depicting numerical values through their five number summaries. So what does this mean? Let's say for example that I were to hand you 9 balls, each with a random number written on the sides. A box plot would be an easy way to show a bit about the numbers you have received in a neat graphical form.

Five number summaries: Every box plot is built from five points: the minimum, the first quartile, the median (or second quartile), the third quartile, and the maximum. After arranging the numbers you receive from smallest to largest you can then begin to decide which numbers are which.

The minimum is simply the smallest number in the data set.

The median is the number directly in the middle of the series. If there is an even number of numbers then you will not be able to select one single number in the middle. Instead, you take the middle two, add them together and divide the sum by 2. This average of the middle two will be the median of your data set.

The maximum is the largest number in your data set, the opposite of the minimum.

The two quartiles are slightly more tricky, but still very simple to find. For the first quartile, look at all the numbers to the left of the median. Find the median of this new data set and you will have your first quartile for all of your numbers. The third quartile works the same way on the other side. Look to the right of the median, find the median of these numbers and you will have your third quartile. But why do you go from the first directly to the third quartile, you ask? You want to know where the second quartile is? Well, if the data set is to be broken into quarters, there will be four parts of it. A quarter is a fourth. The second quartile will be the number directly in the middle, so we have already found it. The second quartile is the median!

REMEMBER! Box plots must always be drawn along a number line!

Let's look at a box plot, shall we?




These are three very simple horizontal box plots. As you can see, there appear to be lines coming from the box shape we had all assumed would be the graph. These are known as whiskers. At the far left end lies the minimum of the data, at the far right, the maximum. The box starts on the left at the first quartile. The box ends at the third quartile, where the whisker that leads to the maximum begins. The median is the line down the middle of the box plot.


Here is a data set of 7 random numbers I just thought of. I have put them in order for you already.

2, 4, 5, 9, 13, 16, 17.

What is the minimum?
What is Q1? (first quartile)
What is the median?
What is Q3? (third quartile)
What is the maximum?

Minimum: 2
Q1: 4
Median: 9
Q3: 16
Maximum: 17
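
For anyone who wants to check answers like these on a computer, here is a minimal Python sketch of the steps described above. The function names are made up for illustration, and it assumes the quartile rule from this post (for an odd number of values, the median itself is left out of both halves). Calculators and software sometimes use slightly different quartile rules, so their answers may not always match.

```python
def median(ordered):
    """Median of a list that is already sorted from smallest to largest."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: one number sits directly in the middle.
        return ordered[mid]
    # Even count: average the middle two numbers.
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]             # numbers to the left of the median
    upper = ordered[half + (n % 2):]   # numbers to the right of the median
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


print(five_number_summary([2, 4, 5, 9, 13, 16, 17]))  # (2, 4, 9, 16, 17)
```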

If you were to make this into a box plot, what would it look like?

Sadly, I can't draw a box plot in computer speak so I'm going to tell you what it would look like. The box plot would be set on a number line from 0 to 20, just because that would look nice and encompass all the numbers in our data set. The left whisker would begin at 2 and then continue until the side of the box began at 4. The box would end at 16. There would be a line through the box at 9 to denote the median. The right whisker would begin from the side of the box and go until 17. This is a fairly jacked up box plot but that's what happens when I make up a bunch of random numbers. Box plots are an easy and informative way to summarize an otherwise overwhelming set of numbers.






Madison Scribe Post 11/14


Histograms and Stemplots

Yesterday in class we started discussing Histograms and Stemplots.

Histograms

Histograms display the frequency of a variable.



The x-axis of a histogram displays the values of the variable.
Histograms use bins to group data on an x-axis.

In this case, our individuals (or objects that are being described in a data set) are the black cherry trees. This histogram is looking at the height of the cherry trees, otherwise known as a variable, or a characteristic of an individual.
Instead of having a bar graph, where each individual gets a bar like the graph below, histograms group the individuals into categories. For example, black cherry trees with heights of 61ft, 62ft and 64ft would be grouped in the "60-64.9ft" bin on the graph.

EXAMPLE OF A BAR GRAPH:

We use bins because we can fit more data onto a histogram that way. If we created a bar graph of the heights of cherry trees it would need an enormous x-axis to fit all of our data.

The y-axis of a histogram represents the frequency, or how many individuals fall into each bin.
So back to our cherry tree example, if you look at the graph, there were 10 cherry trees that were between 75 and 79.9 feet tall. It's pretty straightforward.
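
Here is a rough Python sketch of how bins work, in case it helps. The heights in the list are made up for illustration (they are not the real cherry tree data), and the function name and the 5-foot bin width are assumptions chosen to match the "60-64.9ft" style of bin described above.

```python
import math
from collections import Counter


def bin_heights(heights, width=5):
    """Count how many heights land in each bin, keyed by the bin's starting value."""
    counts = Counter(math.floor(h / width) * width for h in heights)
    return dict(sorted(counts.items()))


# Hypothetical heights in feet, only for illustration.
heights = [61, 62, 64, 75, 76.5, 79.9, 80]
frequencies = bin_heights(heights)

for start, count in frequencies.items():
    # Each bin covers 'start' up to just under 'start + width'.
    print(f"{start}-{start + 4.9:.1f}ft: {count}")
```

The counts are exactly what the y-axis of the histogram would show for each bin.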

We then looked at the shape of a histogram.

If the histogram has a symmetric "hill" (5th period whaddup) the data is awesome, and nothing seems to be wrong with it.










If the histogram has a longer tail of information to the right, then the data is skewed to the right.













If the histogram has a longer tail of information to the left, the data is skewed to the left.









and finally...Outliers are distinct measurements that are separated from the rest of the data.

I found a website that lets you practice creating a histogram, for those of you still confused. Enjoy!


Stemplots
- make data easily presentable
- make plotting decimals easier
- you can display the exact measurement of each individual

A stemplot has two sections called the stem and leaves instead of a y-axis and x-axis.


To read this graph you would read 3|9 as "39 points were scored in a game."
4|0 1 5 7 7 7 9 reads as 40, 41, 45, 47, 47, 47, 49.

When making a stemplot the stems must be arranged vertically by numerical order from smallest to largest and the leaves must be arranged horizontally in order from smallest to largest in each stem category.

Leaves can ONLY BE ONE DIGIT. So if you wanted to represent a number like 356 you would make your stem 35 and your leaf 6. It would end up looking like this: 35|6
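
If you want to see the stem and leaf rules in action, here is a small Python sketch. It assumes whole, non-negative numbers, and the function name is made up for illustration. It only prints stems that actually have leaves, which is a simplification.

```python
from collections import defaultdict


def stemplot(values):
    """Build stemplot rows like '4|0 1 5' from whole, non-negative numbers."""
    stems = defaultdict(list)
    for value in sorted(values):
        # The last digit is the leaf; everything before it is the stem.
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    return [f"{stem}|{' '.join(str(leaf) for leaf in stems[stem])}"
            for stem in sorted(stems)]


for row in stemplot([47, 39, 40, 47, 41, 45, 49, 47]):
    print(row)
# 3|9
# 4|0 1 5 7 7 7 9
```

Because the values are sorted first, the stems come out smallest to largest and so do the leaves within each stem.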

We also discussed finding the mean and median for the stemplots.

To find the mean (or average), add all of the individuals up and divide by the # of individuals.
We used the symbol x̄ (x-bar) to represent the mean. The Σ (sigma) means "the sum of"
x̄ = (1/n)Σx
or basically: mean = (x1+x2+x3...)/(n)

The position of the median is m = (n+1)/(2), or the total number of individuals plus one, divided by two.

When you're finding the median on a stemplot you take the total number of individuals on a stemplot (if we're using the above stemplot that's 12) and apply it to the formula getting us the number 6.5. That is not our median. To find our actual median, we have to find individuals 6 and 7 on our stemplot. So on our graph above individual six has a value of 47, and individual seven also has a value of 47. Now we take the average of those two numbers, which ends up also being 47, and now you have the actual median.
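
Here is a minimal Python sketch of both calculations. The first eight scores come from the stemplot rows quoted above; the last four (50, 52, 55, 58) are made up so that there are 12 individuals in total, and the function names are only for illustration.

```python
import math


def mean(values):
    """Add all the individuals up and divide by the number of individuals."""
    return sum(values) / len(values)


def median_by_position(values):
    """Find the median using the position rule (n + 1) / 2."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # counted from 1, e.g. 6.5 when n is 12
    # A position like 6.5 means: average individuals 6 and 7.
    lower = ordered[math.floor(position) - 1]
    upper = ordered[math.ceil(position) - 1]
    return (lower + upper) / 2


# First eight values from the stemplot above; the last four are hypothetical.
scores = [39, 40, 41, 45, 47, 47, 47, 49, 50, 52, 55, 58]
print(median_by_position(scores))  # 47.0
print(mean(scores))                # 47.5
```

When n is odd the position is a whole number, so the "two" individuals being averaged are the same one.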

It's a bit confusing, I know. But here's a link for extra clarification!


That's the gist of what we learned yesterday. It's a bit long, but hopefully it helps. Can't wait for Kenan's amazing scribe post....

Madison

P.S. I don't know why some of my words highlighted....sorry...!

Monday, November 14, 2011

Histogram Help

For those of you needing help on making histograms in Microsoft Excel I found a really handy website that explains a step by step process on how to make a histogram.

Click Here for the website!

Helpful tips:
You don't need to place your bins next to the values that they represent in your data. I find that confusing. You can place your bins anywhere in the spreadsheet.

When actually selecting your options for the histogram, ignore the output range. You don't need it. Nor do you need the "pareto" option nor the "cumulative percentage" option.

Make sure you check "Chart Output" or else you won't actually get a physical histogram.

Your histogram might pop up in a new page in Excel. I freaked out because I thought all of my data had been erased because of that. If it happens to you, just know it's still there.


Finally...
I'm the scribe for tonight but I accidentally left my notes at school. So I'll be getting that up tomorrow. But just to keep this going, tomorrow's scribe is Kenan!!! (YAY)

Happy histograming!

Histograms & Stem Plot Assignment

Do and share on googledocs with your name and the title.

Create a Histogram that explores information that interests you.

Create a Stem Plot that explores information that interests you.

Categorize the data, if it's too time consuming to explore it all.

Answer the following questions concerning both your histogram and stem plot.

What is the mean of the data?

What is the median of the data?

For each plot, state the shape of the distribution (skewed left, skewed right, symmetric).

Is there a statement you would like to make about your data?  What did you find to be true or common about the data that you have explored?

Thursday, November 10, 2011



Sir Ronald A. Fisher


Throughout Sir Ronald’s life he broke many new mathematical frontiers. He invented systematic mathematical theories and improved on the ones that were already in place. Fisher had a happy childhood in East Finchley, London, England, the youngest of several brothers and sisters. He avidly studied in school, constantly striving to gain more knowledge of the scientific and mathematical worlds. Fisher's poor eyesight shaped his mathematical abilities in ways that both helped and hindered him. Throughout school, because of his inability to see clearly, Fisher intensely studied math without the use of pen or paper. Fisher never practiced the discipline of writing out his steps or writing proofs, which would hinder his communication with other mathematicians in the future, but learning this way enabled him to view math and its relationship to the physical world in a different way than his peers.
Throughout his academic career Fisher astounded his teachers and classmates with his intelligence and innovation. Fisher was eager to join the army and head into WWI, but because of his poor eyesight he was not allowed to join and was forced to stay home, where he was able to focus on his studies. Unfortunately, Fisher had a heavy interest in eugenics, which was spurred by his interest in Mendelian theories of genetics. Fisher headed many clubs devoted to its study. The word has poor connotations, but the work of his that matters most for our class dealt with plant populations rather than people. He was interested in the randomness of the genetic make-ups and phenotypic natures of plants grown under different conditions and factors. Using agricultural studies, Ronald Fisher developed new techniques that won him the title of the “Father of Statistical Mathematics”. In relation to what we will learn in class, Ronald A. Fisher's invention of randomized testing techniques is his most important development.




Wednesday, November 9, 2011

Scribe Post 11/9/11 by Carly

Today in class, we went over sections 5.5 Estimation and 5.6 Randomized Comparative Experiments. We learned that experiments study the response to a stimulus, to see how one variable affects another when we change existing conditions.
The two different kinds of studies we learned about were observational studies and experiments. Observational studies observe individuals and measure variables of interest but do not attempt to influence the responses. The purpose of an observational study is to describe some group or situation. A sample survey is one example.
Experiments deliberately impose a treatment on individuals in order to observe their responses. The purpose of an experiment is to study whether the treatment causes a change in the response.

Types of Experiments:
An uncontrolled experiment is an experiment where different variables may impact the outcome. For example, suppose we compared two different groups of people, those who took online SAT classes and those who took in-class SAT classes, and concluded that the people who took the online classes did better on the SAT than those who took the in-class classes. This is an uncontrolled experiment because many factors could impact the outcome. For example, the people who took online SAT classes could have been older, more experienced people who may have already taken the SAT, and the people who took the in-class SAT classes could have been younger people in high school. This would be a biased experiment, and its conclusion would not be reliable.





A Randomized Comparative Experiment strives to get rid of the bias by randomly deciding which stimulus each group receives. For example, say we took all of the people who intended to take the SAT classes (online and in-class) and randomly mixed them up into the two classes, without regard to which class they would originally have taken. This would make the experiment more accurate, because the two groups would no longer differ in any systematic way.
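
As a rough illustration of the "mixing up" step, here is a short Python sketch. The student names and the function name are made up, and the optional seed is only there so the same shuffle can be repeated; a real experiment would simply let the shuffle be random.

```python
import random


def randomly_assign(people, seed=None):
    """Shuffle the people and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(people)   # copy so the original list is left alone
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical students who signed up for an SAT class.
students = ["Ana", "Ben", "Cam", "Dee", "Eli", "Fay", "Gus", "Hal"]
online, in_class = randomly_assign(students)
print("Online:", online)
print("In-class:", in_class)
```

Because chance decides who lands in which class, things like age or prior SAT experience should tend to balance out between the two groups.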

A control group in an experiment is the group that nothing is done to. The control is left alone so that the other groups may be compared to it. The control usually gives the "normal" result, which is what you are testing against. For example, say we were testing whether temperature has an effect on the breaking point of rubber bands. We took one rubber band and put it in the freezer, heated another rubber band, and left a third rubber band at room temperature. The room temperature rubber band would be the control in the experiment, because it is the one that received no treatment. This is also another example of an uncontrolled experiment, because different factors could have affected the outcome of this experiment.

The last experiment we talked about was a Double Blind Experiment. This is an experiment where neither the people being tested nor the person running the test knows which is the stimulus and which is the control. For example, say a double-blind experiment was performed at a barbecue, and Bob is trying to see how many people prefer Texas Pete over his own homemade hot sauce. He gets his friend Billy to randomly put the hot sauces in two different bowls, labeling them A and B. Bob then takes the bowls and asks people which they prefer, and records the results. At the end of the experiment, Billy tells Bob which hot sauce went in which bowl, and people ended up liking Texas Pete better. Poor Bob. This was a double blind experiment because neither the tasters nor Bob knew which hot sauce was his, so he could not persuade the people in any way.



A good explanation of a double blind experiment is in the video below which we watched today, called "Scientific Method: How Double Blind Clinical Trials Are Done"

Next scribe is Farlz :) (Madison)

Tuesday, November 1, 2011