### All entries categorized “statistics”

Tuesday, Jan. 6th, 2015 2:35p.m.

Over the weekend, Jeremy posted about beckieball, a "new sport sweeping the country." The purpose was to show how selecting on characteristics changes the correlation between those characteristics among the people selected. This, as commenter Stuart Buck pointed out, is an example of Berkson's Paradox, though it also relates to Jeremy's post about height and the NBA.

Although he left several other exercises to the reader, I thought I would do a simpler one: recreate the code that he used to make his example. I did this a) because it was a semi-useful way to shake the cobwebs from egg nog and yuletides, and b) because I think that it will come in handy teaching someday.

«read more»

tags: Stata, statistics

categories: Programming & Statistics

Monday, Oct. 18th, 2010 6:31p.m.

I am currently preparing a proposal for submission, and one piece of information that the agency suggests including is the power required to distinguish effects. This is obviously a perfectly reasonable piece of information to request; however, power calculations fall into that class of things that I know I should know but don't. It is one of those topics that every statistics book will tell you is important, but that the book then either a) glosses over, or b) buries in such deep background that it is impossible to follow what the authors are talking about. Additionally, power calculations are made much harder by the fact that sample designs can become very complicated.

In contrast to this traditional treatment, Andrew Gelman and Jennifer Hill's book, Data Analysis Using Regression and Multilevel/Hierarchical Models, provides a very clear description of simple power analyses, which -- thankfully -- is all that I really need for this project. To make sure that I don't forget, I record below how to find the required sample size, *n*, for varying levels of between-group effect differences, Δ, at 80% power. The formula is relatively easy (see pp. 437-447 for more info): $n = (5.6\sigma/\Delta)^{2}$. Therefore, if I measure change in units of standard deviations, `sd` (so that σ = 1 and Δ = `sd`), then I can estimate the sample size `n` for each unit of change.
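For my own future reference, here is where the 5.6 comes from, as I understand the argument in the book. This sketch assumes a comparison of means between two equal-sized groups (*n*/2 each) with a common within-group standard deviation σ, a 5% significance level, and 80% power. Under those assumptions the true effect needs to sit about 2.8 standard errors from zero (1.96 + 0.84), and the standard error of the difference in means is 2σ/√*n*:

$$
\Delta = 2.8 \cdot \mathrm{se} = 2.8 \cdot \frac{2\sigma}{\sqrt{n}}
\quad\Longrightarrow\quad
n = \left(\frac{5.6\,\sigma}{\Delta}\right)^{2}
$$

Note that under this setup *n* is the total sample size across both groups, not the size of each group.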

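```
* Start from an empty dataset
drop _all
* 41 evenly spaced effect sizes from 0 to 1 SD (steps of 0.025)
range sd 0 1 41
* Required n at 80% power; n is missing for the first row, where sd == 0
gen n = (5.6/sd)^2
```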

I can then make a graph of the expected sample size required for a standard unit change using the command `twoway line n sd`; or, alternatively, just print a table of numbers using `list`.
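As a quick sanity check on the formula, here is a small sketch that tabulates the required sample size for a handful of effect sizes instead of the full grid. The particular values of `sd` and the rounding up with `ceil()` are my own illustrative choices, not something taken from the book.

```stata
* Sketch: required n at 80% power for a few effect sizes (in SD units)
* The effect sizes below are illustrative choices only
clear
input sd
0.1
0.2
0.5
1
end
* Round up, since a sample size has to be a whole number
gen n = ceil((5.6/sd)^2)
list, noobs
```

The practical lesson is how quickly the requirement grows as the effect shrinks: a half-standard-deviation effect needs roughly 126 observations, while a tenth of a standard deviation needs more than 3,000.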

tags: research-design, statistics

category: Programming