Photo credit: Stephen M. Scott

## Blog

### All entries categorized “statistics”

#### beckieball, or selecting on skill

Tuesday, Jan. 6th, 2015 2:35p.m.

Over the weekend, jeremy posted about beckieball, a "new sport sweeping the country." The purpose was to show how selecting people on certain characteristics changes the correlation between those characteristics among the people selected. This, as commenter Stuart Buck pointed out, is an example of Berkson's Paradox, and it also relates to jeremy's post about height and the NBA.

Although he left several other exercises to the reader, I thought I would do a simpler one: recreate the code that he used to make his example. I did this a) because it was a semi-useful way to shake the cobwebs from egg nog and yuletides, and b) because I think that it will come in handy teaching someday.
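The selection effect itself can be sketched in a few lines of Stata. This is not the code from jeremy's post or a recreation of it. It is a minimal illustration that assumes two independent, normally distributed traits and a selection rule that keeps the top 10% on their sum. The variable names, sample size, seed, and cutoff are all hypothetical.

```stata
* Hypothetical illustration of selection-induced correlation;
* not the code from the original posts
clear
set seed 20150106
set obs 10000

* Two traits that are independent in the full population (assumed names)
gen skill_a = rnormal()
gen skill_b = rnormal()

* Selection depends on the sum of the two traits: keep the top 10%
gen total = skill_a + skill_b
_pctile total, p(90)
gen selected = total > r(r1)

* Correlation in the full population (should be near zero)
corr skill_a skill_b

* Correlation among the selected only (should be negative)
corr skill_a skill_b if selected
```

Among the selected, someone who is low on one trait must be high on the other to clear the cutoff, which is what pushes the second correlation below zero.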

tags: Stata, statistics categories: Programming & Statistics

#### Calculating Simple Power Analyses

Monday, Oct. 18th, 2010 6:31p.m.

I am currently preparing a proposal for submission, and one piece of information that the agency suggests including is the power required to distinguish effects. This is obviously a perfectly reasonable piece of information to request; however, power calculations fall into that class of things that I know that I should know but don't. It is one of those topics that every statistics book will tell you is important, but that the book then either a) glosses over, or b) covers in such depth that it is impossible to follow what the authors are talking about. Additionally, complex sample designs make power calculations considerably harder.

In contrast to this traditional treatment, Andrew Gelman and Jennifer Hill's book, Data Analysis Using Regression and Multilevel/Hierarchical Models, provides a very clear description of simple power analyses, which -- thankfully -- is all that I really need for this project. To make sure that I don't forget, I record below how to find the required sample size, n, for varying levels of between-group effect differences, Δ, at 80% power. The formula is relatively easy (see pp. 437-447 for more info): n = (5.6σ/Δ)². Therefore, if I measure the effect in units of standard deviations (so that σ = 1 and Δ is stored in the variable `sd`), then I can compute the sample size `n` required for each effect size.

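``````
drop _all
* 41 evenly spaced effect sizes from 0 to 1, in standard-deviation units
range sd 0 1 41
* sample size needed at 80% power; n is missing where sd == 0
gen n = (5.6/sd)^2
``````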

I can then make a graph of the expected sample size required for a standard unit change using the command `twoway line n sd`; or, alternatively, just print a table of numbers using `list`.
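As a rough check on where the constant comes from, the sketch below rebuilds it from normal quantiles. As I understand the derivation, 2.8 is the sum of the z-values for a two-sided 5% test and for 80% power, and it is doubled because the standard error of a difference between two equal-sized groups is 2σ/√n, with n counting both groups together. The half-standard-deviation effect is an arbitrary example value, and the final line assumes Stata 13 or later, where the built-in `power` command is available.

```stata
* z for a two-sided 5% test plus z for 80% power: about 2.8
display invnormal(0.975) + invnormal(0.80)

* Doubled, because se(difference) = 2*sigma/sqrt(n): about 5.6
display 2*(invnormal(0.975) + invnormal(0.80))

* Sample size for an effect of half a standard deviation (example value)
display (5.6/0.5)^2

* Cross-check against the built-in calculation (assumes Stata 13+);
* it uses the t distribution, so expect a slightly larger total
power twomeans 0 0.5, sd(1) power(0.8)
```

The third line gives 125.44, so roughly 126 observations in total. The built-in command should land close to that rather than match it exactly.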

tags: research-design, statistics category: Programming

#### Miscellany

• The views presented here are solely and entirely my own; they do not represent those of my colleagues, employer, or any funding agencies that may support me.
• The writing on this blog is covered by a Creative Commons License (described here). Feel free to distribute or re-post with a link to the original content provided that it is freely available to others.