PSYCHOMETRICS: What is the difference between disproportionate and proportionate stratified random sampling?

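In stratified random sampling, you divide the population into sub-populations, or strata, and draw a random sample from each stratum. Disproportionate stratified random sampling means the sampling fraction differs across strata; a common version takes an equal-sized random sample from each stratum, regardless of how big the stratum is. Proportionate stratified random sampling means the size of each stratum's sample is proportionate to the size of that stratum in the population. If you are very concerned about giving a voice to everyone, including small groups, you should use disproportionate stratified sampling. However, if you would like to make accurate predictions about the population as a whole, it is best to use proportionate stratified sampling, because the sample mirrors the population without any reweighting.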
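To make the two allocation rules concrete, here is a minimal Python sketch. The strata names, their sizes, and the function names are all made up for illustration, and the disproportionate version shown is only the equal-size case described above.

```python
import random

def proportionate_allocation(strata_sizes, total_sample):
    """Each stratum's sample size matches its share of the population."""
    population = sum(strata_sizes.values())
    return {name: round(total_sample * size / population)
            for name, size in strata_sizes.items()}

def equal_allocation(strata_sizes, total_sample):
    """Disproportionate (equal-size) version: same sample size per stratum."""
    per_stratum = total_sample // len(strata_sizes)
    return {name: per_stratum for name in strata_sizes}

def draw_stratified_sample(strata_members, allocation, seed=0):
    """Draw a simple random sample of the allocated size inside each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, allocation[name])
            for name, members in strata_members.items()}

# Hypothetical population of 1,000 people in three strata.
strata_sizes = {"freshmen": 500, "sophomores": 300, "seniors": 200}
strata_members = {name: list(range(size)) for name, size in strata_sizes.items()}

proportionate = proportionate_allocation(strata_sizes, total_sample=100)
equal = equal_allocation(strata_sizes, total_sample=90)
sample = draw_stratified_sample(strata_members, proportionate)
```

With the equal-size rule, the smallest stratum gets as many respondents as the largest one, which is why it over-represents small groups relative to their share of the population.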

PSYCHOMETRICS: What is a Guttman scale?

A Guttman scale is a scale in which the items are ordered from mild to extreme, so that agreeing with an extreme statement indicates agreement with the milder, related statements. If a respondent agrees with an extreme statement, such as “I love everything about pencils”, it can be assumed that the respondent would also agree with statements such as “I like pencil lead”, “I like pencil erasers”, etc. (weird example, but you get the gist). Because of this assumption, the rest of a respondent's answers can be inferred from the most extreme statement they agree with, so more items can be answered in less time.
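The cumulative idea can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes the items are already ordered from mildest to most extreme.

```python
def is_guttman_consistent(responses):
    """responses: list of True/False answers, ordered mildest -> most extreme.

    A pattern fits a perfect Guttman scale if the respondent never agrees
    with a more extreme item after disagreeing with a milder one.
    """
    seen_disagree = False
    for agrees in responses:
        if agrees and seen_disagree:
            return False
        if not agrees:
            seen_disagree = True
    return True

def pattern_from_score(n_agreed, n_items):
    """On a perfect Guttman scale, the number of items agreed with
    determines the whole response pattern."""
    return [True] * n_agreed + [False] * (n_items - n_agreed)

consistent = is_guttman_consistent([True, True, False, False])
inconsistent = is_guttman_consistent([True, False, True, False])
inferred = pattern_from_score(2, 4)
```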

PSYCHOMETRICS: What are the assumptions of classical test theory? (Hint: there are four)

The assumptions of Classical Test Theory are that 1. measurement errors are random, 2. the mean error of measurement is zero, 3. true scores and errors are uncorrelated, and 4. errors on different tests are uncorrelated. Some of these assumptions are problematic, however. For example, the assumption that true scores and errors are uncorrelated may not hold, because we are probably better at measuring scores near the average than extreme outliers, which would make error larger for extreme true scores.

PSYCHOMETRICS: In the equation X = T + e, what do the various components mean? Explain your answer; don’t just give me the name of the term.

This is true score theory. X represents the observed score, T represents the true score, and e represents the measurement error. The equation means that the observed score is the sum of the true score and the measurement error. However, we can never truly know what a person’s “true” score is; it is theoretical. It could only be found by having someone take a test an infinite number of times and then averaging those scores. Measurement error is anything other than your true score that contributes to your observed score. We assume these errors are random and normally distributed.
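A quick simulation shows why averaging many administrations recovers T. This is a sketch under stated assumptions: the true score of 50, the error standard deviation of 5, and the function name are invented for illustration, and errors are drawn as normal with mean zero, as assumed above.

```python
import random

def simulate_observed_scores(true_score, error_sd, n_administrations, seed=42):
    """Generate observed scores X = T + e with e ~ Normal(0, error_sd)."""
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, error_sd)
            for _ in range(n_administrations)]

# Hypothetical person with T = 50 taking the test many times.
few_scores = simulate_observed_scores(50, 5, n_administrations=5)
many_scores = simulate_observed_scores(50, 5, n_administrations=100_000)

mean_of_few = sum(few_scores) / len(few_scores)
mean_of_many = sum(many_scores) / len(many_scores)
```

Any single observed score can miss T by several points, but because the errors average out to zero, `mean_of_many` lands very close to 50.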

PSYCHOMETRICS: When and why do we use the Spearman-Brown correction?

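We use the Spearman-Brown correction when we use the split-half method to check for internal consistency. Because we divide the test into two halves and correlate the scores on the two halves, the correlation we get is the reliability of a test only half as long as the real one. Shorter tests are less reliable, so this half-test reliability underestimates the full-test reliability. The Spearman-Brown formula corrects for this by stepping the half-test reliability up to the estimated reliability of the full-length test: corrected reliability = 2r / (1 + r), where r is the correlation between the two halves.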
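As a worked example, here is the correction in Python. The function name and the half-test correlation of 0.60 are made up for illustration; the `length_factor` parameter uses the general Spearman-Brown prophecy formula, which reduces to the split-half case when the factor is 2.

```python
def spearman_brown(half_test_r, length_factor=2):
    """Estimate reliability of a test `length_factor` times as long.

    With length_factor=2 this is the split-half correction: 2r / (1 + r).
    """
    return (length_factor * half_test_r) / (1 + (length_factor - 1) * half_test_r)

# Hypothetical correlation between the two halves of a test.
half_r = 0.60
full_r = spearman_brown(half_r)
```

Here the corrected value works out to 1.2 / 1.6 = 0.75, which is higher than the half-test correlation of 0.60.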

PSYCHOMETRICS: Explain item difficulty and item discrimination in classical test theory.

Item difficulty describes how hard an item is, and it is measured by the proportion of respondents who get the item correct (so a higher value means an easier item). You want to pay attention to item difficulty in order to make sure you still have item discrimination. Item discrimination describes how well an item differentiates between respondents who scored high on the overall test and those who scored low. Therefore, if an item is too difficult and everyone is getting it wrong, you have low discrimination and you cannot differentiate between respondents. Likewise, if an item is too easy and everyone is getting it right, you still get low discrimination.
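Here is a small Python sketch of both statistics. The six respondents, their total scores, and the three items are invented, and the discrimination measure used is the simple upper-minus-lower group difference, which is only one of several ways to measure discrimination.

```python
def item_difficulty(item_responses):
    """Proportion of respondents who got the item correct (1 = correct)."""
    return sum(item_responses) / len(item_responses)

def upper_lower_discrimination(item_responses, total_scores, group_size):
    """Proportion correct in the top scorers minus the bottom scorers."""
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    low_group = order[:group_size]
    high_group = order[-group_size:]
    p_high = sum(item_responses[i] for i in high_group) / group_size
    p_low = sum(item_responses[i] for i in low_group) / group_size
    return p_high - p_low

# Hypothetical data: six respondents, listed with their total test scores.
total_scores = [10, 9, 8, 4, 3, 2]
moderate_item = [1, 1, 1, 0, 0, 0]   # only the high scorers get it right
too_easy_item = [1, 1, 1, 1, 1, 1]   # everyone gets it right
too_hard_item = [0, 0, 0, 0, 0, 0]   # everyone gets it wrong

results = {
    name: (item_difficulty(item),
           upper_lower_discrimination(item, total_scores, group_size=3))
    for name, item in [("moderate", moderate_item),
                       ("too_easy", too_easy_item),
                       ("too_hard", too_hard_item)]
}
```

The moderate item separates the groups perfectly, while the too-easy and too-hard items both come out with a discrimination of zero, even though their difficulty values are at opposite extremes.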

PSYCHOMETRICS: Compare and contrast predictive and concurrent validity.

Concurrent and predictive validity are both types of criterion-related validity.
Concurrent validity measures whether a score on some particular test is related to a criterion measure taken at the same time. For example: current employees take a test and then those scores are correlated with a performance measure at the same point in time.
Predictive validity measures whether a score on some test is related to a criterion measure taken in the future. For example: a graduate student’s GRE score predicts that the student will do well in the future, and later performance is measured to see if he/she actually did well.

PSYCHOMETRICS: Explain what is meant by incremental validity.

Incremental validity means that adding a predictor explains more about the criterion measure than the current predictors do alone, filling in gaps that the current predictors are missing. Basically, each predictor that you add can theoretically increase your validity, but only to the extent that it captures something about the criterion that the existing predictors do not; a predictor that is redundant with what you already have adds little. For example, if I am measuring potential success in graduate school with just the GRE, my predictions might be “okay”. However, if I use GRE, GPA, personal statement, and letters of recommendation, I may be increasing validity.
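One common way to put a number on this is the gain in R² when a second predictor is added. The sketch below uses the standard two-predictor formula for R² from correlations; every correlation value in it is made up for illustration and is not real GRE or GPA data.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 for a criterion regressed on two predictors, from correlations.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12: correlation between the two predictors (must not be +1 or -1).
    """
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

def incremental_validity(r_y1, r_y2, r_12):
    """Gain in R^2 from adding predictor 2 on top of predictor 1."""
    return r_squared_two_predictors(r_y1, r_y2, r_12) - r_y1**2

# Hypothetical: predictor 2 overlaps only a little with predictor 1.
gain_useful = incremental_validity(r_y1=0.30, r_y2=0.40, r_12=0.20)

# Hypothetical: predictor 2 relates to the criterion only through its
# overlap with predictor 1 (r_y2 = r_y1 * r_12), so it is redundant.
gain_redundant = incremental_validity(r_y1=0.30, r_y2=0.30 * 0.75, r_12=0.75)
```

In the first case the second predictor adds roughly 0.12 to R²; in the second case the gain is essentially zero, even though that predictor does correlate with the criterion on its own.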

PSYCHOMETRICS: Explain what the adjusted true score estimate is and how and why it will differ from the observed score? What three factors affect the discrepancy between the observed score and the adjusted true score estimate? Explain why/how those factors affect that discrepancy.

The adjusted true score estimate is an estimate of your true score that pulls your observed score toward the group mean to account for measurement error.
So you have X1, which is your observed test score. Then you have information on the mean and the reliability of the test. Using these three pieces of information, you can find your adjusted estimate of your true score. The estimated true score = mean + reliability × (X1 − mean). For example, say I received a 20 on the test. The mean is 16 and the reliability is 0.8. Therefore, my adjusted true score estimate is 16 + 0.8 × (20 − 16), which comes out to be 19.2.

The three factors that affect the discrepancy between the observed score and the adjusted true score estimate are the test reliability, the size of the difference between the observed score and the mean, and the direction of the difference between the observed score and the mean.

As test reliability decreases, the discrepancy between the observed score and the estimate will increase. As the size of the difference between the observed score and the mean increases (that is, as the score becomes more extreme), the discrepancy between the observed score and the estimate will also increase. Finally, the direction of the difference between the observed score and the mean also affects the discrepancy, because scores tend to regress toward the mean. Therefore, if an observed score is above the mean, the estimate will be lower than the observed score. If the observed score is below the mean, the estimate will be higher than the observed score.
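The three factors can be checked with a short Python sketch built on the formula given earlier (estimate = mean + reliability × (observed − mean)). The first case reuses the example of a score of 20, a mean of 16, and a reliability of 0.8; the other input values and the function names are made up for comparison.

```python
def adjusted_true_score(observed, mean, reliability):
    """Estimated true score = mean + reliability * (observed - mean)."""
    return mean + reliability * (observed - mean)

def discrepancy(observed, mean, reliability):
    """Observed score minus the adjusted true score estimate.

    Algebraically this equals (1 - reliability) * (observed - mean).
    """
    return observed - adjusted_true_score(observed, mean, reliability)

baseline = discrepancy(observed=20, mean=16, reliability=0.8)          # example from the text
lower_reliability = discrepancy(observed=20, mean=16, reliability=0.5)
more_extreme = discrepancy(observed=24, mean=16, reliability=0.8)
below_mean = discrepancy(observed=12, mean=16, reliability=0.8)
```

Lowering the reliability or moving the score further from the mean both make the discrepancy bigger, and a score below the mean flips its sign, meaning the estimate ends up above the observed score.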


Item 1 has an item difficulty level of 0.43, which indicates that 43% of respondents got the item correct. This difficulty level is good, because the ideal level of difficulty is around 0.5. This item has a “moderate” difficulty level.
However, the item discrimination level of -.05 is not only low but also negative, which is a bad thing. More “low” performers, or people who scored low on the overall test, got this item correct (14 people) than “high” performers, or people who scored well on the overall test (9 people). Clearly, there is something wrong with this item and it should be removed. It could be that there is something confusing or misleading about the item.

Item 2 has an item difficulty level of 0.61, which indicates that 61% of respondents got the item correct. This can be considered a “somewhat easy” difficulty level. More people are getting the item correct than wrong, which can be acceptable depending on the purpose of the test. For a Psych 101 test, this is fine.
The item discrimination of 0.17 is okay. Very few (2 out of 27) high performers got this item wrong, whereas most of the low performers (19 out of 27) got this item wrong. This shows that the item does indeed help to discriminate between high and low performers on the exam and therefore should remain on the exam. However, two of the alternatives (B and D) were not selected by anyone out of the 54 respondents. This item should still be examined to see if these two alternatives can be edited so that they seem more plausible and actually work as distractors.
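The difficulty values for both items can be double-checked from the counts quoted above. This sketch assumes the 27 high performers and 27 low performers (54 respondents in total) mentioned for Item 2 also apply to Item 1, which the notes do not state outright. It does not try to reproduce the reported discrimination values (-.05 and 0.17), since the notes do not say how those were calculated; it only checks the direction of the high-versus-low difference.

```python
N_HIGH = 27   # high performers (stated for Item 2, assumed for Item 1)
N_LOW = 27    # low performers (stated for Item 2, assumed for Item 1)

def difficulty_from_counts(high_correct, low_correct):
    """Proportion of all respondents who got the item correct."""
    return (high_correct + low_correct) / (N_HIGH + N_LOW)

# Item 1: 9 high performers and 14 low performers answered correctly.
item1_high_correct, item1_low_correct = 9, 14
# Item 2: 2 of 27 high and 19 of 27 low performers answered incorrectly.
item2_high_correct, item2_low_correct = N_HIGH - 2, N_LOW - 19

item1_difficulty = round(difficulty_from_counts(item1_high_correct, item1_low_correct), 2)
item2_difficulty = round(difficulty_from_counts(item2_high_correct, item2_low_correct), 2)

# Sign of the high-minus-low difference in number correct.
item1_favors_high = item1_high_correct > item1_low_correct
item2_favors_high = item2_high_correct > item2_low_correct
```

Under those assumptions the counts give 23/54 and 33/54, which round to the 0.43 and 0.61 reported, and only Item 2 favors the high performers.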

