Which method of assessing reliability is used to determine the reliability of a single test given on a single occasion?

What is Reliability?

By Dr. Saul McLeod, published 2013


The term reliability in psychological research refers to the consistency of a research study or measuring test.

For example, if a person weighs themselves several times during the course of a day, they would expect to see a similar reading each time. Scales that measured weight differently each time would be of little use.

The same analogy could be applied to a tape measure that measures inches differently each time it is used. It would not be considered reliable.

If findings from research are replicated consistently they are reliable. A correlation coefficient can be used to assess the degree of reliability. If a test is reliable it should show a high positive correlation.

Of course, it is unlikely the exact same results will be obtained each time as participants and situations vary, but a strong positive correlation between the results of the same test indicates reliability.
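As a rough illustration of the correlation check described above, here is a minimal sketch in Python. The scores are hypothetical and the function name `pearson_r` is an assumption made for this example. It computes the Pearson correlation coefficient between the results of two uses of the same test.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for five participants on two uses of the same test.
first_use = [12, 18, 25, 31, 40]
second_use = [14, 17, 27, 30, 41]

r = pearson_r(first_use, second_use)
print(f"r = {r:.2f}")
```

A value of r close to +1 would indicate that participants kept roughly the same relative standing on both occasions, even though their exact scores differ. The sketch does not handle the case where every score in a list is identical, since the correlation is then undefined.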

There are two types of reliability – internal and external reliability.

  • Internal reliability assesses the consistency of results across items within a test.
  • External reliability refers to the extent to which a measure varies from one use to another.

Assessing Reliability

Split-half method

The split-half method assesses the internal consistency of a test, such as a psychometric test or questionnaire. It measures the extent to which all parts of the test contribute equally to what is being measured.

This is done by comparing the results of one half of a test with the results from the other half. A test can be split in half in several ways, e.g. first half and second half, or by odd and even numbers. If the two halves of the test provide similar results this would suggest that the test has internal reliability.
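The odd-and-even split described above can be sketched as follows. This is a minimal example under stated assumptions: the response data are hypothetical, and the helper names `pearson_r` and `split_half_r` are made up for this illustration. Each participant's total on the odd-numbered items is correlated with their total on the even-numbered items.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_r(responses):
    """Correlate each participant's odd-item total with their even-item total."""
    odd_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)

# Hypothetical data: one row per participant, one column per item (scored 1-5).
responses = [
    [5, 4, 5, 5, 4, 5],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 4, 3, 3, 4],
    [1, 2, 1, 1, 2, 1],
    [4, 4, 3, 4, 5, 4],
]

print(f"split-half r = {split_half_r(responses):.2f}")
```

Splitting by first half and second half instead would only change the two slices, for example `row[:3]` and `row[3:]` for a six-item test.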

The reliability of a test could be improved through using this method. For example, any items on separate halves of a test which have a low correlation (e.g. r = .25) should either be removed or re-written.

The split-half method is a quick and easy way to establish reliability. However, it can only be effective with large questionnaires in which all questions measure the same construct. This means it would not be appropriate for tests which measure different constructs.

For example, the Minnesota Multiphasic Personality Inventory has subscales measuring different behaviors such as depression, schizophrenia, and social introversion. Therefore, the split-half method would not be an appropriate way to assess the reliability of this personality test.

Test-retest

The test-retest method assesses the external consistency of a test. Examples of appropriate tests include questionnaires and psychometric tests. It measures the stability of a test over time.

A typical assessment would involve giving participants the same test on two separate occasions. If the same or similar results are obtained, then external reliability is established. A disadvantage of the test-retest method is that it takes a long time for results to be obtained.

Beck et al. (1996) studied the responses of 26 outpatients at two separate therapy sessions one week apart. They found a correlation of .93, demonstrating high test-retest reliability of the depression inventory.

This is an example of why reliability in psychological research is necessary. Without reliable tests, some individuals may not be successfully diagnosed with disorders such as depression and consequently may not be given appropriate therapy.

The timing of the retest is important. If the interval between the two tests is too brief, participants may recall information from the first test, which could bias the results.

Alternatively, if the interval is too long, it is feasible that the participants could have changed in some important way, which could also bias the results.

Inter-rater reliability

Inter-rater reliability assesses the external consistency of a test. It refers to the degree to which different raters give consistent estimates of the same behavior. Inter-rater reliability can be used for interviews.

Note that it can also be called inter-observer reliability when referring to observational research. Here, researchers observe the same behavior independently (to avoid bias) and compare their data. If the data are similar, then the measure is reliable.

Where observer scores do not significantly correlate then reliability can be improved by:

  • Training observers in the observation techniques being used and making sure everyone agrees with them.
  • Ensuring behavior categories have been operationalized. This means that they have been objectively defined.

For example, if two researchers are observing the ‘aggressive behavior’ of children at a nursery, they would each have their own subjective opinion about what aggression comprises. In this scenario, it is unlikely that they would record aggressive behavior in the same way, and the data would be unreliable.

However, if they were to operationalize the behavior category of aggression this would be more objective and make it easier to identify when a specific behavior occurs.

For example, while “aggressive behavior” is subjective and not operationalized, “pushing” is objective and operationalized. Thus researchers could simply count how many times children push each other over a certain duration of time.
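To make the push-counting example concrete, here is a minimal sketch of how two observers' records might be compared. Everything in it is hypothetical: the children's names, the logs, and the helper name `pearson_r` are assumptions for illustration only. Each observer logs the name of a child every time that child pushes someone. The logs are then tallied per child and the two sets of counts are correlated.

```python
from collections import Counter
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical logs: one entry per push, naming the child who pushed.
observer_a_log = ["Ann", "Ben", "Ben", "Cal", "Ben", "Dee", "Ann", "Cal"]
observer_b_log = ["Ben", "Ann", "Ben", "Cal", "Ben", "Ann", "Cal", "Ben"]

children = ["Ann", "Ben", "Cal", "Dee", "Eve"]

# Tally pushes per child; a Counter gives 0 for children never logged.
tally_a = Counter(observer_a_log)
tally_b = Counter(observer_b_log)
counts_a = [tally_a[child] for child in children]
counts_b = [tally_b[child] for child in children]

inter_observer_r = pearson_r(counts_a, counts_b)
print(f"inter-observer r = {inter_observer_r:.2f}")
```

Because "pushing" is an operationalized category, the two tallies can be compared directly. A high positive correlation would suggest that the observers are recording the behavior consistently.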

How to reference this article:

McLeod, S. A. (2007). What is reliability? Simply Psychology. www.simplypsychology.org/reliability.html

APA Style References

Beck, A. T., Steer, R. A., & Brown, G. K. (1996). Manual for the Beck Depression Inventory. San Antonio, TX: The Psychological Corporation.

Hathaway, S. R., & McKinley, J. C. (1943). Manual for the Minnesota Multiphasic Personality Inventory. New York: Psychological Corporation.

© Simply Scholar Ltd - All rights reserved

Which methods are used for determining the reliability of a test?

Three important methods for estimating test reliability are (1) method of parallel forms, (2) test-retest method, (3) split-half method.

What is the most commonly used method of assessing reliability?

The most common way to measure parallel forms reliability is to produce a large set of questions to evaluate the same thing, then divide these randomly into two question sets. The same group of respondents answers both sets, and you calculate the correlation between the results.

What are 3 types of reliability assessments?

Reliability refers to the consistency of a measure. Psychologists consider three types of consistency: over time (test-retest reliability), across items (internal consistency), and across different researchers (inter-rater reliability).
