Monday, 04 May 2009

RELIABILITY AND VALIDITY IN MEASUREMENT

Ahmad Rohani HM.

(Lecturer, FAI Unissula Semarang)

RELIABILITY

Reliability is the consistency or stability of empirical indicators from one measurement to the next. A reliable instrument yields the same results under repeated measurement. Put differently, reliability is the extent to which a measurement procedure produces the same results across repeated trials.

The reliability coefficient can be understood through the following equation:

X = T + E

X is the observed score,

T is the true score,

E is the measurement error.

An instrument can be called reliable if it accurately reproduces the true score. In other words, a reliable instrument is one whose error component is as small as possible.

The reliability coefficient is therefore the proportion of true-score variability in the total observed variability. A reliability coefficient of 0.85 means that 85% of the variability in the observed scores reflects real individual differences, while the remaining 15% is attributable to random error.
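As a rough illustration of the ratio just described, the sketch below computes a reliability coefficient from made-up true scores and errors. The data, the variable names, and the helper `reliability_coefficient` are assumptions introduced only for this example; in practice true scores are never observed directly, so the coefficient has to be estimated by other means.

```python
from statistics import pvariance

# Hypothetical data: true scores T and measurement errors E for four examinees.
# The values are chosen so that T and E are uncorrelated, as X = T + E assumes.
true_scores = [48, 48, 52, 52]
errors = [1, -1, 1, -1]

# Observed score X = T + E for each examinee.
observed = [t + e for t, e in zip(true_scores, errors)]

def reliability_coefficient(true_values, observed_values):
    """Ratio of true-score variance to observed-score variance."""
    return pvariance(true_values) / pvariance(observed_values)

r_xx = reliability_coefficient(true_scores, observed)
print(f"reliability = {r_xx:.2f}")      # share of variance due to true differences
print(f"error share = {1 - r_xx:.2f}")  # share of variance due to random error
```

With these numbers the true-score variance is 4 and the observed variance is 5, so the coefficient is 0.80: 80% of the observed variability reflects real differences and 20% is error.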

Reliability estimates are based on the correlation between two sets of measurements, obtained in one of three ways:

· Repeated use of the instrument (stability)

· Similarity of items (homogeneity or internal consistency)

· Equivalence of two instruments (equivalence)

The conventional perspective on reliability (AERA, 1985)

  • Temporal stability: the same test form is given on two or more separate occasions to the same group of examinees (test-retest). This approach is often impractical, and repeated measurement may change the examinees. For example, examinees adapt to the test format and tend to score higher on later administrations.
  • Form equivalence: two different test forms covering the same content are given to the same examinees on one occasion (alternate forms).
  • Internal consistency: a coefficient computed from the item scores of a single test or survey (Cronbach's alpha, KR-20, split-half).
  • Reliability is a necessary condition for validity, but it does not guarantee validity.
  • Performance, portfolio, and responsive evaluations, in which tasks vary substantially from student to student and multiple tasks may be evaluated simultaneously, are often criticized for lacking reliability. One difficulty is that performance assessment has more than one source of measurement error. For example, the reliability of writing-skill test scores is affected by the rater, the mode of discourse, and several other factors (Parkes, 2000).
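Of the internal-consistency coefficients named above, Cronbach's alpha is perhaps the easiest to sketch. The example below uses the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores); the function name `cronbach_alpha` and the response data are hypothetical and serve only as an illustration.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])                          # number of items
    items = list(zip(*scores))                  # transpose: one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

# Hypothetical responses: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

The items in this toy data set rise and fall together across respondents, so alpha comes out close to 1; items that do not move together would pull it down.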

The modern perspective on reliability (Moss, 1994):

· There can be validity without reliability.

· Reliability is one aspect of construct validity. As assessment becomes less standardized, the distinction between reliability and validity blurs.

· Inconsistency in student performance across the questions of an assessment does not by itself invalidate the assessment. Rather, it becomes an empirical puzzle to be solved by searching for a more comprehensive interpretation.

Types of reliability:

  1. Stability (obtaining the same results on repeated testing): (a) test-retest; (b) parallel forms; (c) alternate forms.
  2. Homogeneity or internal consistency (the items of the measuring instrument reflect the same concept): (a) item-total correlation; (b) split-half reliability; (c) Kuder-Richardson coefficient; (d) Cronbach's alpha; (e) theta; (f) omega.
  3. Equivalence (obtaining the same results when equivalent instruments are used): (a) parallel items on alternate forms; (b) inter-rater reliability.
  4. Equivalence: inter-rater reliability. Here two or more observers watch the same events independently and record the variables according to a pre-determined coding system. Their records are then correlated, and the resulting correlation coefficient indicates the strength of the relationship between one observer's ratings and the other's. Another way to assess inter-rater equivalence is to compute a proportion: the number of agreements divided by the total number of agreements and disagreements.
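The proportion method described in item 4 can be sketched in a few lines. The function name `proportion_agreement` and the two observers' codes are invented for this example.

```python
def proportion_agreement(ratings_a, ratings_b):
    """Agreements divided by (agreements + disagreements) for two raters."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both raters must code the same set of observations")
    agreements = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return agreements / len(ratings_a)

# Hypothetical codes assigned independently by two observers to ten events.
observer_1 = ["on", "off", "on", "on", "off", "on", "off", "off", "on", "on"]
observer_2 = ["on", "off", "on", "off", "off", "on", "off", "on", "on", "on"]

print(proportion_agreement(observer_1, observer_2))  # 8 agreements out of 10
```

Note that a simple proportion does not correct for agreement that would occur by chance, which is one reason the correlational approach is also used.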

The difference between reliability and validity (Salvucci, Walter, Conley, Fink, & Saba, 1997):

Many experts argue that the traditional view that "reliability is a necessary but not a sufficient condition of validity" is incorrect. This school of thought conceptualizes reliability as invariance and validity as unbiasedness. A sample statistic may have an expected value over samples equal to the population parameter (unbiasedness), but have very high variance from a small sample size. Conversely, a sample statistic can have very low sampling variance but have an expected value far departed from the population parameter (high bias). In this view, a measure can be unreliable (high variance) but still valid (unbiased).
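The contrast between invariance and unbiasedness can be made concrete with a small sketch. The true value, both sets of repeated measurements, and the helper `bias_and_variance` are hypothetical and chosen only to show the two cases.

```python
from statistics import mean, pvariance

TRUE_VALUE = 10.0  # hypothetical population parameter

# Hypothetical repeated measurements from two instruments.
unbiased_but_noisy = [6.0, 14.0, 8.0, 12.0]    # centered on the true value, widely spread
precise_but_biased = [12.9, 13.0, 13.1, 13.0]  # tightly clustered, but off target

def bias_and_variance(measurements, true_value):
    """Return (bias, variance) of a set of repeated measurements."""
    return mean(measurements) - true_value, pvariance(measurements)

for label, data in [("unbiased but noisy", unbiased_but_noisy),
                    ("precise but biased", precise_but_biased)]:
    bias, var = bias_and_variance(data, TRUE_VALUE)
    print(f"{label}: bias = {bias:.2f}, variance = {var:.3f}")
```

Under the view quoted above, the first instrument would count as valid but unreliable, and the second as reliable but not valid.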

VALIDITY

Validity is the degree to which an instrument (test) accurately performs its measuring function. A test can do so only if there is something for it to measure. In other words, a test must measure something, and it must do so accurately.

The conventional perspective on types of validity

Cronbach (1971) describes the types of validity as follows:

  1. Face validity: Face validity simply means validity at face value. As a check on face validity, test or survey items are sent to teachers to obtain suggestions for modification.
  2. Content validity: Draw an inference from test scores to a large domain of items similar to those on the test. Content validity is concerned with sample-population representativeness, i.e. the knowledge and skills covered by the test items should be representative of the larger domain of knowledge and skills. Content validity is sample-oriented rather than sign-oriented. A behavior is viewed as a sample when it is a subgroup of the same kind of behaviors. On the other hand, a behavior is considered a sign when it is an indicator or a proxy of a construct (Goodenough, 1949). Construct validity and criterion validity, described next, are sign-oriented because both of them indicate behaviors different from those of the test.
  3. Criterion validity: Draw an inference from test scores to performance. A high score on a valid test indicates that the test taker has met the performance criteria. Criterion validity is about prediction rather than explanation. Prediction is concerned with non-causal or mathematical dependence, whereas explanation pertains to causal or logical dependence. For example, one can predict the weather based on the height of mercury inside a thermometer. Thus, the height of mercury could satisfy criterion validity as a predictor. However, one cannot explain why the weather changes by the change of mercury height. Because of this limitation of criterion validity, an evaluator has to conduct construct validation.
  4. Construct validity: Draw an inference from test scores to a psychological construct. Because it is concerned with abstract and theoretical constructs, construct validity is also known as theoretical construct.

The conventional view above is followed by Austin (1997), as cited in Ilene Decker. Austin describes validity as follows:

  1. Face validity: assumptions of a logical tie between the items of an instrument and its purpose.
  2. Content validity: the items in the instrument are systematically judged by a panel of experts and rated as to the extent that each item adequately represents the construct proposed. Content validity is the consensus (intersubjective, negotiated) opinion of the community of scholars as to whether the items used to measure a latent variable (henceforth called a construct) refer to the domain of the construct and to no other construct. This assessment depends entirely upon the opinion of the community of scholars; it has no empirical element to it.
  3. Criterion-related validity: the relationship between the subject's performance on the measurement tool and the subject's actual behavior: (a) concurrent validity: how well is the instrument measuring the construct right now? (b) predictive validity: how well is the instrument able to predict future behavior?
  4. Construct validity (how well does the instrument test a trait or concept): (a) convergent validity: look for another instrument that is proposed to measure the same construct and look for a correlation between the results; (b) divergent validity: use an instrument that is supposed to measure the exact opposite of the trait; (c) multitrait analysis: look at similarities in measures that could measure the same construct.
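The convergent and divergent checks described above both come down to a correlation between two sets of scores. The sketch below is a minimal illustration: the three score lists and the helper `pearson_r` are hypothetical, and real validation studies would use far larger samples.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores of six respondents on three instruments.
new_scale      = [12, 15, 9, 20, 17, 11]   # instrument being validated
same_construct = [14, 16, 10, 21, 18, 12]  # established measure of the same trait
opposite_trait = [18, 14, 20, 8, 11, 19]   # measure of the opposite trait

print(f"convergent r = {pearson_r(new_scale, same_construct):.2f}")
print(f"divergent  r = {pearson_r(new_scale, opposite_trait):.2f}")
```

A strongly positive first coefficient and a strongly negative second one would be the pattern expected if the new scale measures the intended trait.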

The American Psychological Association (APA, 1954) identified types of validity according to the purpose of testing: (a) content validity; (b) predictive validity; (c) concurrent validity; (d) construct validity. In 1966 the APA merged predictive validity and concurrent validity into a single category: criterion-related validity.

In line with the APA classification (1954, 1966), Crocker and Algina (1986) also appear to follow a three-way classification of validity: (1) Content validity studies are used to assess whether the items on an inventory or test adequately represent the construct of specific interest. In other words: can the researcher draw an inference from an examinee's test score to a larger domain of items like those that are on the test itself? (2) Criterion-related validity, encompassing both predictive validity and concurrent validity, is studied in situations where a test user wants to draw an inference from a person's test score to performance on a real behavioral variable that has practical importance; (3) Construct validity is studied when "the test user desires to draw an inference from the test score to performances that can be grouped under the label of a particular psychological construct" (Crocker & Algina, 1986, p. 218).

On construct validity, Hunter and Schmidt (1990) state that construct validity is a quantitative question rather than a qualitative distinction such as "valid" or "invalid"; it is a matter of degree. Construct validity can be measured by the correlation between the intended independent variable (construct) and the proxy independent variable (indicator, sign) that is actually used.

Other experts (Angoff, 1988; Cronbach & Quirk, 1976) hold that construct validity cannot be captured by a single coefficient: there is no mathematical index of construct validity. Rather, the nature of construct validity is qualitative. There are two kinds of indicator: (1) reflective indicator: the effect of the construct; (2) formative indicator: the cause of the construct. When an indicator is expressed in terms of multiple items of an instrument, factor analysis is used for construct validation.

The modern perspective on validity (Messick, 1995):

In the modern (newer) perspective, validity is not a property of a test or measurement but concerns the meaning of the test scores, in the following aspects:

  • Content: evidence of content relevance, representativeness, and technical quality
  • Substantive: theoretical rationale
  • Structural: the fidelity of the scoring structure
  • Generalizability: generalization to the population and across populations
  • External: applications to multitrait-multimethod comparison
  • Consequential: bias, fairness, and justice; the social consequences of assessment for society.

As a further new view, Pedhazur & Schmelkin (1991) criticize the conventional treatment of validity on two points:

  • Content validity is not a type of validity at all because validity refers to inferences made about scores, not to an assessment of the content of an instrument.
  • The very definition of a construct implies a domain of content. There is no sharp distinction between test content and test construct.

FACTOR ANALYSIS & CONSTRUCT VALIDITY

Factor Analysis

Experts have defined factor analysis in a variety of ways.

1. Reyment and Joreskog (1993; 71): Factor analysis is a generic term that we use to describe a number of methods designed to analyze interrelationships within a set of variables or objects [resulting in] the construction of a few hypothetical variables (or objects), called factors, that are supposed to contain the essential information in a larger set of observed variables or objects .... that reduces the overall complexity of the data by taking advantage of inherent interdependencies [and so] a small number of factors will usually account for approximately the same amount of information as do the much larger set of original observations.

2. Cureton and D'Agostino (1983; 1-2): Factor analysis as "a collection of procedures for analyzing the relations among a set of random variables observed or counted or measured for each individual of a group". The purpose, they said, "is to account for the intercorrelations among n variables, by postulating a set of common factors, considerably fewer in number than the number, n, of these variables".

3. Bryman and Cramer (1990; 253): Broadly defined factor analysis as "a number of related statistical techniques which help us to determine them [the characteristics which go together]".

4. Gorsuch (1983; 2) reminded the reader that "all scientists are united in a common goal: they seek to summarize data so that the empirical relationships can be grasped by the human mind". The purpose of factor analysis, he said, "is to summarize the interrelationships among the variables in a concise but accurate manner as an aid in conceptualization".

These four definitions speak mainly to left-brained individuals, for whom they make something complex fairly easy to grasp.

Kerlinger (1979; 179-180) offers a balanced definition of factor analysis that addresses both the left-brained and the right-brained: (1) For the left-brainers: "Factor analysis is an analytic method for determining the number and nature of the variables that underlie larger numbers of variables or measures"; (2) For the right-brainers he noted: "It [factor analysis] tells the researcher, in effect, what tests or measures belong together--which ones virtually measure the same thing, in other words, and how much they do so". He further commented on factor analysis in terms of curiosity and parsimony. He noted, "Scientists are curious. They want to know what's there and why. They want to know what is behind things. And they want to do this in as parsimonious a fashion as possible. They do not want an elaborate explanation when it is not needed." He sounds like a very right-brained individual!

From these definitions it is clear that every definition of factor analysis shares common elements. Each refers to the correlations among variables, as the words interrelationships, intercorrelations, and relations show. Beyond that, each definition conveys the notion of reducing the number of variables into a smaller set of factors. In short, factor analysis explains something by condensing a large amount of information into a form or size that is manageable. The definition of factor analysis thus speaks to right-brained and left-brained individuals alike.

Construct Validity

Is construct validity the same as factorial validity? Or is it the only validity? The answer is that construct validity encompasses content validity and criterion validity. The following statements support this answer: (1) Shepard (1993) says: … that construct validity envelopes the empirical and the logical requirements of criterion and content validity. (2) Anastasi (1986) agrees that … construct validity subsumes both content validity and criterion-related validity requirements. (3) Nunnally (1978; 111) also states: … that "construct validity has [even] been spoken of as ... 'factorial validity' ".

In fact, the concept had been recognized much earlier: (1) Guilford (1946; 428) says: The factorial validity of a test is given by its loadings in meaningful, common, reference factors. This is the kind of validity that is really meant when the question is asked: Does this test measure what it is supposed to measure? (2) Forty-four years later, Bryman and Cramer (1990; 253) say: … factor analysis enables us to assess the factorial validity of the questions which make up our scales by telling us the extent to which they seem to be measuring the same concepts or variables.

Construct validity (and criterion validity as a special case of construct validity in which the explanandum is a behavioral variable) refers to three kinds of relationship:

  1. Causal relationships between the construct and its predictor variables (the items of the scale used to measure the construct, for example the 10 questions in Rosenberg's self-esteem scale).
  2. Causal relationships between one construct and other constructs theoretically related to it.
  3. Non-causal (correlational) relationships between one construct and other constructs theoretically related to it.

Validity of Test Scores

Validity and reliability are functions of the test scores, which are determined by the test takers. For that reason, according to Shepard (1992; 406), validity must be established each time a test is used. Earlier, Cronbach (1971; 447) had said: One validates, not a test, but an interpretation of data arising from a specified procedure. Crocker and Algina (1986) go further and describe a process used to provide the construct validity of an instrument, along with four procedures (one being factor analysis) frequently utilized in construct validation. Regardless of the specific technique used, the steps generally followed include (a) formulating a hypothesis about how those who differ on the proposed construct do in fact differ in relation to other constructs already validated, (b) selecting or developing a measurement instrument that consists of items specifically representing the construct, (c) gathering empirical data so the hypothesized relationships can be tested, and (d) determining if the data are consistent with the hypothesis.

Heppner, Kivlighan, and Wampold (1992) suggest that factor analysis for construct validation can be carried out in several steps: (a) the researcher must first carefully think about the specific research question he or she wishes to address, (b) he or she chooses to use or develop an instrument constituting the variables specified, (c) the researcher selects the sample, collects the data, and begins to factor analyze the data in order to identify the common dimensions of a set of variables and to see which items go together to make up a factor, and (d) the researcher determines if the factors are correlated. See? It's starting to come together. We're finding out: Are the test items measuring what they're supposed to be measuring? Construct validity and factor analysis constitute a natural pairing.

It becomes clear (leaning toward the right brain) that factor analysis is used for construct validity. It becomes clearer still once we recall that the purpose of factor analysis is to determine the factors underlying a set of variables. We can also establish the connection between factor analysis and its usefulness as a tool in evaluating score validity. In other words, by conducting a factor analysis of the observed scores on a given instrument, one can determine whether the test is indeed measuring the variables it purports to. This, in essence, is the definition of construct validation.

Factor Analysis: Exploratory Versus Confirmatory

Exploratory factor analysis is abbreviated EFA and confirmatory factor analysis CFA. Stevens (1996; 389) gives a left-brained definition of EFA and CFA: The purpose of exploratory factor analysis is to identify the factor structure or model for a set of variables. This often involves determining how many factors exist, as well as the pattern of the factor loadings ... EFA is generally considered to be more of a theory-generating than a theory-testing procedure. In contrast, confirmatory factor analysis (CFA) is generally based on a strong theoretical and/or empirical foundation that allows the researcher to specify an exact factor model in advance. This model usually specifies which variables will load on which factors, as well as such things as which factors are correlated. It is more of a theory-testing procedure than is EFA.

Stevens (1996) also summarizes these definitions (for the right-brainers as well) in the following table:

EXPLORATORY (THEORY GENERATING)

Heuristic - weak literature base

  • Determine the number of factors
  • Determine whether the factors are correlated or uncorrelated
  • Variables free to load on all factors

CONFIRMATORY (THEORY TESTING)

Strong theory and/or strong empirical base

  • Number of factors fixed a priori
  • Factors fixed a priori as correlated or uncorrelated
  • Variables fixed to load on a specific factor or factors
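To make the confirmatory side of this comparison concrete, the sketch below writes down a model in which the number of factors, the factor correlation, and the loading pattern are all fixed in advance, and then derives the correlations such a model implies. Every number, the variable names, and the helper `implied_correlations` are hypothetical; the sketch only shows what "fixed a priori" means and does not fit or test a model against data.

```python
# Hypothetical confirmatory specification for four variables and two factors.
# A zero loading means the variable is fixed NOT to load on that factor.
loadings = [
    [0.8, 0.0],   # variable 1 -> factor A only
    [0.7, 0.0],   # variable 2 -> factor A only
    [0.0, 0.6],   # variable 3 -> factor B only
    [0.0, 0.9],   # variable 4 -> factor B only
]
factor_corr = [[1.0, 0.5],
               [0.5, 1.0]]   # factors fixed a priori as correlated

def implied_correlations(lam, phi):
    """Model-implied correlation matrix: Lambda * Phi * Lambda' with unit diagonal."""
    p, m = len(lam), len(phi)
    implied = [[sum(lam[i][a] * phi[a][b] * lam[j][b]
                    for a in range(m) for b in range(m))
                for j in range(p)] for i in range(p)]
    for i in range(p):
        implied[i][i] = 1.0   # unique variance absorbs what the factors leave unexplained
    return implied

for row in implied_correlations(loadings, factor_corr):
    print([round(value, 2) for value in row])
```

In an exploratory analysis none of the zeros would be imposed: every variable would be free to load on every factor, and the pattern would be read off the data instead.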

Related to Stevens's account of EFA and CFA, Cronbach (1988; 12-13) distinguishes a weak program (exploratory in spirit, like EFA) from a strong program (confirmatory in spirit, like CFA) of construct validation: (1) The weak program is sheer exploratory empiricism; any correlation of the test score with another variable is welcomed … The weak program is open enough that almost any evidence relating to the test scores counts as relevant to validity; (2) The strong program, as set out by Cronbach and Meehl (1955) and by Meehl and Golden (1982), calls for making one's theoretical ideas as explicit as possible and then devising deliberate challenges to them. The strong program is impossible without a strong theory, but it remains the ideal.

The distinction between the weak program and the strong program can be confusing. It is easy to conclude, using the weak program, that all validity evidence is construct-related evidence, and therefore that all interpretations are validated by means of construct validity. The weak program certainly pulls everything under one unifying umbrella. In fact, it pulls in too much. Lacking explicit guidelines for identifying the most relevant evidence, the weak program gives the validator essentially no guidance. On the other hand, it is not so clear that the strong program needs to cover every kind of validation effort.

The development of two competing versions of construct validity was perhaps inevitable. The first formulation of construct validity focused on theoretical constructs defined implicitly in terms of formal theories. The formulation was elegant, but formal theories are rarely developed in education and the social sciences, so the strong program of construct validity was generally not applicable in anything like its original form.

Some progress has been made in developing methods for implementing the strong model (Campbell & Fiske, 1959; Cronbach, 1971; Embretson, 1983; Messick, 1989), but the presentation of the construct validity model remained relatively abstract. The definition of construct validity was therefore loosened to make it more applicable, while the label "construct validity", with its strong association with formal theory, was retained. As a result, the weak program of construct validity took on much of the abstractness of the strong program without the support of formal theory to give it teeth, yielding sheer exploratory empiricism (Cronbach, 1988; 12).

The implicit adoption of the weak program has not had a positive influence on validation research. The strong program outlined by Cronbach and Meehl (1955) had a more limited but sharper focus.

Exploratory Factor Analysis (EFA)

Factor analysis assumes that the observed (measured) variables are linear combinations of some underlying source variables (or factors). That is, it assumes the existence of a system of underlying factors and a system of observed variables. There is a certain correspondence between the two systems, and factor analysis exploits this correspondence to arrive at conclusions about the factors (Kim, 1986; 8).
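The assumption that observed variables are linear combinations of underlying factors can be illustrated with a small simulation. In the sketch below, two observed variables are built from one common factor plus independent noise; the loadings, sample size, seed, and function names are all assumptions made for the example.

```python
import random
from math import sqrt

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical loadings of two observed variables on one underlying factor.
LOADING_1, LOADING_2 = 0.8, 0.7

def simulate(n):
    """Generate n pairs of observed scores from a single common factor."""
    pairs = []
    for _ in range(n):
        factor = rng.gauss(0, 1)  # unobserved source variable
        x1 = LOADING_1 * factor + sqrt(1 - LOADING_1 ** 2) * rng.gauss(0, 1)
        x2 = LOADING_2 * factor + sqrt(1 - LOADING_2 ** 2) * rng.gauss(0, 1)
        pairs.append((x1, x2))
    return pairs

def correlation(pairs):
    """Pearson correlation of the two columns in a list of (x1, x2) pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    cov = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    sx = sum((p[0] - mx) ** 2 for p in pairs)
    sy = sum((p[1] - my) ** 2 for p in pairs)
    return cov / sqrt(sx * sy)

r = correlation(simulate(5000))
print(f"observed r = {r:.2f}, implied by loadings = {LOADING_1 * LOADING_2:.2f}")
```

The two variables never influence each other directly, yet they correlate because they share a source. Factor analysis works in the opposite direction: it starts from such correlations and reasons back to the factors.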

EFA can be used as a method for determining the minimum number of hypothetical underlying factors needed to represent a large number of variables. In EFA this is done from the intercorrelations among the variables, without any prior specification of the factors.
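One simple way to see how intercorrelations alone can point to an underlying factor is to extract the largest eigenvalue of the correlation matrix and the loadings that go with it. The sketch below does this with plain power iteration; it is a principal-component style extraction of a single factor, chosen here for brevity and not a method prescribed by the sources above. The correlation matrix and the function name `first_factor` are hypothetical.

```python
from math import sqrt

def first_factor(corr, iterations=200):
    """Largest eigenvalue and loadings of a correlation matrix via power iteration."""
    p = len(corr)
    vector = [1.0 / sqrt(p)] * p  # starting guess; suits mostly positive correlations
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(value ** 2 for value in product))
        vector = [value / norm for value in product]
    eigenvalue = sum(vector[i] * sum(corr[i][j] * vector[j] for j in range(p))
                     for i in range(p))
    loadings = [sqrt(eigenvalue) * value for value in vector]
    return eigenvalue, loadings

# Hypothetical correlation matrix for three test items.
R = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
eigenvalue, loadings = first_factor(R)
print(f"eigenvalue = {eigenvalue:.3f}")
print(f"share of total variance = {eigenvalue / len(R):.3f}")
print("loadings:", [round(value, 3) for value in loadings])
```

For this matrix a single factor accounts for about two thirds of the total variance, and all three items load on it equally, which is what one would expect when the items correlate uniformly with each other.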

A definition of factors suited to both left-brained and right-brained individuals is given by Cureton and D'Agostino (1983; 3): The factors are random variables that cannot be observed or counted or measured directly, but which are presumed to exist in the population and hence in the experimental sample .... they are sometimes termed latent variables.

Tinsley and Tinsley (1987; 414) state that factors are hypothetical constructs or theories that help interpret the consistency in a data set. Kim and Mueller (1978; 12 & 77) define factors as "hypothesized, unmeasured, and underlying variables which are presumed to be the sources of the observed variables ... which are smaller in number than the number of observed variables, [and] are responsible for the covariation among the observed variables".

Cureton and D'Agostino (1983; 3) then explain the hypothetical nature of factors: The factors are actually hypothetical or explanatory constructs. Their reality in the individuals of the population or sample is always open to argument. At the conclusion of a factor analysis we can only say of the factors that if they were real, then they would account for the correlations found in the sample. Kline (1994; 5), meanwhile, defines a factor as a dimension or construct which is a condensed statement of the relationship between a set of variables.

From these definitions it follows that factors are, in essence, latent (unobserved), hypothetical underlying concepts (constructs) inferred from the correlations among the measured (observed) variables of an instrument or test.

Useful reading

American Psychological Association. (1954). Technical recommendations for psychological tests and diagnostic techniques. Psychological Bulletin, 51, 201-238.

American Psychological Association. (1966). Standards for educational and psychological tests and manuals. Washington, DC: Author.

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1985). Standards for educational and psychological testing. Washington, DC: Authors.

Anastasi, A. (1986). Evolving concepts of test validation. Annual Review of Psychology, 37, 1-15.

Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. Fort Worth: Harcourt Brace Jovanovich College Publishers.

Cronbach, L.J. (1971). Test validation. In R.L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 443-507). Washington, DC: American Council on Education.

Decker, Ilene. (1997). Reliability and validity. Web site created by the NAU OTLE Faculty Studio, Northern Arizona University (mezza@jan.ucc.nau.edu).

Guilford, J.P. (1954). Psychometric Methods. NY. McGraw-Hill Book Company, INC.

Hunter, J. E., & Schmidt, F. L. (1990). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park: Sage Publications.

Kane, Michael T. (2001). Current concerns in validity theory. Journal of Educational Measurement, 38(4), 319-342.

Kerlinger, F.N. (1979). Behavioral research: A conceptual approach. Dallas: Holt, Rinehart and Winston.

Kim, J.O., & Mueller, C.W. (1978). Introduction to factor analysis. Beverly Hills: Sage Publications.

Nunnally, J.C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.

Pedhazur, E. J., & Schmelkin, L. P. (1991). Measurement, design, and analysis: An integrated approach. Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.

Rohani, Ahmad. (2002). Model Konstruk dan Validitas Konstruk [Construct models and construct validity]. [Adapted from Kane, Michael T. (2001). Current concerns in validity theory. Journal of Educational Measurement, 38(4), 319-342.]

Stapleton, Connie D. (1997). Basic concepts in exploratory factor analysis (EFA) as a tool to evaluate score validity: A right-brained approach. Paper presented at the annual meeting of the Southwest Educational Research Association, Austin, January 1997. Texas A&M University.

Stevens, J. (1996). Applied multivariate statistics for the social sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
