
Introduction to Generalized Linear Models (GLM)
(Pengantar Model Linear Terampat)

Dr. Kusman Sadik, M.Si

Master's Program (S2)

Department of Statistics, IPB, Even Semester 2018/2019

08/03/2019

Classical linear models, such as linear regression, assume that the response variable y follows a Normal distribution.

In practice, the response variable y often does not follow a Normal distribution. It may instead follow, for example, a Binomial, Poisson, Gamma, or Exponential distribution.

The Generalized Linear Model (GLM) was developed to address this problem.

GLM can be used to model a response variable y whose distribution belongs to the exponential family.

Examples of such distributions:

Normal
Binomial
Multinomial
Poisson
Gamma
Exponential
Negative Binomial
And others.

Function: The structure of the association between the variables (e.g., linear or some other function).

Parameters: How a change in a predictor variable, X, is expected to affect an outcome variable, Y.

Partial parameters: How a change in one of the predictor variables affects the outcome variable while controlling for the effects of other predictor variables included in the model.

Smooth prediction: What the expected (or predicted) value of the outcome variable might be for any given values of the predictor variables.

A GLM has three components:

The random component: refers to the distribution of the outcome variable (Y).

The systematic component: refers to the predictor variables (X).

The link function: refers to the way in which the outcome variable (or, more specifically, its expected value) is transformed so that a linear relationship can be used to model the association between the predictors (X) and the transformed outcome.

The random component of a GLM is the probability distribution that is assumed to underlie the dependent or outcome variable.

When the outcome or response variable is continuous, such as in simple linear regression or analysis of variance (ANOVA), we typically assume that the normal distribution is the random component.

When the dependent or outcome variable is categorical, it can no longer be assumed that its values in the population are normally distributed.

The systematic component of a GLM consists of the independent, predictor, or explanatory variables (X) that a researcher hypothesizes will predict (or explain) differences in the dependent or outcome variable.

These variables are combined to form the linear predictor, which is simply a linear combination of the predictors:

$$\eta = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p$$

The key to GLMs is to “link” the random and systematic components of the model with some mathematical function, call it g(.), such that this function of the expected value of the outcome can be properly modeled using the systematic component:

$$g(E(Y)) = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p$$

The link function is the mathematical function used to transform the expected value of the dependent or outcome variable so that it can be modeled as a linear function of the predictors.

In linear regression, the predicted or expected outcome, E(Y), does not need to be transformed to be linearly related to the predictors.

More technically, if g(.) represents the link function, the transformation of E(Y) by g in this case is g(E(Y)) = E(Y).

This is referred to as the identity link function because applying g(.) to E(Y) in this case returns the same value, E(Y).

For example, suppose that the outcome variable is the probability that a student will pass (as opposed to fail) a specific test, so the predicted value is E(Y) = π.

Transformation:

$$g(\pi) = \ln\left(\frac{\pi}{1 - \pi}\right)$$

This particular link function (or transformation) is called the logit link function, and the resulting GLM is called the logistic regression model.
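The logit link and its inverse can be sketched in a few lines of Python. This is only an illustration: the predictor `hours` and the coefficients `beta0` and `beta1` are hypothetical and do not come from the slides (the course itself works in R).

```python
import math

def logit(p):
    """Logit link: maps a probability in (0, 1) to the whole real line."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse logit: maps any real linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -3.0, 0.05

def pass_probability(hours):
    """P(pass) implied by logit(pi) = beta0 + beta1 * hours."""
    return inv_logit(beta0 + beta1 * hours)
```

Because the inverse logit always returns a value strictly between 0 and 1, the linear predictor may take any real value without producing an impossible probability.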

Another example arises when the outcome variable is a count variable, so that the random component is assumed to follow a Poisson distribution.

The outcome variable is a count, so by definition it cannot be lower than zero. If a linear regression model were fit to the untransformed outcome, however, nonsensical negative values could theoretically result as predictions for low values of X.

On the other hand, when the predicted outcome, E(Y), is transformed using the natural log function,

$$g(E(Y)) = \ln(E(Y)) = \beta_0 + \beta_1 X,$$

the predicted value $E(Y) = e^{\beta_0 + \beta_1 X}$ is always positive.

This particular transformation is called the log link function and this model is called the Poisson regression model.

The log function typically works well with outcome variables that represent counts or a random component that follows a Poisson distribution.

Another GLM that uses the log link function is the log-linear model, in which the predictor variables are typically categorical and the outcome variable, rather than representing yet another, separate variable, is the count or frequency obtained in each of the categories of the predictors.
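The difference between modeling a count on its raw scale and through the log link can be seen with a small Python sketch. The coefficients below are hypothetical, chosen only so that the untransformed model predicts a negative count at a low value of X.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -1.0, 0.5

def identity_prediction(x):
    """Linear regression on the raw count: E(Y) = beta0 + beta1 * x."""
    return beta0 + beta1 * x

def log_link_prediction(x):
    """Poisson regression: ln(E(Y)) = beta0 + beta1 * x, so E(Y) = exp(...)."""
    return math.exp(beta0 + beta1 * x)

for x in (0, 1, 2, 4):
    print(x, identity_prediction(x), round(log_link_prediction(x), 3))
```

At x = 0 the identity-link prediction is negative, which is impossible for a count, while the log-link prediction stays positive for every x.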

1. Random Component

The random component is the response variable y.

In linear regression, the response variable y is assumed to follow a Normal distribution with mean $\mu$ and variance $\sigma^2$.

$$E(y) = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p = \sum_{i=0}^{p} \beta_i x_i, \quad x_0 = 1$$

2. Systematic Component

The systematic component in linear regression is a linear combination of the covariates x1, x2, …, xp. It can be written as follows:

$$\eta = \sum_{i=0}^{p} \beta_i x_i, \quad x_0 = 1$$

3. Link Function

The link function in linear regression is the function that connects the random component (y) with the systematic component (x1, x2, …, xp). Let $E(y) = \mu$; the relationship can then be written as follows:

$$g(E(y)) = g(\mu) = \eta = \sum_{i=0}^{p} \beta_i x_i, \quad x_0 = 1,$$

or, in matrix form, $\boldsymbol{\eta} = X\boldsymbol{\beta}$.

In linear regression, g(.) is the identity function, that is, $g(\mu) = \mu = E(y)$.

Distribution of y     Link function
Normal                Identity
Binomial              Logit
Gamma                 Inverse
Poisson               Log
Multinomial           Cumulative logit
Negative Binomial     Log
Inverse Gaussian      Inverse squared
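As a rough sketch, the pairing of distributions and link functions in the table can be written as a lookup in Python. The names `LINKS`, `DEFAULT_LINK`, and `mean_from_linear_predictor` are hypothetical helpers introduced here for illustration; they are not part of any library.

```python
import math

# Each entry holds the link g(mu) and its inverse g^{-1}(eta).
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "inverse": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "inverse_squared": (lambda mu: 1.0 / mu ** 2,
                        lambda eta: 1.0 / math.sqrt(eta)),
}

# Distribution -> link, as listed in the table. The cumulative logit is
# omitted because it acts on cumulative probabilities, not on a single mean.
DEFAULT_LINK = {
    "normal": "identity",
    "binomial": "logit",
    "gamma": "inverse",
    "poisson": "log",
    "negative_binomial": "log",
    "inverse_gaussian": "inverse_squared",
}

def mean_from_linear_predictor(distribution, eta):
    """Return mu = g^{-1}(eta) for the link paired with `distribution`."""
    _, inverse = LINKS[DEFAULT_LINK[distribution]]
    return inverse(eta)
```

For instance, `mean_from_linear_predictor("poisson", 0.0)` applies the inverse of the log link and returns 1.0.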

Parameter Estimation

Fisher Scoring Method

Let $L(\boldsymbol{\beta}, y)$ be the log-likelihood function, $U$ the score vector, and $I$ the Fisher information matrix:

$$U_r = \frac{\partial L(\boldsymbol{\beta}, y)}{\partial \beta_r}, \qquad I_{rs} = -E\left[\frac{\partial^2 L(\boldsymbol{\beta}, y)}{\partial \beta_r \, \partial \beta_s}\right]$$

The estimate of $\boldsymbol{\beta}$ is then obtained iteratively as follows:

$$I^{(k-1)} \hat{\boldsymbol{\beta}}^{(k)} = I^{(k-1)} \hat{\boldsymbol{\beta}}^{(k-1)} + U^{(k-1)}$$

$$\hat{\boldsymbol{\beta}}^{(k)} = \hat{\boldsymbol{\beta}}^{(k-1)} + \left(I^{(k-1)}\right)^{-1} U^{(k-1)}$$
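A minimal Python sketch of Fisher scoring is given below for one specific case, assumed here for illustration: Poisson regression with a log link and a single predictor. In that case the score is U = X'(y - mu) and the Fisher information is I = X' diag(mu) X. The function name and the toy data are hypothetical; each pass of the loop adds the inverse information times the score to the current estimate.

```python
import math

def fisher_scoring_poisson(x, y, tol=1e-10, max_iter=50):
    """Fit ln(E(y)) = b0 + b1 * x by Fisher scoring.

    For the Poisson log-link model the score is U = X'(y - mu) and the
    Fisher information is I = X' diag(mu) X, with X = [1, x].
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector U at the current estimate
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix I at the current estimate (symmetric 2 x 2)
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Step = I^{-1} U, using the closed-form inverse of a 2 x 2 matrix
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# Hypothetical toy data, invented only to exercise the function.
x = [0, 1, 2, 3, 4]
y = [1, 2, 2, 5, 9]
b0, b1 = fisher_scoring_poisson(x, y)
print(b0, b1)
```

At convergence the score vector is numerically zero, which is the condition that defines the maximum likelihood estimate.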

The goodness of fit of a GLM can be measured by the deviance (D).

The deviance is twice the difference between the log-likelihood evaluated at the observed values and the log-likelihood evaluated at the fitted values.

The deviance can be used as a test statistic for the goodness of fit of the model.

The deviance is a random variable whose distribution is approximately $\chi^2$.

The asymptotic distribution of the deviance (D) is

$$D \sim \chi^2_{(n-p)}$$

where n is the number of observations and p is the number of parameters in the model.

$$D = 2 \sum_{i=1}^{n} L(y_i; y_i) - 2 \sum_{i=1}^{n} L(\hat{\mu}_i; y_i)$$
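For a concrete case, the deviance can be computed directly from its definition as twice the log-likelihood at the observed values minus twice the log-likelihood at the fitted values. The Python sketch below assumes a Poisson random component; the function names are hypothetical helpers, not library calls.

```python
import math

def poisson_loglik(mu, y):
    """Log-likelihood contribution L(mu; y) of one Poisson observation."""
    term = y * math.log(mu) if y > 0 else 0.0  # convention: 0 * ln(0) = 0
    return term - mu - math.lgamma(y + 1)

def poisson_deviance(y, mu_hat):
    """D = 2 * sum L(y_i; y_i) - 2 * sum L(mu_hat_i; y_i)."""
    saturated = sum(poisson_loglik(yi, yi) for yi in y)
    fitted = sum(poisson_loglik(mi, yi) for yi, mi in zip(y, mu_hat))
    return 2.0 * saturated - 2.0 * fitted
```

A model whose fitted values reproduce the observations exactly has deviance 0, and the deviance grows as the fitted values move away from the data.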

Categorical Data Analysis (CDA), as a part of GLM, is a research topic that is very rich in both methodological development and applications.

Articles on CDA are published regularly and in large numbers in reputable international journals.

There are even several reputable international journals devoted specifically to CDA.

These are very useful to students as references when conducting research and writing scientific papers on CDA topics.


a. Install the latest version of R.

b. Read input data in formats such as txt, Excel, and csv.

c. Describe the data with tables and graphs suitable for categorical data: histograms, x-y plots, frequency tables, contingency tables, and so on.

d. Describe the categorical data numerically.

e. Use the data in Table 1 (attached) to work on items b, c, and d.

Table 1

Respondent  Sex (J.Kelamin)  Education level (T.Pendidikan)  Income level (T.Pendapatan)
1           1                1                               4
2           1                3                               6
3           1                2                               4
4           1                2                               5
5           1                4                               4
6           1                4                               1
7           1                3                               3
8           0                4                               3
9           1                1                               5
10          1                2                               5
11          1                2                               2
12          1                3                               5
13          0                4                               5
14          1                4                               4
15          1                3                               3
16          1                3                               4
17          0                4                               4
18          1                1                               6
19          1                2                               3
20          0                4                               3
21          1                1                               6
22          1                1                               2
23          1                3                               3
24          1                1                               5
25          1                2                               3
26          1                2                               3
27          0                2                               5
28          1                3                               2
29          1                1                               6
30          1                4                               2
31          1                2                               3
32          1                1                               4
33          1                3                               2
34          1                1                               6
35          1                3                               1
36          1                2                               4
37          1                1                               3
38          1                4                               1
39          1                4                               5
40          0                4                               1
41          1                4                               6
42          1                2                               4
43          0                2                               2
44          1                1                               1
45          1                2                               4
46          0                4                               3
47          0                2                               3
48          1                4                               5
49          1                1                               5
50          1                1                               1
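The frequency and contingency tables asked for in the exercise can be sketched as follows. The exercise itself calls for R; this Python version is only an illustration of the idea, and it uses just the first ten respondents of the table above to stay short.

```python
from collections import Counter

# First ten rows of Table 1: (respondent, sex, education, income).
rows = [
    (1, 1, 1, 4), (2, 1, 3, 6), (3, 1, 2, 4), (4, 1, 2, 5), (5, 1, 4, 4),
    (6, 1, 4, 1), (7, 1, 3, 3), (8, 0, 4, 3), (9, 1, 1, 5), (10, 1, 2, 5),
]

# One-way frequency table for education level.
education_freq = Counter(edu for _, _, edu, _ in rows)

# Two-way contingency table: sex (rows) by education level (columns).
sex_by_education = Counter((sex, edu) for _, sex, edu, _ in rows)

for level in sorted(education_freq):
    print("education", level, "count", education_freq[level])
for sex in (0, 1):
    print("sex", sex, [sex_by_education[(sex, edu)] for edu in (1, 2, 3, 4)])
```

Extending `rows` to all 50 respondents gives the full tables for items c and d.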


Available for download at

kusmansadik.wordpress.com

Thank You