A generated data set containing data on 1200 imaginary K-12 students in Wisconsin, nested within 6 schools in 3 districts. In adapting this from the source, Sam switched the school and district variables (the source had multiple districts per school) and made other minor changes, including dropping columns that were unclear or did not seem relevant (e.g., variables like "luck" that were used to calculate the reading and math scores).

Usage

wisc

Format

A data frame with 2700 rows and 26 variables:

student_id

numeric: student's unique ID #

grade

numeric: grade level

district

numeric: district code

school

numeric: school code

white

numeric: is the student white?

black

numeric: is the student black?

hisp

numeric: is the student Hispanic?

indian

numeric: is the student Native-American Indian?

asian

numeric: is the student Asian?

econ

numeric: is the student economically disadvantaged?

female

numeric: is the student female?

ell

numeric: is the student an English Language Learner?

disab

numeric: does the student have a learning disability?

year

numeric: school year

attday

numeric: days attended

readSS

numeric: student's reading standardized test score

mathSS

numeric: student's math standardized test score

proflvl

factor: student's proficiency level

race

factor: student's single-category race

...

Source

https://github.com/jknowles/r_tutorial_ed/, posted under a Creative Commons license. The script used to generate the data set, which is not very well documented, is here: https://github.com/jknowles/r_tutorial_ed/blob/master/data/simulate_data.R
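
Examples

The description says students are nested within schools and schools within districts. This can be checked with base R, as in the minimal sketch below. It runs on a small toy data frame rather than on wisc itself: only the column names (student_id, year, school, district) come from the Format section, and all values in toy are made up for illustration.

```r
# Toy stand-in for `wisc`. The values are hypothetical; only the column
# names are taken from the documentation of the real data set.
toy <- data.frame(
  student_id = c(1, 1, 2, 2, 3, 4),
  year       = c(2000, 2001, 2000, 2001, 2000, 2000),
  school     = c(10, 10, 10, 10, 20, 30),
  district   = c(1, 1, 1, 1, 1, 2)
)

# Compare the number of rows with the number of distinct students.
# More rows than students means some students appear more than once.
n_rows     <- nrow(toy)
n_students <- length(unique(toy$student_id))

# Schools are nested within districts if every school code is paired
# with exactly one district code.
school_district <- unique(toy[, c("school", "district")])
schools_nested  <- !any(duplicated(school_district$school))

# Count the schools that belong to each district.
schools_per_district <- table(school_district$district)
```

To run the same checks on the real data, replace toy with wisc once the package that provides it is loaded.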