Data Glossary

Legacy Data

If students used your experiment in the legacy version of ASSISTments, you can make a Legacy Data Request and receive a Google Doc with Legacy Analytics and five types of Legacy Data. Many variables appear in all five file types produced by this report; detailed information on each file type is available at the links below.


Here is a YouTube video (a less grainy version is here: video) from Dr. Heffernan, made in 2022, explaining three of these file types in the context of an example experiment using our newer data file formats. Here are three matching files: Action Level, Problem Level, and Student Level.



Data From Within Your Experiment

Action Level Legacy Data - One row per student per action. This file may help you understand where all our available data types originate and is most useful for Educational Data Mining applications, work with automated detectors, or the development of precise student models.

Problem Level Legacy Data - One row per student per problem. This file has fewer rows than the action level file because there are multiple actions per problem for each student. This file is most like a teacher report. It may be the easiest file to use when aggregating student performance across conditions with a pivot table or a similar data frame manipulation in R or Python.

Student Level Legacy Data - One row per student with problems represented as additional columns per variable. Problems are shown in opportunity order (i.e., the order in which each student experienced problems, which differs if using a random ordered section type). This is the easiest file to use to aggregate performance across conditions, although there may still be obstacles if your study took place in a Skill Builder and featured a posttest.

Student Level + Problem Level - A set of rows per student, with each row representing a variable (e.g., correctness) and each column representing a problem completed by the student. Problems are shown in opportunity order (i.e., the order in which each student experienced problems, which differs if using a random ordered section type). (See Student Level and Problem Level for variable definitions.)
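
As a rough illustration of the file shapes described above, the sketch below starts from problem-level rows, averages correctness by condition, and then builds the layout with one row per variable and one column per problem in opportunity order. The column names (student_id, condition, opportunity, correct, hint_count) and the sample values are assumptions made for this example only; they are not the actual variable names in the Legacy Data files, so check the file-specific glossaries before adapting it.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical problem-level rows: one row per student per problem.
# Column names and values are illustrative, not the real file schema.
problem_rows = [
    {"student_id": "s1", "condition": "control", "opportunity": 1, "correct": 1, "hint_count": 0},
    {"student_id": "s1", "condition": "control", "opportunity": 2, "correct": 0, "hint_count": 2},
    {"student_id": "s2", "condition": "treatment", "opportunity": 1, "correct": 1, "hint_count": 0},
    {"student_id": "s2", "condition": "treatment", "opportunity": 2, "correct": 1, "hint_count": 1},
]


def mean_correct_by_condition(rows):
    """Average each student's correctness, then average students within each condition."""
    per_student = defaultdict(list)
    condition_of = {}
    for row in rows:
        per_student[row["student_id"]].append(row["correct"])
        condition_of[row["student_id"]] = row["condition"]
    per_condition = defaultdict(list)
    for student, scores in per_student.items():
        per_condition[condition_of[student]].append(mean(scores))
    return {cond: mean(vals) for cond, vals in per_condition.items()}


def variable_by_problem_layout(rows, variables):
    """Build one row per (student, variable), with values in opportunity order."""
    by_student = defaultdict(list)
    for row in rows:
        by_student[row["student_id"]].append(row)
    layout = []
    for student, student_rows in by_student.items():
        ordered = sorted(student_rows, key=lambda r: r["opportunity"])
        for variable in variables:
            layout.append([student, variable] + [r[variable] for r in ordered])
    return layout


condition_means = mean_correct_by_condition(problem_rows)
layout = variable_by_problem_layout(problem_rows, ["correct", "hint_count"])
```

Averaging within each student before averaging within each condition is one possible design choice: it keeps students who completed more problems from counting more heavily. Averaging over all problem rows directly is an equally simple alternative if that weighting is what your analysis calls for.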


Data From Before Your Experiment

Covariate Legacy Data - One row per student, providing data from your subject pool collected prior to their participation in your experiment. This file includes student level features (e.g., guessed gender), class level features (e.g., Homework Completion), and school level features (e.g., State, Urban/Suburban/Rural). It also has student X class level features (e.g., homework completion rate of a student z-scored within their class). This file can be linked to any other file using the Student ID variable.
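
Linking on the Student ID variable could look something like the sketch below, which attaches covariate columns to student-level rows. The field names (student_id, condition, posttest_score, prior_homework_completion) and all values are hypothetical stand-ins chosen for illustration; the real files use their own variable names.

```python
# Hypothetical rows; "student_id" stands in for the Student ID variable.
student_rows = [
    {"student_id": "s1", "condition": "control", "posttest_score": 0.5},
    {"student_id": "s2", "condition": "treatment", "posttest_score": 1.0},
    {"student_id": "s3", "condition": "treatment", "posttest_score": 0.75},
]
covariate_rows = [
    {"student_id": "s1", "prior_homework_completion": 0.9},
    {"student_id": "s2", "prior_homework_completion": 0.6},
]


def left_join_on_student_id(rows, covariates, key="student_id"):
    """Attach covariate columns to each row.

    Rows with no matching covariate record are kept, with the
    covariate fields set to None.
    """
    lookup = {cov[key]: cov for cov in covariates}
    covariate_fields = sorted({f for cov in covariates for f in cov if f != key})
    merged = []
    for row in rows:
        match = lookup.get(row[key], {})
        combined = dict(row)
        for field in covariate_fields:
            combined[field] = match.get(field)
        merged.append(combined)
    return merged


merged_rows = left_join_on_student_id(student_rows, covariate_rows)
```

A left join is used here so that students without prior data stay in the analysis set with missing covariates rather than being dropped silently; whether that is appropriate depends on your study design.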

New System Data

Samples of the data formats currently produced from users of our new system can be requested by emailing etrials@assistments.org. We will have new file types and a supplemental glossary available in the near future, with future data requests supported by the E-TRIALS app and OSF project pages.