This repository contains the code and data for the paper "Substance Beats Style: Why Beginning Students Fail to Code with LLMs".
- We first apply a tagging process to every prompt in the first/last success/failure subsets of StudentEval. For example, given the prompt "Outputs if the number input is even", the tagged prompt becomes "$Returns:prints$ if the number $parameter:input$ is even". This process is semi-automated:
  a. `mutated_dataset_builder/main.py` is a rule-based script that creates a preliminary tagged dataset, `nuprl-staging/studenteval_tagged_prompts`.
  b. We transform this dataset into the file `tagged_prompts_for_edits` by running `json_to_yaml.py`, then manually edit this file.
  c. We map these edits back to a new split of the tagged dataset.
- We then run the bash script `bin/prepare_subst.sh` on the validated dataset to get various splits of substituted data, based on the target word and the replacement value. First create a directory `subst_experiments`, where the datasets will be stored in JSONL format. Usage:
  `./bin/prepare_subst.sh CATEGORY ORIGINAL`
  The first argument is the category; the second argument is the replacement value. For example,
  `./bin/prepare_subst.sh "return" "output"`
  replaces all occurrences of words tagged with the category 'return' with the correct word variation of 'output' (i.e. returns-outputs, returning-outputting). Another example:
  `./bin/prepare_subst.sh "loop through" "go through"`
- We run the generation script `bin/run_generation.sh` on the substituted datasets. First create a directory `generation_experiments` to store the generated model results. For example,
  `./bin/run_generation.sh "return" "output"`
  creates a directory `return_output` with a subdirectory `completions_jsons` that stores all the `.json.gz` files.
- Follow the instructions in `eval_scripts/README.md` to get stderr/stdout outputs for the StudentEval completions, saved as a dataset.
- Use `student_trajectories/parse_graph.py` to turn the dataset into student trajectory graphs (saved as `.yaml` files).
- Use `student_trajectories/alternating_automata.py` to turn the graphs into alternating automata (saved as `.dot` files), which can be rendered into viewable `.pdf` files using Graphviz. For an interactive `.html` prompt, use `student_trajectories/plot_graph.py`.
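
To make the tagging and substitution steps above more concrete, here is a minimal Python sketch. It is not the code in `mutated_dataset_builder/` or `bin/prepare_subst.sh`. It assumes the tag syntax is `$category:word$`, as inferred from the example prompt. The names `find_tags`, `substitute`, and `OUTPUT_FORMS` are hypothetical, and so is the hand-written inflection table. Dropping the markup of tags in other categories is also an assumption.

```python
import re

# Assumed tag syntax, inferred from the README example: $category:word$
TAG = re.compile(r"\$([^:$]+):([^$]+)\$")

# Hypothetical inflection table for the replacement word "output".
# Keys are the suffix left after removing the base form from the tagged word.
OUTPUT_FORMS = {"": "output", "s": "outputs", "ing": "outputting", "ed": "outputted"}


def find_tags(tagged_prompt):
    """Return (category, word) pairs for every tag in the prompt."""
    return TAG.findall(tagged_prompt)


def substitute(tagged_prompt, category, base, forms):
    """Replace words tagged with `category` by the matching form in `forms`.

    Tags in other categories are reduced to their plain word (an assumption).
    """
    def repl(match):
        tag_category, word = match.group(1), match.group(2)
        if tag_category != category:
            return word  # other categories: drop the tag markup only
        lower = word.lower()
        suffix = lower[len(base):] if lower.startswith(base) else None
        # Unknown variations are left as the original word.
        return forms.get(suffix, word)

    return TAG.sub(repl, tagged_prompt)


if __name__ == "__main__":
    print(find_tags("$Returns:prints$ if the number $parameter:input$ is even"))
    prompt = "This function $return:returns$ the total of $parameter:lst$"
    print(substitute(prompt, "return", "return", OUTPUT_FORMS))
```

A lookup table is used here instead of suffix rules because forms such as "outputting" double the final consonant, which naive suffix copying would get wrong.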
Copilot used in this project.
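
For the Graphviz rendering step mentioned above, the following Python sketch shows what writing a small `.dot` file can look like. It is a generic illustration and not the output format of `student_trajectories/alternating_automata.py`. The edge list, node names, and the helper `to_dot` are hypothetical.

```python
def to_dot(edges, name="trajectory"):
    """Render a list of (source, target, label) edges as Graphviz DOT text."""
    lines = [f"digraph {name} {{"]
    for source, target, label in edges:
        lines.append(f'  "{source}" -> "{target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical trajectory: two prompt attempts joined by one edit.
    edges = [("attempt_1", "attempt_2", "edit")]
    with open("example.dot", "w") as handle:
        handle.write(to_dot(edges) + "\n")
```

With Graphviz installed, a file like this can be rendered with `dot -Tpdf example.dot -o example.pdf`.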