This program was designed specifically for the ASOSU 2022 and 2023 elections, where ballot data was ingested from Qualtrics. It validates the data and creates a CSV file that can be used to determine the winners of the election.
While I am confident that the program operates correctly, this is not a production-ready system.
The program has three main steps:
- Check against registrar data to see if the voter is registered as an ASOSU student (a student at the Corvallis Campus)
- Check against previously submitted ballots to see if the voter has already submitted a ballot
- Select the winners for incentives as required. (Note: as a holdover from last year, this is still titled step 4.)
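The first two steps above can be sketched roughly as follows. This is a minimal illustration, not the program's actual code: the function name `validate_ballots`, the `onid` field, and the lowercase normalization are all assumptions made for the example.

```python
def validate_ballots(ballots, registered_onids, already_voted):
    """Split ballots into valid and invalid ones (hypothetical helper).

    ballots: list of dicts with an "onid" key (assumed field name).
    registered_onids: set of lowercase ONIDs from the registrar data.
    already_voted: iterable of ONIDs that voted on earlier days.
    """
    valid, invalid = [], []
    seen = set(already_voted)
    for ballot in ballots:
        onid = ballot["onid"].strip().lower()
        if onid not in registered_onids:
            # Step 1: the voter is not in the registrar data.
            invalid.append((ballot, "not registered"))
        elif onid in seen:
            # Step 2: the voter already submitted a ballot.
            invalid.append((ballot, "already voted"))
        else:
            seen.add(onid)
            valid.append(ballot)
    return valid, invalid
```

Processing ballots in submission order means the earliest ballot from each voter is the one that counts in this sketch; whether the real program does the same is not stated here.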
program <day> <csv_of_ballots> - Processes the data for the given day.
program <start_day> <end_day> <csv_of_ballots> - Processes the data for the given range of days (inclusive).
./scripts/run.sh will run the program and accepts the same arguments.
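To make the two invocation forms concrete, here is a small sketch of how the arguments could be interpreted. The function `parse_args` is hypothetical and only mirrors the usage lines above; it is not taken from the program itself.

```python
def parse_args(argv):
    """Return (start_day, end_day, ballots_csv) from the arguments.

    Hypothetical helper: a single day is treated as a range of one day.
    """
    if len(argv) == 2:
        day = int(argv[0])
        return day, day, argv[1]
    if len(argv) == 3:
        start, end = int(argv[0]), int(argv[1])
        if start > end:
            raise ValueError("start_day must not exceed end_day")
        return start, end, argv[2]
    raise ValueError("expected <day> <csv> or <start_day> <end_day> <csv>")
```

For example, `parse_args(["3", "ballots.csv"])` and `parse_args(["3", "3", "ballots.csv"])` describe the same single-day run in this sketch.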
This program expects a folder named data containing the following files:
- seed.txt - a single line of text containing the seed for the random number generator used to select the winners
- validVoters.csv - a CSV file containing the valid voters in the form of FIRST_NAME,LAST_NAME,OSU_EMAIL,ONID_ID
- ballots - a folder containing all the ballots submitted by the voters. It is expected that the files contain data for the day listed as well as all days prior. The format is too long to document here and must be customized for each election/ballot.
- alreadyVoted - a folder containing files in the form of whatever-<data_start_day>-<data_end_day>.csv which list all the voters who have already voted on a given day. This data is deduped so there is no harm in having overlapping data. One ONID per line.
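Because the alreadyVoted data is deduped, overlapping files are harmless. One way to picture this is to read every file in the folder into a single set. The sketch below assumes one ONID per line with no header row; `load_already_voted` is a hypothetical name, not part of the program.

```python
from pathlib import Path


def load_already_voted(folder):
    """Collect the ONIDs from every CSV file in the folder into one set."""
    onids = set()
    for path in sorted(Path(folder).glob("*.csv")):
        for line in path.read_text().splitlines():
            onid = line.strip()
            if onid:  # skip blank lines
                onids.add(onid)
    return onids
```

Using a set means a voter listed in both whatever-0-1.csv and whatever-1-2.csv is still counted once.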
At each step, the program outputs data about what it did. The first number in each filename is the step it corresponds to. The text that follows gives the type of data, and the final two numbers are the start day and end day (inclusive) of the data. For example, 1-invalid-3-5.csv contains the invalid data from step one for days 3, 4, and 5.
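The naming scheme can be decoded mechanically. The following sketch splits an output filename into its parts; `parse_output_name` and the regular expression are illustrative assumptions based only on the description above.

```python
import re

# <step>-<type>-<start_day>-<end_day>.csv, as described above.
OUTPUT_NAME = re.compile(r"^(\d+)-(.+)-(\d+)-(\d+)\.csv$")


def parse_output_name(filename):
    """Return (step, kind, start_day, end_day) for an output filename."""
    match = OUTPUT_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected output filename: {filename}")
    step, kind, start, end = match.groups()
    return int(step), kind, int(start), int(end)


print(parse_output_name("1-invalid-3-5.csv"))  # (1, 'invalid', 3, 5)
```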
Each step also outputs a summary with the number of valid, invalid, and total votes processed, as well as any additional log information that might be useful.
Step 2 outputs an additional file that can be copied directly into the alreadyVoted folder of the input. Step 4 outputs an additional file providing the ONID IDs of the winners.
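Drawing winners from a fixed seed makes the selection repeatable: anyone with the same seed and the same list of valid voters gets the same winners. The sketch below shows the idea with Python's `random` module. It is an assumption-laden illustration (`pick_winners` is a hypothetical name), and it will not reproduce the winners chosen by the actual program, whose random number generator is not described here.

```python
import random


def pick_winners(valid_onids, seed, count):
    """Pick up to `count` distinct winners, reproducibly for a given seed."""
    rng = random.Random(seed)
    # Sort the deduplicated pool so the draw does not depend on input order.
    pool = sorted(set(valid_onids))
    return rng.sample(pool, min(count, len(pool)))
```

Sorting before sampling is a deliberate choice in this sketch: without it, the same voters listed in a different order could yield different winners.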
The results folder contains the results for each race. The results are only accurate when the program is run over all the data, not one day at a time.
- Download data into data/ballots
- rm -r output/
- ./scripts/run.sh <day> <data_filename>
- Copy the script output into the output folder with filename rawOutput.log
- Upload to box: data in the root of output goes into the folder for each day, and the incentives winners go into the Incentives folder.
- Update ballot count spreadsheet and confirm that the numbers are correct
- Copy alreadyVoted-<day>.csv into data/alreadyVoted folder
- Repeat for all days with new data that is complete (no partial days)
- ./scripts/run.sh 0 100 <data_filename> to get accurate results
- Upload output/results to box
./scripts/notVoted.sh <ballot_file> - Creates a file (output/haveNotVoted.csv) that lists all the voters who have not voted.
./scripts/greek.sh <ballot_file> - Creates a file (output/greek-info.csv) that lists all the fraternity and sorority life organizations and their respective voter turnouts.
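Conceptually, the not-voted list is the set of valid voters minus the set of voters who submitted a ballot. The sketch below illustrates that difference using the validVoters.csv column order (FIRST_NAME,LAST_NAME,OSU_EMAIL,ONID_ID). It assumes no header row, and `have_not_voted` is a hypothetical name, not what notVoted.sh actually runs.

```python
import csv
import io


def have_not_voted(valid_voters_csv, voted_onids):
    """Return the validVoters rows whose ONID is not in voted_onids.

    valid_voters_csv: the file contents as a string (assumed headerless).
    """
    reader = csv.reader(io.StringIO(valid_voters_csv))
    # ONID_ID is the fourth column, so index 3.
    return [row for row in reader if len(row) == 4 and row[3] not in voted_onids]
```

Rows that do not have exactly four columns are skipped in this sketch rather than reported.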