Read the Spanish version
Note: to open links in a new tab, use CTRL+click (Windows and Linux) or CMD+click (macOS).
Agricultural Data Template Creator (AgTC) is a set of functions written in Python that lets you create custom templates to collect data for most typical agronomic experiments. The resulting template can also be used to load a "field" in the Field Book app.
- Option 1: use a Jupyter Hub environment.
  - For Purdue University members only: Jupyter Hub is available at https://notebook.scholar.rcac.purdue.edu/. It requires BoilerKey Two-Factor Authentication.
- Option 2: use a local installation.
  - Install either JupyterLab or Jupyter Notebook directly, or install it through an environment manager such as conda, mamba, or pipenv.
- Option 1: use the requirements file.
  pip install -r requirements.txt
- Option 2: install the required libraries one by one with pip, the package installer for Python.
  pip install pyyaml
  pip install pandas
- Clone option
  - Open a new Jupyter Notebook terminal (New > Terminal).
  - Clone the GitHub repository:
    git clone https://github.com/Purdue-LuisVargas/AgTC.git
- Download option
  - Download AgTC from the GitHub repository: https://github.com/Purdue-LuisVargas/AgTC.
  - Unzip the entire folder, then copy the files (if running Jupyter locally) or upload them (if using the Jupyter Hub environment) to your Jupyter Notebook directory.
Upload the Initial Template file to the ./input folder. It must be a CSV file containing the information that changes from plot to plot and that you want to keep in the New Template, such as plot number, repetition number, genotype name, etc. The table below is an example of the columns and rows that an Initial Template could have:
| plot | genotype | repetition |
|---|---|---|
| 1 | PI594301 | 3 |
| 2 | LD-07-3395bf | 23 |
| 3 | SA1730464 | 18 |
| 4 | PI154189 | 14 |
| 5 | PI594451 | 2 |
| ... | ... | ... |
| 71 | CR16-0042 | 20 |
| 72 | SA1811280 | 25 |
| 73 | E19517GT | 24 |
| 74 | LD-07-3395bf | 23 |
| 75 | PI6548362 | 8 |
We encourage you to use controlled vocabularies and ontologies to name crop traits and variables. Learn about crop ontologies at https://cropontology.org/about.
The config.yml file is a YAML file that can be edited using Jupyter or a text editor. The file is divided into six configuration blocks, each identified by a name in uppercase letters. A BLOCK COLLECTION can have keys, values, and/or sequences.
For instance, the following data structure contains the most common items in the config.yml file.
TEMPLATE_INPUT:
  Folder : ./input/
  Sample_name:
    - A
    - B

Where:
TEMPLATE_INPUT = BLOCK COLLECTION
Folder = Key
Sample_name = Key
./input/ = Value
A, B = Sequences
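As a rough illustration of this terminology, the sketch below shows the Python containers that a YAML parser (for example PyYAML's `yaml.safe_load`) would be expected to produce for the structure above. The dictionary is written by hand here so the example runs without any file, and the `describe` helper is hypothetical, not part of AgTC.

```python
# Hand-written stand-in for the parsed example above (an assumption:
# a YAML parser is expected to return nested dicts, strings, and lists).
config = {
    "TEMPLATE_INPUT": {              # BLOCK COLLECTION -> dict
        "Folder": "./input/",        # Key with a single Value -> str
        "Sample_name": ["A", "B"],   # Key with Sequences -> list
    }
}


def describe(block):
    """Label each Key in a block as holding a Value or Sequences."""
    return {
        key: "Sequences" if isinstance(item, list) else "Value"
        for key, item in block.items()
    }


kinds = describe(config["TEMPLATE_INPUT"])
print(kinds)  # {'Folder': 'Value', 'Sample_name': 'Sequences'}
```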
The data structures in the config.yml file can have one of the following arrangements.
- Case 1
  BLOCK COLLECTION:
    Key: Value
- Case 2
  BLOCK COLLECTION:
    Key:
      - Sequences
- Case 3
  BLOCK COLLECTION:
    - Sequences

You need to update the config.yml file using information about the experiment.
General rules
- Changing the BLOCK COLLECTION names is NOT recommended. If you change them, you should update the names in the following line of code in the main.ipynb Jupyter file:
  functions.create_new_template('config.yml', 'TEMPLATE_INPUT', 'COLUMNS_TEMPLATE', 'NEW_COLUMNS', 'SAMPLES_PER_PLOT', 'SAMPLE_IDENTIFIER', 'TEMPLATE_OUTPUT')
- You can modify the name of any Key item. Additionally, for some BLOCK COLLECTION items, you should delete or add Key items depending on the experiment information.
- You should update Value items with the experiment information, but you cannot add more than one Value per Key.
- You should delete, add, or modify Sequences items as needed.
- Be sure to keep the correct indentation. Use the spacebar instead of the tab key.
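Two of these rules can be checked before running the notebook. The following is a minimal sketch, not part of AgTC: the helper names `lines_indented_with_tabs` and `missing_blocks` are hypothetical, and the block names are taken from the `create_new_template` call shown above.

```python
# Block names as passed to create_new_template in main.ipynb.
EXPECTED_BLOCKS = [
    "TEMPLATE_INPUT", "COLUMNS_TEMPLATE", "NEW_COLUMNS",
    "SAMPLES_PER_PLOT", "SAMPLE_IDENTIFIER", "TEMPLATE_OUTPUT",
]


def lines_indented_with_tabs(config_text):
    """Return 1-based numbers of lines whose leading whitespace has a tab."""
    bad = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        leading = line[: len(line) - len(line.lstrip())]
        if "\t" in leading:
            bad.append(number)
    return bad


def missing_blocks(config):
    """Return the expected block names absent from a parsed config dict."""
    return [name for name in EXPECTED_BLOCKS if name not in config]


# Hypothetical config text where the third line is indented with a tab.
sample_text = "NEW_COLUMNS:\n  Season : y22\n\tExperiment : ACRE-Biomass\n"
tab_lines = lines_indented_with_tabs(sample_text)
print(tab_lines)  # [3]

# Hypothetical parsed config that only defines one block.
absent = missing_blocks({"NEW_COLUMNS": {"Season": "y22"}})
print(absent)
```

If the block names were changed in config.yml, `EXPECTED_BLOCKS` would need the same change, mirroring the rule about updating main.ipynb.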
The TEMPLATE_INPUT block defines the path and name of the Initial Template file.
- You can modify the name of the Key items.
- You should update the Value items (e.g. trialInformation_PPAC_soybean_M2_y22.csv for the Input_template_file_name Key) with the name of your Initial Template file.
TEMPLATE_INPUT:
  Folder : ./input/
  Input_template_file_name : trialInformation_PPAC_soybean_M2_y22.csv

The COLUMNS_TEMPLATE block specifies the column names that will be selected from the Initial Template file.
- Even if you want to maintain all the columns from the Initial Template file, you should write their names as Sequences. Add or delete as many Sequences as necessary.
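The selection rule above can be pictured with the standard library alone. This is a sketch under assumptions: the CSV content is invented for illustration, `select_columns` is a hypothetical helper, and AgTC itself may implement the step differently (for example with pandas).

```python
import csv
import io

# Hypothetical Initial Template content; real files live in ./input/.
initial_template = io.StringIO(
    "Plot,Genotype,Repetition,Notes\n"
    "1,PI594301,3,border row\n"
    "2,LD-07-3395bf,23,\n"
)

# The Sequences listed under COLUMNS_TEMPLATE.
columns_template = ["Plot", "Repetition", "Genotype"]


def select_columns(rows, wanted):
    """Keep only the wanted columns, in the order they are listed."""
    rows = list(rows)
    if rows:
        missing = [name for name in wanted if name not in rows[0]]
        if missing:
            raise KeyError(f"columns not found in Initial Template: {missing}")
    return [{name: row[name] for name in wanted} for row in rows]


selected = select_columns(csv.DictReader(initial_template), columns_template)
print(selected[0])  # {'Plot': '1', 'Repetition': '3', 'Genotype': 'PI594301'}
```

Columns that are not listed (here the invented `Notes` column) are dropped, which is why every column to be kept has to appear as a Sequence.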
COLUMNS_TEMPLATE:
  - Plot
  - Repetition
  - Genotype

The NEW_COLUMNS block defines the names and values of the new columns that will be added to the New Template.
- You can add, delete or modify Key items based on your experiment information.
- You should update the Value items based on your experiment information.
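Conceptually, each Key becomes a column name and its Value is repeated on every row. The sketch below illustrates that idea with plain dictionaries; `add_new_columns` is a hypothetical helper and the rows are invented, so treat it as an illustration rather than AgTC's actual code.

```python
# Hypothetical rows already selected from the Initial Template.
rows = [
    {"Plot": "1", "Genotype": "PI594301"},
    {"Plot": "2", "Genotype": "LD-07-3395bf"},
]

# Key/Value pairs as they might appear under NEW_COLUMNS.
new_columns = {"Experiment": "ACRE-Biomass", "Season": "y22"}


def add_new_columns(rows, new_columns):
    """Return copies of the rows with every new column set to its Value."""
    return [{**row, **new_columns} for row in rows]


with_new = add_new_columns(rows, new_columns)
print(with_new[1])
# {'Plot': '2', 'Genotype': 'LD-07-3395bf', 'Experiment': 'ACRE-Biomass', 'Season': 'y22'}
```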
NEW_COLUMNS:
  Experiment : ACRE-Biomass
  Season : y22
  Environment : Early planting date
  Measurment : Plant height
  Sampling_identifier : Sampling-2

The SAMPLES_PER_PLOT block specifies the number of repetitions of a measurement on the same experimental unit (plot, pot, growth chamber, etc.). It creates one row for each subsample name that you indicate.
- Specify as Sequences the names that identify each subsample. Add, modify, or delete Sequences according to the characteristics of the measurement for which the template will be created.
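The row expansion described here amounts to repeating every plot once per subsample name. A minimal sketch, assuming rows are plain dictionaries and using a hypothetical `expand_samples` helper that is not part of AgTC:

```python
def expand_samples(rows, column_name, sample_names):
    """Repeat each row once per subsample, tagging it with the sample name."""
    return [
        {**row, column_name: sample}
        for row in rows
        for sample in sample_names
    ]


# Two hypothetical plots and four subsamples per plot.
plots = [{"Plot": "1"}, {"Plot": "2"}]
expanded = expand_samples(plots, "Sample_name", ["A", "B", "C", "D"])

print(len(expanded))  # 8
print(expanded[1])    # {'Plot': '1', 'Sample_name': 'B'}
```

With two plots and four Sequences the result has eight rows, so the number of Sequences directly controls how many measurement rows each experimental unit receives.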
SAMPLES_PER_PLOT:
  Sample_name:
    - A
    - B
    - C
    - D

The SAMPLE_IDENTIFIER block specifies the names of the columns whose values will be used to create a unique identifier for each subsample (row).
- You should NOT add or delete Key items, but you can change the Key name. The Key name is the name of the identifier column that the New Template will have.
- You should update the Sequences items according to the experiment information. Add or delete as many Sequences as necessary. Make sure that each Sequence name exists as a column name defined in the COLUMNS_TEMPLATE, NEW_COLUMNS, or SAMPLES_PER_PLOT blocks.
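One way to picture the identifier is as the listed column values joined in order. The sketch below assumes an underscore separator, which the source does not specify, and uses a hypothetical `add_identifier` helper; the real separator and implementation in AgTC may differ.

```python
def add_identifier(rows, id_column, source_columns, separator="_"):
    """Add id_column built from the values of source_columns, in order.

    The separator is an assumption made for this illustration.
    """
    result = []
    for row in rows:
        missing = [name for name in source_columns if name not in row]
        if missing:
            raise KeyError(f"columns not available for the identifier: {missing}")
        identifier = separator.join(str(row[name]) for name in source_columns)
        result.append({**row, id_column: identifier})
    return result


# Hypothetical row after the earlier steps have been applied.
sample_rows = [{"Plot": "1", "Sample_name": "A", "Season": "y22"}]
identified = add_identifier(sample_rows, "id_sample", ["Plot", "Sample_name", "Season"])
print(identified[0]["id_sample"])  # 1_A_y22
```

The `KeyError` branch reflects the rule above: a Sequence that does not match an existing column cannot contribute to the identifier.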
SAMPLE_IDENTIFIER:
  id_sample:
    - Plot
    - Sample_name
    - Experiment
    - Environment
    - Season
    - Measurment
    - Sampling_identifier

The TEMPLATE_OUTPUT block specifies the column names whose values will be used to create the New Template file name.
- You should update the Sequences items according to the experiment information. Add or delete as many Sequences as necessary. Make sure that each Sequence name exists as a Key in the NEW_COLUMNS block.
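The file name can be thought of as the NEW_COLUMNS values picked out by these Sequences. In the sketch below the underscore separator and the `.csv` extension are assumptions made for illustration, and `build_file_name` is a hypothetical helper rather than an AgTC function.

```python
def build_file_name(new_columns, output_keys, separator="_", extension=".csv"):
    """Join the NEW_COLUMNS values named in output_keys into a file name.

    Separator and extension are assumptions, not taken from AgTC.
    """
    missing = [key for key in output_keys if key not in new_columns]
    if missing:
        raise KeyError(f"not defined in NEW_COLUMNS: {missing}")
    return separator.join(str(new_columns[key]) for key in output_keys) + extension


# Key/Value pairs as they might appear under NEW_COLUMNS.
new_columns = {
    "Measurment": "Plant height",
    "Sampling_identifier": "Sampling-2",
    "Season": "y22",
}

file_name = build_file_name(new_columns, ["Measurment", "Sampling_identifier", "Season"])
print(file_name)  # Plant height_Sampling-2_y22.csv
```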
TEMPLATE_OUTPUT:
  - Measurment
  - Sampling_identifier
  - Experiment
  - Environment
  - Season

Open the main.ipynb file in Jupyter and execute the two blocks of instructions. You will find your New Template in the ./output folder.
Vargas-Rojas L, Ting T-C, Rainey KM, Reynolds M and Wang DR (2024) AgTC and AgETL: open-source tools to enhance data collection and management for plant science research. Front. Plant Sci. 15:1265073. doi: 10.3389/fpls.2024.1265073.
Luis Vargas Rojas - [email protected]
Purdue University, Wang Lab dianewanglab.com