
Conversation

@pdtouch
Contributor

@pdtouch pdtouch commented Nov 28, 2023

Field Sequence Coordinates

Issue # 1414

Tripal Version: 4

Description :

This PR adds the files required for the Sequence Coordinates field to run in Tripal 4. Since the YML file for the field has been updated, this field should be active on the Gene page. The Tripal 3 version of this field can be viewed here:
https://github.com/tripal/tripal/tree/7.x-3.x/tripal_chado/includes/TripalFields/data__sequence_coordinates

Two formatters are present: one for a linear view of the coordinates and the other for a tabular view.

Testing?

  1. Build the docker image with the code in the respective directories:
tripal_chado/config/install/tripal.tripalfield_collection.genomic_chado.yml
tripal_chado/src/Plugin/Field/FieldType/ChadoSequenceCoordinatesDefault.php
tripal_chado/src/Plugin/Field/FieldFormatter/ChadoSequenceCoordinatesFormatterDefault.php
tripal_chado/src/Plugin/Field/FieldFormatter/ChadoSequenceCoordinatesFormatterTable.php
tripal_chado/src/Plugin/Field/FieldWidget/ChadoSequenceCoordinatesWidgetDefault.php

  1. Build and run docker image as required and verify Tripal can be accessed from its URL.

  2. Access the docker container using the branch name: tv4g1-issue1414-data__sequence_coordinates

cntr_br_name=tv4g1-issue1414-data__sequence_coordinates
docker exec -it $cntr_br_name /bin/bash
  1. Change the upload_max_filesize value in the settings file /usr/local/etc/php/php.ini to a higher value, say 12M.
    Then run service apache2 restart and exit from the container. (This step is not required if you only publish a gene imported via Tripal -> Data Loaders -> Chado GFF3 File Loader.)

  2. Log into Drupal/Tripal GUI and create Citrus Organism Tripal Content
    Ref: https://tripal.readthedocs.io/en/latest/user_guide/example_genomics/organisms.html

  3. Create Analysis content :
    Ref: https://tripal.readthedocs.io/en/latest/user_guide/example_genomics/analyses.html

  4. Structure -> Tripal Content Types -> Import type collection -> Genomic Content Types for viewing Gene content.

  5. Load Feature Data (gene and sequence information)
    Ref: https://tripal.readthedocs.io/en/latest/user_guide/example_genomics/genomes_genes.html
    (Tripal -> Data Loaders -> Chado GFF3 File Loader )

  6. Publish the gene (Content -> Tripal Content -> +Publish Tripal Content -> Chado Storage, Content Type: Gene -> Publish)

  7. Click on the Tripal Content -> gene that was just created.

  8. If the field is working properly, the gene will be viewable and editable, with the relevant coordinate information shown under Sequence Coordinates.

  9. Structure -> Tripal Content Types -> Gene -> Edit -> Manage Display -> Change Sequence Coordinates formatter to "Chado sequence coordinates table formatter" and see coordinates being displayed as a table.

@dsenalik dsenalik added the labels "Tripal 4" and "Group 1 - Tripal Content Types | Terms | Fields" (any issue relating to Tripal Content including types, terms, and fields) on Nov 29, 2023
@laceysanderson
Member

You will need to update your PR with the most recent 4.x, which includes a change to the YAML file organization for adding fields. Essentially, content types and fields are now broken down by category, so you will want to move the changes you originally made to default_chado.yml to genomic_chado.yml.

@laceysanderson laceysanderson changed the title from "Tripal 4 files for sequence_coordinates-field: tv4g1-issue1414-data__…" to "Tripal 4 sequence coordinate field" on Dec 3, 2023
@laceysanderson laceysanderson marked this pull request as draft December 3, 2023 15:31
@pdtouch pdtouch force-pushed the tv4g1-issue1414-data__sequence_coordinates branch from e884fe8 to b6e40af Compare December 4, 2023 16:53
@pdtouch
Contributor Author

pdtouch commented Dec 4, 2023

The YML file has been modified and the field files have been pushed.

@pdtouch
Contributor Author

pdtouch commented Jan 18, 2024

Table formatter for sequence coordinates has been added.

Contributor

@dsenalik dsenalik left a comment


This is great. But I have a few suggestions.

  • We need to merge the 4.x branch (I tested your PR after doing this locally)
  • For the list formatter, there is a trailing semicolon that shouldn't be there
    [screenshot: 20240123_sequence-coordinates-list]
    In Tripal 3, the list version of the formatter generates the displayed value with
    $locations[] = $srcfeature . ':' . $fmin . '..' . $fmax . $strand;
    and phase is ignored. You are including phase when present, which is good, and not including null values, also good.
    I think we just need to do something like what is in tripal_chado/src/api/tripal_chado.query.api.php:928: $sql = mb_substr($sql, 0, -2); // Get rid of the trailing comma & space.
  • For the table formatter, in Tripal 3 it looks like
    [screenshot: 20240124_tripal3-table]
    and here it looks like
    [screenshot: 20240123_sequence-coordinates-table]

Note that you are displaying the gene ID instead of the name of the sequence that it is located on. This needs to be changed to the name of the sequence that it is located on.
Update: This cannot currently be done, see issue #1767

  • same change for list formatter for the name of the sequence

  • My personal preference would be to have the "Strand" displayed before "Phase" - but that's just my preference.

  • Strand should be displaying "+", so something is wrong here (the gene is feature_id=3 for my test)

select F.*, T.name from feature F left join cvterm T on F.type_id=T.cvterm_id;
 feature_id | dbxref_id | organism_id |             name             |                         uniquename                          | residues | seqlen |           md5checksum            | type_id | is_analysis | is_obsolete |      timeaccessioned       |      timelastmodified      |      name       
------------+-----------+-------------+------------------------------+-------------------------------------------------------------+----------+--------+----------------------------------+---------+-------------+-------------+----------------------------+----------------------------+-----------------
          3 |           |           1 | orange1.1g015632m.g          | orange1.1g015632m.g                                         |          |      0 |                                  |     165 | f           | f           | 2024-01-24 02:41:43.475274 | 2024-01-24 02:41:43.475274 | gene

select * from featureloc where feature_id=3;
featureloc_id | feature_id | srcfeature_id |  fmin   | is_fmin_partial |  fmax   | is_fmax_partial | strand | phase | residue_info | locgroup | rank 
---------------+------------+---------------+---------+-----------------+---------+-----------------+--------+-------+--------------+----------+------
             2 |          3 |             1 | 4058459 | f               | 4062210 | f               |      1 |       |              |        0 |    0

Edit: You can fix this problem by quoting the match keys:

      $strand_symb = match( $strand_val ) {
        '-1' => '-',
        '1' => '+',
        default => 'unknown',
      };
  • for the case of a truly unknown strand, I would prefer a blank instead of unknown but again my personal preference.
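
The two formatter suggestions in this review can be sketched together. This is only an illustration, not the code in the PR: strand_symbol() and format_location() are hypothetical helper names, the '; ' separator and the "phase N" label are assumptions, and scaffold00001 is a made-up sequence name. The quoting fix presumably works because PHP's match compares with ===, so a strand value that arrives as the string '1' never matches an integer key 1. Normalizing to a string first handles both cases, and collecting the parts in an array and joining them with implode() avoids a trailing separator without needing mb_substr().

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: map a featureloc.strand value to a display symbol.
 * Returns a blank string for an unknown strand (the reviewer's preference).
 */
function strand_symbol(int|string|null $strand): string {
  // match() uses strict comparison (===), so cast to string first;
  // then integer 1 and string '1' are treated the same way.
  return match ($strand === null ? '' : (string) $strand) {
    '1' => '+',
    '-1' => '-',
    default => '',
  };
}

/**
 * Hypothetical helper: build the list-style location string.
 * The core follows the Tripal 3 pattern srcfeature:fmin..fmax<strand>;
 * appending phase when present is an assumption about the new format.
 */
function format_location(string $srcfeature, int $fmin, int $fmax,
    int|string|null $strand, ?int $phase = NULL): string {
  $parts = [$srcfeature . ':' . $fmin . '..' . $fmax . strand_symbol($strand)];
  if ($phase !== NULL) {
    // Null values are skipped, so no empty element is ever joined.
    $parts[] = 'phase ' . $phase;
  }
  // implode() only places the separator between elements,
  // so there is no trailing semicolon to strip afterwards.
  return implode('; ', $parts);
}

echo format_location('scaffold00001', 4058459, 4062210, '1'), PHP_EOL;
// scaffold00001:4058459..4062210+
echo format_location('scaffold00001', 4058459, 4062210, 1, 2), PHP_EOL;
// scaffold00001:4058459..4062210+; phase 2
```

The match expression requires PHP 8.0 or later.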

@spficklin
Member

Some comments on this field: the store action should be changed to read_value for the properties that link to fields in the featureloc table. Unless this field is meant to allow the user to manually set a location for a feature in the GUI, there's no need to store anything. If its purpose is simply to read from the featureloc table to pull the fmin, fmax, strand, and phase values, then read_value is what is needed.

@spficklin
Member

Also, once PR #1788 is merged, this field should not have the problem with a path looping back on the feature table. ChadoStorage should be able to manage it.

@laceysanderson
Member

@pdtouch the functionality Stephen mentioned is merged now so this field should work + be testable. Will you have time to update this PR based on the reviews?

@pdtouch
Contributor Author

pdtouch commented Feb 25, 2024 via email

Malladi added 2 commits February 26, 2024 14:31
…oordinates-field: tv4g1-issue1414-data__sequence_coordinates
…oordinates-field: tv4g1-issue1414-data__sequence_coordinates
@dsenalik dsenalik self-requested a review February 27, 2024 03:04
Contributor

@dsenalik dsenalik left a comment


With my suggested changes applied, I built a new docker image and followed the testing steps. (I loaded the FASTA, then the GFF3.)
Coordinates appear as they should:
[screenshot: 2024-02-26_sequence_coordinates]

Unfortunately, with the table formatter, the strand is not showing:
[screenshot: 2024-02-26_coordinates_table]

I can add a test value for phase
sitedb=> update featureloc set phase=2 where featureloc_id=2;
and I have added another code suggestion to fix the strand. With that, the table format works too:
[screenshot: 2024-02-26_coordinates_table_fixed]

…oordinates-field: tv4g1-issue1414-data__sequence_coordinates
@pdtouch pdtouch marked this pull request as ready for review February 27, 2024 04:05
Contributor

@dsenalik dsenalik left a comment


Excellent. With these changes, this is the exact code I tested, so I can now approve.

@dsenalik dsenalik merged commit 8dd76ff into tripal:4.x Feb 27, 2024

Labels

Group 1 - Tripal Content Types | Terms | Fields Any issue relating to Tripal Content including types, terms, and fields.


4 participants