This repository was archived by the owner on Aug 6, 2022. It is now read-only.

@rrchai
Contributor

@rrchai rrchai commented Jun 24, 2021

@tschaffter
I pushed the code and updated the JSON files for seeding challenges. The following properties still need to be tested against the current code and the latest schema.

  • start/endDate: they have been changed to optional in the latest schema, since some challenges have no start/end dates
  • organizerIds, dataProviderIds, grantIds: check whether an empty array, [], is accepted by the latest services, since these properties are required

@tschaffter
Member

@rrchai Thanks! I fixed the service issue that you reported about empty array (see Sage-Bionetworks/rocc-service#113). I'll now try to find an elegant way to fix the seeding using RxJS.

@tschaffter
Member

tschaffter commented Jun 26, 2021

@rrchai Getting this error after your last commit:

POST http://localhost:4200/api/organizations?organizationId=applied-proteogenomics-organizational-learning-and-outcomes-network

EDIT: It's because an organization ID can be at most 60 characters long (see schema).
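
The length limit described above could be checked before the seed data is posted. This is only a minimal sketch, assuming the 60-character maximum mentioned in the comment; `MAX_ORG_ID_LENGTH` and `findTooLongIds` are hypothetical names that do not come from the ROCC code.

```typescript
// Assumption: organization IDs are capped at 60 characters (per the comment above).
const MAX_ORG_ID_LENGTH = 60;

// Hypothetical helper: returns the IDs that exceed the limit.
function findTooLongIds(
  ids: string[],
  maxLength: number = MAX_ORG_ID_LENGTH
): string[] {
  return ids.filter((id) => id.length > maxLength);
}

// The first ID is the one from the failing request; 'org-a' is a short ID.
const tooLongIds = findTooLongIds([
  'applied-proteogenomics-organizational-learning-and-outcomes-network',
  'org-a',
]);
console.log(tooLongIds);
```

Running a check like this over organizations.json would flag the offending entries before the service rejects them.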

@tschaffter
Member

tschaffter commented Jun 26, 2021

Temporarily renaming these organization IDs:

  • applied-proteogenomics-organizational-learning-and-outcomes-network => org-a
  • eunice-kennedy-shriver-national-institute-of-child-health-and-human-development => org-b

EDIT: No longer an issue after reverting @rrchai's commit (see below).

@tschaffter
Member

tschaffter commented Jun 26, 2021

@rrchai You replaced Challenge.organizationIds with Challenge.organizations, which is incorrect. I can't fix this because the value is now set to the name of the organization, while we want the organization ID.

EDIT: @rrchai I undid your last commit because it's introducing several issues. Let's discuss these issues next week so you can resubmit your changes without the issues.

@tschaffter
Member

Removed the following organizations from challenge.json because their IDs are too long (> 60 characters).

  • eunice-kennedy-shriver-national-institute-of-child-health-and-human-development
  • applied-proteogenomics-organizational-learning-and-outcomes-network

@tschaffter
Member

@rrchai Also remove the Person objects from the Challenge object, include the fake ids instead. Please use the default parameter names like organizerIds or grantIds instead of what I did previously for the grants (tmpGrantId). It will take care of managing the fake and real ids in the seeding script.
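
One way a seeding script might manage fake and real IDs is sketched below. This is an illustration under stated assumptions, not the script from this PR: the `Person` and `Challenge` shapes are reduced to a few properties, and `createPerson` is a hypothetical stand-in for the service call that returns the real ID.

```typescript
// Hypothetical, reduced shapes; the real ROCC schemas have more properties.
interface Person {
  id: string; // fake ID used only inside the seed files
  firstName: string;
}
interface Challenge {
  name: string;
  organizerIds: string[]; // fake IDs that reference Person.id
}

// Stand-in for the service call; assumption: the real code posts to the API
// and reads the ID from the response.
function createPerson(_request: Omit<Person, 'id'>, index: number): string {
  return `real-${index}`;
}

function seed(persons: Person[], challenges: Challenge[]): Challenge[] {
  // Remember which real ID was issued for each fake ID.
  const realIdByFakeId: Record<string, string> = {};
  persons.forEach((person, index) => {
    const { id: fakeId, ...createRequest } = person;
    realIdByFakeId[fakeId] = createPerson(createRequest, index);
  });

  // Replace the fake organizer IDs in each challenge with the real ones.
  return challenges.map((challenge) => ({
    ...challenge,
    organizerIds: challenge.organizerIds.map((fakeId) => {
      const realId = realIdByFakeId[fakeId];
      if (realId === undefined) {
        throw new Error(`Unknown organizer ID: ${fakeId}`);
      }
      return realId;
    }),
  }));
}

const seeded = seed(
  [{ id: 'fake-1', firstName: 'Ada' }],
  [{ name: 'Example Challenge', organizerIds: ['fake-1'] }]
);
console.log(seeded[0].organizerIds);
```

Keeping the default property names (organizerIds, grantIds) in the seed files means the same lookup can be reused for every reference type.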

@tschaffter tschaffter changed the title from "Seeding challenges data - tmp" to "Seeding DB with DREAM Challenges" Jun 30, 2021
@tschaffter
Member

@rrchai You can generate "fake" MongoDB ObjectIds with bson.objectid.ObjectId() from the Python package PyMongo.
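
If the placeholder IDs were generated in TypeScript instead, a string with the same shape as an ObjectId (24 hexadecimal characters) might be enough. The sketch below is not PyMongo's algorithm: a real ObjectId encodes a timestamp, a random value, and a counter, while the hypothetical `fakeObjectId` helper only produces random hex digits.

```typescript
// Hypothetical helper: returns 24 random lowercase hex characters.
// Assumption: the seed files only need IDs that look like ObjectIds.
function fakeObjectId(): string {
  let id = '';
  for (let i = 0; i < 24; i++) {
    id += Math.floor(Math.random() * 16).toString(16);
  }
  return id;
}

const exampleId = fakeObjectId();
console.log(exampleId);
```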

@rrchai
Contributor Author

rrchai commented Jun 30, 2021

@tschaffter I have updated files:

  • add organizerId to persons-fakeIds.json
  • add grantId to grants-fakeIds.json
  • update organizerIds and grantIds in challenge-fakeIds.json. The fake IDs in the challenge objects match the IDs in the person/grant objects.

@tschaffter
Member

@rrchai Can you remove the following files?

  • challenges-fakeIds.json
  • grants-fakeIds.json
  • organizations-fix.json
  • persons-fakeIds.json

And update the remaining files so they meet the requirements defined in my previous comment. Ultimately we should only have the following files, whose content matches the ROCC schemas, based either on the schemas {schema} (e.g. Tag, Organization) or on {schema}CreateRequest.

  • challenges.json
  • grants.json
  • organizations.json
  • persons.json
  • tags.json

@tschaffter tschaffter self-assigned this Jul 1, 2021
@tschaffter tschaffter marked this pull request as ready for review July 1, 2021 00:51
@tschaffter tschaffter linked an issue Jul 1, 2021 that may be closed by this pull request
@tschaffter
Member

@rrchai @vpchung FYI Here is a nice illustration of how Lodash can be used to manipulate objects. In this example, I use the Lodash omit method to remove the property id from an object in order to obtain {schema}CreateRequest objects that are ready to be posted. :)

@tschaffter tschaffter requested a review from vpchung July 1, 2021 00:59
@tschaffter
Member

@rrchai @vpchung This PR is ready for your review. The remaining issue "Error: bundle initial-es5 exceeded maximum budget." should be fixed soon.

@rrchai I can't assign you as a reviewer because you created this PR. Do you mind giving a 👍 to this comment if you approve this PR?

Member

@vpchung vpchung left a comment

Looks good! Just one design question about the future usage of Lodash.

import { mergeMap, tap } from 'rxjs/operators';
import { forkJoin, Observable, of } from 'rxjs';
import { map, mapTo, mergeMap, switchMap, tap } from 'rxjs/operators';
import { merge as _merge, omit as _omit } from 'lodash';
Member

Do we plan on updating Lodash as newer versions come out? According to the creator, omit will be deprecated in v5 due to poor performance. We may need to rewrite the code if so.

Member

@tschaffter tschaffter Jul 1, 2021

We will update libraries every month to please the Dependabot God, unless the update of a given library is not trivial. I learned about the existence of @babel/plugin-proposal-object-rest-spread by following the link you shared. I don't find this approach suitable here as it depends on the order of the object properties. I decided to go with _.pick and specify the properties to keep (array). I wasn't sure how _.pick handles an array that includes property names that are not in a given object; for example, Challenge.endDate is optional. _.pick behaves nicely: it acts as a filter and does not complain if a given object does not have the property. I'm still, and always, open to alternatives that you would like to share.
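
The filtering behavior described above can be illustrated without Lodash. The `pick` function below is a hand-written sketch that mirrors the behavior as described in this comment (listed properties that the object lacks are skipped); it is not Lodash's implementation, and the `Challenge` shape is a reduced, hypothetical one.

```typescript
// Hypothetical, reduced shape with optional dates.
interface Challenge {
  id: string;
  name: string;
  startDate?: string;
  endDate?: string;
}

// Sketch of a pick-like helper: copies only the listed keys that exist on obj.
function pick<T extends object, K extends keyof T>(
  obj: T,
  keys: readonly K[]
): Pick<T, K> {
  const result = {} as Pick<T, K>;
  for (const key of keys) {
    if (key in obj) {
      result[key] = obj[key];
    }
  }
  return result;
}

// endDate is listed but absent from the object, so it is silently skipped.
const challenge: Challenge = { id: 'fake-1', name: 'Example', startDate: '2021-07-01' };
const picked = pick(challenge, ['name', 'startDate', 'endDate']);
console.log(Object.keys(picked));
```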

Member

I like the _.pick approach as well! That being said, it may become quite tedious if numerous properties are needed... do you think that could happen? If so, we could always write a util function that mimics _.omit? I came across this article here that showcases different alternatives and their performances.

Member

@tschaffter tschaffter Jul 1, 2021

Since we are still in the early stages of development, it's likely that object properties will be updated. It's also more likely that we will add or rename properties, which would require us to update the seeding code, than remove properties from a {schema} object to get {schema}CreateResponse. Also, one thing that I don't like about _.pick is that the behavior I described earlier can lead to a point where the list of properties specified is out of sync with the properties of {schema}CreateResponse.

We can throw away the pickBy option as it is too slow and doesn’t provide any benefits.

Assuming that pick has similar performance, let's use one of the alternatives proposed in the article. I just updated the code to use an omit function based on ESRest + JavaScript delete. This implementation does not support removing flattened paths, which should be fine for now.
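
An omit helper in the spirit of "rest/spread copy plus delete" might look like the sketch below. It is an assumption about the general technique, not the code merged in this PR; the `Organization` shape is hypothetical, and only top-level keys are handled.

```typescript
// Hypothetical, reduced shape.
interface Organization {
  id: string;
  name: string;
}

// Sketch: shallow-copy the object, then delete the unwanted top-level keys.
// Nested ("flattened") paths such as 'a.b' are not supported.
function omit<T extends object, K extends keyof T>(
  obj: T,
  keys: readonly K[]
): Omit<T, K> {
  const copy: Partial<T> = { ...obj };
  for (const key of keys) {
    delete copy[key];
  }
  return copy as unknown as Omit<T, K>;
}

// Drop the fake id to obtain an object shaped like a create request.
const organization: Organization = { id: 'fake-1', name: 'Example Org' };
const createRequest = omit(organization, ['id']);
console.log(createRequest);
```

Because the copy is shallow, the original object keeps its id, so it can still be used to map fake IDs to real ones after the request is sent.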

Contributor Author

@rrchai rrchai left a comment

Looks great, and I tested that it works! @tschaffter

@tschaffter tschaffter merged commit 0378823 into main Jul 1, 2021
@tschaffter tschaffter deleted the seeding-challenge-data branch July 1, 2021 20:25
4 participants