
@mbollmann
Member

@mbollmann mbollmann commented Oct 22, 2021

Many PACLIC proceedings have URLs in their <doi> entry in the XML, not DOIs. This fixes that.

Technically, the current entries are Handle URLs, not DOI URLs, but from spot-checking it seems that they are actually valid DOIs (DOI uses Handle internally).

Compare, for example:

(h/t https://twitter.com/gchrupala/status/1451552455519506448)

@mbollmann mbollmann requested a review from a team October 22, 2021 17:25
@github-actions

Build successful. You can preview it here: https://preview.aclanthology.org/fix-paclic-dois
This preview will be removed when the branch is merged.

@akoehn
Member

akoehn commented Oct 22, 2021

Let's also adjust the schema to catch this kind of mistake in the future.

Member

@akoehn akoehn left a comment

First, I made this "request changes" because I think some discussion is needed.

Yes, the DOI service resolves IDs such as 2065/12156. No, this is not a valid DOI. The DOI handbook states:

The DOI prefix shall be composed of a directory indicator followed by a registrant code. These two components shall be separated by a full stop (period).
The directory indicator shall be "10"

In other words: every DOI must start with the character sequence "10." (including the period).
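
As an illustration of the rule quoted above, here is a minimal Python sketch of a prefix check. The function name `looks_like_doi` and the regular expression are assumptions made for this example; the pattern is a simplification (a numeric registrant code, optionally subdivided by further periods), not a full implementation of the DOI syntax, and it is not the check used in the Anthology code.

```python
import re

# Simplified assumption: "10." + numeric registrant code (optionally
# subdivided by periods) + "/" + a non-empty suffix without whitespace.
DOI_PATTERN = re.compile(r"^10\.[0-9]+(\.[0-9]+)*/\S+$")


def looks_like_doi(value: str) -> bool:
    """Return True if the value has the basic shape of a DOI name."""
    return DOI_PATTERN.match(value.strip()) is not None


if __name__ == "__main__":
    # 10.1000/182 is the DOI of the DOI Handbook itself.
    print(looks_like_doi("10.1000/182"))
    # The handle ID mentioned in this thread lacks the "10." prefix.
    print(looks_like_doi("2065/12156"))
```

A check like this only tests the shape of the string; it says nothing about whether the identifier actually resolves.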

Someone obviously put in the handle URI for these papers. We could (1) keep them in the doi field because we know they (currently!) resolve even though they are not DOIs, (2) find out whether they have proper DOIs and insert those, or (3) remove the DOI fields and maybe add the handle URI as some other field.

It is too late for me to form a definitive opinion, but I would be hesitant to put a non-DOI identifier into a DOI field. DOIs are specifically made to be exact and we would water that down.

@mbollmann
Member Author

Yes, I was about to write the same thing while you were posting this, @akoehn. :)

Funnily enough, even the currently generated nonsense link on the website resolves:

In that case, I'm not sure we currently have a mechanism to handle these cases. I think the DOI field is currently the only way to link to an external website like that.

@mjpost
Member

mjpost commented Nov 21, 2021

So are there valid DOIs for these, then?

@akoehn
Member

akoehn commented Nov 21, 2021 via email

@mjpost
Member

mjpost commented Nov 22, 2021

I think we should

  1. Move the invalid DOIs to a new field, say <handle>
  2. Display them separately

We can split this into two steps, for example doing (1) in this PR and then adding (2) later when someone has time.
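
Step (1) could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the `<handle>` element name comes from the suggestion above, the function name `move_non_dois_to_handle` and the sample XML are made up for this example, and the "starts with 10." test is a deliberately crude stand-in for real DOI validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real Anthology XML has more structure.
SAMPLE = """<volume>
  <paper id="1"><doi>http://hdl.handle.net/2065/12156</doi></paper>
  <paper id="2"><doi>10.1000/182</doi></paper>
</volume>"""


def move_non_dois_to_handle(root: ET.Element) -> int:
    """Rename every <doi> whose text does not start with "10." to <handle>.

    Returns the number of elements that were renamed.
    """
    moved = 0
    # Materialise the list first so renaming does not disturb iteration.
    for elem in list(root.iter("doi")):
        text = (elem.text or "").strip()
        if not text.startswith("10."):
            elem.tag = "handle"  # hypothetical new field name
            moved += 1
    return moved


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(move_non_dois_to_handle(root))
    print(ET.tostring(root, encoding="unicode"))
```

Renaming the element in place keeps the original value untouched, so step (2), displaying handles separately, can be added later without another data migration.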

Reading the doc raises a separate question: we generate our DOI suffixes as v1/{anth_id}. Why the v1/? I'd suggest we get rid of it, but regenerating the old ones would cost $18,676, and there doesn't seem to be a compelling reason to change it moving forward, apart from aesthetics, which has to be balanced against consistency.

@mjpost
Member

mjpost commented Nov 22, 2021

We could also do something like <doi type="hdl"> for the handle.net instances, using doi here as a generic term, defaulting to the DOI brand.
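
To make the attribute-based variant concrete, here is a small Python sketch of how a link could be built from such an element. The `type` attribute and its `hdl` value follow the suggestion above; the function name `identifier_url`, the `RESOLVERS` table, and the assumption that the element holds a bare identifier rather than a full URL are all hypothetical choices for this example.

```python
import xml.etree.ElementTree as ET

# Public resolver base URLs for the two identifier kinds.
RESOLVERS = {
    "doi": "https://doi.org/",
    "hdl": "https://hdl.handle.net/",
}


def identifier_url(elem: ET.Element) -> str:
    """Build a resolver URL; a missing type attribute defaults to "doi"."""
    kind = elem.get("type", "doi")
    return RESOLVERS[kind] + (elem.text or "").strip()


if __name__ == "__main__":
    print(identifier_url(ET.fromstring('<doi type="hdl">2065/12156</doi>')))
    print(identifier_url(ET.fromstring("<doi>10.1000/182</doi>")))
```

With this layout, existing entries without a `type` attribute keep their current behaviour, and only the handle.net entries need to be annotated.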

@mbollmann
Member Author

Making a new field would be very little work; it just produces extra code for what is currently a rare exception.

The <doi type="hdl"> way would also work, but I find it quite ironic given that in reality, "DOI" is a subtype of "Handle". :)
