[Dataset] Model document update

CresitelloDittmar, Mark mdittmar at cfa.harvard.edu
Fri Apr 1 14:18:18 CEST 2016


Markus,

On Thu, Mar 31, 2016 at 5:46 AM, Markus Demleitner <
msdemlei at ari.uni-heidelberg.de> wrote:

> Hi Mark,
>
> On Tue, Mar 29, 2016 at 02:18:04PM -0400, CresitelloDittmar, Mark wrote:
> > On Mon, Mar 21, 2016 at 5:54 AM, Markus Demleitner <
> > msdemlei at ari.uni-heidelberg.de> wrote:
> > Rather than putting some in here, and some in NDCube, I thought it
> > best to keep it all together.  Do you have another suggestion on how
> > to handle this dependency while STC2 is being reviewed?
>
> I'm afraid I don't have enough understanding of the technical nature
> of the dependency -- why doesn't a simple reference to some internal
> working draft work for the moment?
>
>
There isn't an internal working draft to refer to yet, only the UML and
vo-dml HTML on volute.  This is being worked on.

>
> > re: Characterisation
> > Section 3 is for the ObsDataset extension, and is, therefore, one of the
> > specific Dataset types which is pulling in Characterisation.  Other types
> > (eg: SimDataset if it were cast into this framework), may or may not
> > pull in Characterisation, and may or may not want to extend that to
> > include other simulation specific characterisation.
> >
> > Perhaps ObsDataset should be moved into the Observation/Experiment
> > package-model.
>
> My main interest is that I can use Dataset without having to process
> the entire STC model *in VO-DML*.  So, my interest is that the
> various VO-DML documents are independent.  If that can be arranged,
> I'm not so worried about the rest of the hierarchy.
>
>
From the VO-DML perspective, they are separate models.  Char and STC-2 are
imported models.


> > > (4) Talking about Curation.rights: This now has a multiplicity 0..1.
> [...]
> > > [my take: strike AccessRights and make Curation.rights point to
> > > RightsType directly -- I don't think the potential benefit of having
> > > this kind of thing machine-readable outweighs the cost in terms of
> > > complexity]
> > >
> >
> > When you say 'point to RightsType directly', that would not be possible as
> > RightsType is a DataType.. it would be an attribute (as it was previously).
> > I don't follow the 'machine-readable' part of your comment.
>
> Well, machine-readable means that having DM attributes for the time
> span of the access rights means that a computer can, in principle,
> figure out when a dataset will change its status (and with higher
> multiplicity, even figure out when that will be).  Since I don't see
> a use case proportional to the added complexity I indeed proposed
> going back to having a plain atomic attribute.
>
>
OK.. not sure where to go with this.  Will give it some thought.


> > > (6) I'm not happy with the inflation of places where dataset
> > > identifiers can stand.  There's now Curation.publisherDID,
> > > DataID.creatorDID, and  DataID.datasetID.  I don't think we're doing
> > > our users a service by multiplying the concepts here, even though I
> > > admit that each of these has a use case.
> > >
> > > I'd much rather see an Identifier type:
> > >
> > >   Identifier.kind: (publisher, creator, persistent, ...)
> > >   Identifier.form: (doi, ivoid, generic-uri,  ...)
> > >   Identifier.value: (well, you know).
> [...]
> > I haven't inflated anything.  These are the same set which has been
> > in the prior models.  I do like the idea of using an Identifier
> > type rather than anyURI.  Should be more adaptable to evolving
> > standards/forms.  I would resist the 'kind' attribute.  As I said
> > above, these groupings are associated with the dataset by different
> > parties and the distinction is pervasive across the existing
> > Resource documents.
>
> ...and has led to much confusion.  I frankly don't see that these
> different parts of a DM instance will be maintained by different
> people.  And for the publisher it's much easier if they have one
> central location for all the various identifiers -- which also helps
> making clear their relationships.  It also helps when, for instance,
> the creator has assigned both a DOI and an IVOA creatorDID to a
> dataset.
>
>
Yes.. but I think we cleared up the confusion about the different IDs
(which would not change if they were bundled in a central location).
I can see how a central location for the identifiers would be convenient
in some use cases, but in my opinion, it makes more sense to keep them
with the bundle of metadata assigned by the Party.
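
To make the proposal in (6) concrete, here is a minimal Python sketch of
an Identifier type with the three attributes listed there.  The enum and
helper names are hypothetical, the value lists are only the examples given
in the proposal (which leaves both lists open), and the identifier strings
are made-up placeholders.  It illustrates the 'central location' variant
only, not the per-Party grouping.

```python
from dataclasses import dataclass
from enum import Enum


class IdentifierKind(Enum):
    # Example values from the proposal; the list there is open-ended.
    PUBLISHER = "publisher"
    CREATOR = "creator"
    PERSISTENT = "persistent"


class IdentifierForm(Enum):
    # Example values from the proposal; the list there is open-ended.
    DOI = "doi"
    IVOID = "ivoid"
    GENERIC_URI = "generic-uri"


@dataclass(frozen=True)
class Identifier:
    kind: IdentifierKind
    form: IdentifierForm
    value: str


def identifiers_of_kind(identifiers, kind):
    """Return the identifiers that were assigned in the given role."""
    return [ident for ident in identifiers if ident.kind is kind]


# Placeholder identifiers, not real DOIs or IVOIDs.
dataset_identifiers = [
    Identifier(IdentifierKind.CREATOR, IdentifierForm.DOI, "doi:10.0000/example"),
    Identifier(IdentifierKind.CREATOR, IdentifierForm.IVOID, "ivo://example.org/creator#ds1"),
    Identifier(IdentifierKind.PUBLISHER, IdentifierForm.IVOID, "ivo://example.org/pub?ds1"),
]

creator_ids = identifiers_of_kind(dataset_identifiers, IdentifierKind.CREATOR)
```

In this layout a creator holding both a DOI and an IVOA creatorDID is just
two entries of the same kind; in the per-Party layout each would remain an
attribute of the metadata bundle assigned by that Party.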



>
> > > (7) Publication
> > >
> > > Here, we should be explicit about what the publication reference is.
> > > Much as I would like the bibcode to rule supreme forever, this is
> > > almost certainly not what is going to happen.  Either this gets a
> > > form attribute as in (6) or we say "This should be a URI with a
> > > scheme; use bibcode: for bibcodes, doi: for DOIs.  In a pinch,
> > > non-URI, freetext references are ok".
> > >
> >
> > Isn't this what 2.9.1 says?  Is there specific language you'd like changed
> > there?
>
> Well, perhaps something like:
>
>   This should be interpreted as a URI.  Bibcodes should use the ad-hoc
>   scheme bibcode: (unless Alberto protests loudly), DOIs should use the
>   form with the doi: scheme.  Freetext references are discouraged.  If
>   they are used nevertheless, they must not start with "[a-zA-Z]+:" to
>   ensure they are not interpreted as URIs.
>
OK..
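
As a rough check of how a client might apply the suggested wording, here
is a small Python sketch.  The function name and the returned labels are
hypothetical, the prefix pattern is read as 'ASCII letters followed by a
colon', and the case-insensitive match on bibcode: and doi: is an
assumption.

```python
import re

# Assumed reading of the quoted rule: ASCII letters followed by a colon
# at the start of the string mark the reference as a URI.
SCHEME_PREFIX = re.compile(r"[a-zA-Z]+:")


def classify_reference(ref: str) -> str:
    """Classify a publication reference as bibcode, doi, uri or freetext."""
    lowered = ref.lower()
    if lowered.startswith("bibcode:"):
        return "bibcode"
    if lowered.startswith("doi:"):
        return "doi"
    if SCHEME_PREFIX.match(ref):
        return "uri"
    # Anything without a leading scheme is treated as a freetext reference.
    return "freetext"
```

A freetext reference such as "Smith et al., in prep" falls through to the
last branch, while anything starting with letters and a colon is taken to
be a URI, which is why the wording forbids that prefix in freetext.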


> > > (10) Having said that, I think orcids will become a smash hit in the
> > > near future if they aren't one already.  Hence, I'd add
> > >
> > >   identifier
> > >
> > > to the Party attributes.  The stuff on defining identifiers as in (7)
> > > applies here, too (if we go the URI way, we should say whether we
> > > want orcid:0000-... or http://orcid.org/0000-...)
> > >
> > >
> > Can you elaborate?  Having an ID at the Party level could be
> > confusing.. as an individual (me/you) could have different ID
> > depending on the Role we are playing at the time.  That is why I
> > left them up at the Role extensions (Publisher.publisherID).
>
> Well, our orcid would presumably be the same, no?  And even if I have
> a different id when I'm a publisher than when I'm a creator: I'm not
> sure it helps if the attributes have different names.  Isn't it
> enough that the two items differ by role?
>
If a scientist takes an orcid for use in tagging publications/datasets to
associate them with the scientist, then wouldn't the orcid be associated
with the Scientist role? or Author?  They would map to the same person.
Would a Party want more than one orcid? to separate fields of study or
funding vs science?

I really think it is a Role thing.  I agree that the attribute name could
be consistent.  There is no need to retain "PublisherID" for an ID in
the "Publisher" class.  It could be moved to Role proper, but in this work,
there didn't seem to be many roles which require an ID, so I left it to the
extensions.  Perhaps you have other use cases in mind?

Would that work for you?

Mark