[Dataset] Model document update
Markus Demleitner
msdemlei at ari.uni-heidelberg.de
Thu Mar 31 11:46:56 CEST 2016
Hi Mark,
On Tue, Mar 29, 2016 at 02:18:04PM -0400, CresitelloDittmar, Mark wrote:
> On Mon, Mar 21, 2016 at 5:54 AM, Markus Demleitner <
> msdemlei at ari.uni-heidelberg.de> wrote:
> Rather than putting some in here, and some in NDCube, I thought it
> best to keep it all together. Do you have another suggestion on how
> to handle this dependency while STC2 is being reviewed?
I'm afraid I don't have enough understanding of the technical nature
of the dependency -- why doesn't a simple reference to some internal
working draft work for the moment?
> re: Characterisation
> Section 3 is for the ObsDataset extension, and is, therefore, one of the
> specific Dataset types which is pulling in Characterisation. Other types
> (eg: SimDataset if it were cast into this framework), may or may not
> pull in Characterisation, and may or may not want to extend that to
> include other simulation specific characterisation.
>
> Perhaps ObsDataset should be moved into the Observation/Experiment
> package-model.
My main interest is that I can use Dataset without having to process
the entire STC model *in VO-DML*. So, my interest is that the
various VO-DML documents are independent. If that can be arranged,
I'm not so worried about the rest of the hierarchy.
> > (4) Talking about Curation.rights: This now has a multiplicity 0..1.
[...]
> > [my take: strike AccessRights and make Curation.rights point to
> > RightsType directly -- I don't think the potential benefit of having
> > this kind of thing machine-readable outweighs the cost in terms of
> > complexity]
> >
>
> When you say 'point to RightsType directly', that would not be possible as
> RightsType is a DataType.. it would be an attribute (as it was previously).
> I don't follow the 'machine-readable' part of your comment.
Well, by machine-readable I mean that, given DM attributes for the
time span of the access rights, a computer can, in principle, figure
out when a dataset will change its status (and with higher
multiplicity, even figure out when that will be). Since I don't see
a use case proportional to the added complexity, I indeed proposed
going back to having a plain atomic attribute.
> > (6) I'm not happy with the inflation of places where dataset
> > identifiers can stand. There's now Curation.publisherDID,
> > DataID.creatorDID, and DataID.datasetID. I don't think we're doing
> > our users a service by multiplying the concepts here, even though I
> > admit that each of these have a use case.
> >
> > I'd much rather see an Identifier type:
> >
> > Identifier.kind: (publisher, creator, persistent, ...)
> > Identifier.form: (doi, ivoid, generic-uri, ...)
> > Identifier.value: (well, you know).
[...]
> I haven't inflated anything. These are the same set which has been
> in the prior models. I do like the idea of using an Identifier
> type rather than anyURI. Should be more adaptable to evolving
> standards/forms. I would resist the 'kind' attribute. As I said
> above, these groupings are associated with the dataset by different
> parties and the distinction is pervasive across the existing
> Resource documents.
...and has led to much confusion. I frankly don't see that these
different parts of a DM instance will be maintained by different
people. And for the publisher it's much easier if they have one
central location for all the various identifiers -- which also helps
make their relationships clear. It also helps when, for instance,
the creator has assigned both a DOI and an IVOA creatorDID to a
dataset.
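
To make the single-type idea a bit more concrete, here is a minimal
Python sketch. It is an illustration only, not proposed model text:
the class and function names are made up, the enumeration members are
just the values listed in (6), and the identifier values are
placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    # Who the identifier belongs to; values taken from the list in (6).
    PUBLISHER = "publisher"
    CREATOR = "creator"
    PERSISTENT = "persistent"


class Form(Enum):
    # Syntactic form of the identifier; values taken from the list in (6).
    DOI = "doi"
    IVOID = "ivoid"
    GENERIC_URI = "generic-uri"


@dataclass(frozen=True)
class Identifier:
    kind: Kind
    form: Form
    value: str


# One central list per dataset. All values below are placeholders.
identifiers = [
    Identifier(Kind.CREATOR, Form.DOI, "doi:10.0000/example"),
    Identifier(Kind.CREATOR, Form.IVOID, "ivo://example.org/data#ds1"),
    Identifier(Kind.PUBLISHER, Form.IVOID, "ivo://example.org/pub?ds1"),
]


def by_kind(ids, kind):
    """Return the identifiers of the given kind, in their original order."""
    return [i for i in ids if i.kind is kind]
```

With this, the case of a creator assigning both a DOI and an IVOA
identifier is simply two entries of kind "creator" in the same list.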
> > (7) Publication
> >
> > Here, we should be explicit about what the publication reference is.
> > Much as I would like the bibcode to rule supreme forever, this is
> > almost certainly not what is going to happen. Either this gets a
> > form attribute as in (6) or we say "This should be a URI with a
> > scheme; use bibcode: for bibcodes, doi: for DOIs. In a pinch,
> > non-URI, freetext references are ok".
> >
>
> Isn't this what 2.9.1 says? Is there specific language you'd like changed
> there?
Well, perhaps something like:
This should be interpreted as a URI. Bibcodes should use the ad-hoc
scheme bibcode: (unless Alberto protests loudly), DOIs should use the
form with the doi: scheme. Freetext references are discouraged. If
they are used nevertheless, they must not start with "[a-zA-Z]+:" to
ensure they are not interpreted as URIs.
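
A client-side check along these lines could be sketched as follows in
Python. This is only an assumption about how one might implement the
rule: the function name and the returned labels are invented for
illustration, and the sample reference strings in the usage note are
placeholders.

```python
import re

# Letters followed by a colon at the very start: treat as a URI scheme.
_SCHEME = re.compile(r"^([a-zA-Z]+):")


def classify_reference(ref: str) -> str:
    """Classify a publication reference as bibcode, doi, uri or freetext."""
    match = _SCHEME.match(ref)
    if match is None:
        # No scheme-like prefix, so this is a (discouraged) freetext reference.
        return "freetext"
    scheme = match.group(1).lower()
    if scheme in ("bibcode", "doi"):
        return scheme
    # Some other scheme, e.g. http; still a URI.
    return "uri"
```

For example, classify_reference("doi:10.0000/example") yields "doi",
while a string without a leading scheme, such as "Example et al., in
prep.", falls through to "freetext".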
> > (10) Having said that, I think orcids will become a smash hit in the
> > near future if they aren't one already. Hence, I'd add
> >
> > identifier
> >
> > to the Party attributes. The stuff on defining identifiers as in (7)
> > applies here, too (if we go the URI way, we should say whether we
> > want orcid:0000-... or http://orcid.org/0000-...)
> >
> >
> Can you elaborate? Having an ID at the Party level could be
> confusing.. as an individual (me/you) could have different ID
> depending on the Role we are playing at the time. That is why I
> left them up at the Role extensions (Publisher.publisherID).
Well, our orcid would presumably be the same, no? And even if I have
a different id when I'm a publisher than when I'm a creator: I'm not
sure it helps if the attributes have different names. Isn't it
enough that the two items differ by role?
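
As a rough sketch of what I mean (Python, with invented class names
and a placeholder identifier value, so none of this is model text):
the identifier attribute lives on Party under one name, and the
distinction between publisher and creator is carried by the role.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Party:
    name: str
    # Same attribute name regardless of the role the party plays.
    identifier: Optional[str] = None


@dataclass(frozen=True)
class Role:
    kind: str  # e.g. "publisher" or "creator"
    party: Party


# Placeholder name and identifier, for illustration only.
me = Party("A. N. Author", "orcid:0000-0000-0000-0000")
as_creator = Role("creator", me)
as_publisher = Role("publisher", me)
```

If a party really needs a different id per role, that would be two
Party instances with different identifier values; the attribute name
stays the same either way.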
Cheers,
Markus