linking capabilities with tablesets
gilles landais
gilles.landais at astro.unistra.fr
Thu Mar 28 15:15:29 CET 2024
Hi Markus
This is interesting, and I am completely open to putting the need in a broader
context (time series and other use cases).
At the same time, I would stress that we also have to take into account
that the data are subject to processing and that the registry records are
used by many agents (APIs, software, external platforms such as EUDAT).
So yes, we could ask for a splinter!
Sincerely,
Gilles
On 28/03/2024 at 10:27, Markus Demleitner via registry wrote:
> Dear Gilles,
>
> On Mon, Mar 25, 2024 at 02:50:27PM +0100, gilles landais via registry wrote:
>> Changing the VizieR granularity from catalogue to datasets is of course not
>> possible, for so many reasons that I will not detail them here.
> Well, this is not (necessarily) about VizieR, but only VizieR's
> interface to the VO Registry. I'd respectfully suggest that creating
> two or more registry records out of a single internal VizieR record
> might not cause too much upheaval in your code and resource
> management.
>
> But sure, it's easy for me to say that. I don't do that overly
> lightly though, but because I am really convinced that unless we
> completely overturn the way we are doing data discovery, everything
> *will* creak and groan as long as you squeeze -- in the Gaia example
> -- a source catalogue and a collection of time series into one
> resource record.
>
> This is not only a problem for the human-readable description as
> outlined in my previous mail. Let me in particular point out the
> product-type issue again:
> https://github.com/ivoa-std/VODataService/pull/1.
>
> Suppose a client discovers "There are time series in
> ivo://cds.vizier/i/355". How does it then decide that it has to query
> I/355/epphot rather than one of the other tables in the resource
> record? And if you merge the XP spectra into this record, too, even
> a link to the obscore table won't help any more.
>
> This problem immediately disappears if you create an extra resource
> record -- which, again, doesn't mean that VizieR has to change
> anything with the internal management of the resource, except perhaps
> some extra hint of the type "make a time series resource record with
> this table and capabilities A and B here".
>
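To make the "extra resource record" idea concrete, here is a minimal sketch of deriving several registry records from one internal catalogue entry while sharing the common metadata. Everything in it is an assumption made for illustration: the RegistryRecord class, the split_catalogue helper, the way the derived identifiers are built, the placeholder source string, and the table name I/355/gaiadr3 with its product type. Only ivo://cds.vizier/i/355 and I/355/epphot come from the discussion above.

```python
from dataclasses import dataclass


@dataclass
class RegistryRecord:
    ivoid: str          # identifier of the generated registry record
    source: str         # content/source, shared by all records of one catalogue
    product_type: str   # e.g. "time-series"
    tables: list        # names of the tables described by this record


def split_catalogue(ivoid, source, table_types):
    """Group tables by product type and emit one record per group."""
    groups = {}
    for table, product_type in table_types.items():
        groups.setdefault(product_type, []).append(table)
    # The derived identifier scheme is hypothetical, not a VizieR convention.
    return [
        RegistryRecord(f"{ivoid}/{ptype}", source, ptype, sorted(tables))
        for ptype, tables in sorted(groups.items())
    ]


records = split_catalogue(
    "ivo://cds.vizier/i/355",
    "reference article of I/355",       # placeholder, not a real bibcode
    {
        "I/355/gaiadr3": "catalogue",   # assumed table name and type
        "I/355/epphot": "time-series",
    },
)
for rec in records:
    print(rec.ivoid, rec.product_type, rec.tables)
```

The internal catalogue stays a single entity; only the export step fans it out, and every derived record carries the same source field.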
>
>> Just keep in mind that a catalogue corresponds to a reference article and can
>> contain several datasets (tables).
> Sure -- but there is nothing wrong with having multiple resource
> records with the same content/source field.
>
>> The logic to gather these datasets in a single entity that includes common
>> metadata is a valuable capability of the registry.
> ...and of course there's nothing wrong with sharing whatever metadata
> is common between these resource records.
>
>> Linking tablesets with their interfaces seems natural.
> I'll read this as "Linking tables with capabilities". And yes, it's
> certainly something that's natural if you have several pairs of
> tables and interfaces in the resource record that are strongly
> related within the pair but only very loosely across the pairs.
>
> Regrettably, to the rest of what we do in VO discovery at this time
> it's not natural. Before we go this way, we will need a good plan
> how these relationships will be exploited *in discovery*. Continuing
> the example above, suppose someone writes, with a future
> pyvo.registry Dataproducttype constraint:
>
>   rscs = pyvo.registry.search(
>     Dataproducttype('time-series'),
>     Keywords("Gaia"))
>
> What happens then? How would we implement
> rscs[0].look_for_a_time_series_at(ra, dec)?
>
> Of course, a similar challenge happens for my "Recent results from
> the Volute radio dish":
>
>   rscs = pyvo.registry.search(
>     Servicetype("scs"),
>     Keywords("Quasars"))
>
> What do I do then to actually query the table with the quasars rather
> than that of cataclysmic binaries?
>
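For illustration only, this is roughly what a client would have to do if a table-capability link existed. It is a sketch under stated assumptions, not a description of VODataService or pyVO: the Table and Capability classes, the table_name link field, the select_capabilities helper, the table names, and the example.org access URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    description: str


@dataclass
class Capability:
    standard: str      # e.g. "scs"
    access_url: str
    table_name: str    # hypothetical table-capability link


def select_capabilities(tables, capabilities, standard, keyword):
    """Return capabilities of one standard whose linked table matches keyword."""
    wanted = {
        t.name for t in tables if keyword.lower() in t.description.lower()
    }
    return [
        c for c in capabilities
        if c.standard == standard and c.table_name in wanted
    ]


# Mock content of a single "Volute radio dish" resource record.
tables = [
    Table("volute.quasars", "Quasars observed with the Volute radio dish"),
    Table("volute.cvs", "Cataclysmic binaries observed with the dish"),
]
capabilities = [
    Capability("scs", "http://example.org/volute/quasars/scs", "volute.quasars"),
    Capability("scs", "http://example.org/volute/cvs/scs", "volute.cvs"),
]
hits = select_capabilities(tables, capabilities, "scs", "Quasars")
print([c.access_url for c in hits])
```

Note that the keyword match has to run against per-table metadata inside the record; the link alone only tells the client which service belongs to which table once the right table is already known.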
>> Talking about protocol - ok, but I think that is another debate!
>> I agree with, and am in favor of, creating a new SCS that is more DALI-compliant
>> and serves tableset collections. It would be a good feature, maybe better
>> included in the registry. But it doesn't exist yet, and when it does, the
>> migration from an architecture with dataset-specific services to an
>> architecture that serves tableset collections (like TAP) will affect a
>> non-negligible part of the VO architecture.
> I am convinced that making the table-capability link work in practice
> is significantly more work than that -- but I'd be happy to be taught
> otherwise, of course.
>
>> For the record, we discovered this issue recently when we added notebooks in
>> order to teach our users pyvo.
> For the benefit of innocent bystanders: That's
> <https://github.com/astropy/pyvo/pull/505> and its immediate
> environment.
>
> I have to say that I think the solution we came up with in pyVO is
> not unreasonable (and neither is TOPCAT's handling of this kind of
> thing) given the metadata we have and the structure of *that*
> problem, and I can't really see how the extra table-capability link
> could improve on that.
>
> So, I think the immediate pain is somewhat relieved, in particular
> because clients notice something is amiss when they try to use
> multi-conesearch resources without another look.
>
> But I grant you that it does not solve the discovery problem I outlined
> with the Volute Radio Dish mock example. However, neither does the
> table-capability link, at least without serious changes to our
> Registry operations.
>
> Perhaps we should have a Registry running meeting on this?
>
> -- Markus
>