From patrick.dowler at nrc-cnrc.gc.ca Mon May 4 22:56:08 2015
From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler)
Date: Mon, 4 May 2015 13:56:08 -0700
Subject: prototype: scalable VOSI-tables-1.1
Message-ID: <5547DCE8.1020004@nrc-cnrc.gc.ca>

I have implemented a prototype VOSI-tables-1.1 resource to deal with the issues that came up in the TAP discussion in Banff: some services have many tables and many columns and the top-level VOSI-tables document can be very large.

The basic approach is to define a RESTful resource tree following the VODataService model: tableset, schema, table. All the example URLs below work, but note that the table names are fully qualified because that is how it works in TAP right now. There are separate discussions on that so it isn't really relevant here.

The VOSI-tables-1.0 resource returns a <tableset> document, so the "tables" resource name really means "the tableset". This remains unchanged:

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables

Any schema name can be used as a child; this returns a <schema> document:

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA

Any table name found in a schema can be used as a child; this returns a <table> document:

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables

There is no REST binding to get a single <column> within a <table> as I don't see the use case for that.

The above by itself is nice but doesn't solve the scalability problem. For that we need to ask for less than "all the details" of VOSI-tables-1.0 using the "detail" parameter (name taken from the VOSpace spec as it seems to be the same sort of thing: ask for a certain amount of detail). The default (as the above URLs show) is to get all the details below the resource. The detail parameter can take two values: schema, table.

To get a <tableset> document with a list of <schema> only:

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables?detail=schema

To get a <tableset> document with <schema> and <table> (no <column>):

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables?detail=table

Somewhat useless*: get a <schema> document without the <table>(s):

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA?detail=schema

To get a <schema> document with only the list of <table> (no <column>):

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA?detail=table

To get a <table> document (a single table):

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables

Somewhat useless*: get a <table> document without the <column>(s):

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables?detail=table

* it was simpler to implement this response than to come up with some logic for denying such a request; it can check the existence of the schema or table.

Given the size of the documents when detail=schema or detail=table is used (very small), I don't see an obvious need for a pagination mechanism. VOSpace has such a feature so we could model something on that but I'd like to see the practical need.

These would all return a 404 response code:

testSchemaNotFound:
http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/no_such_schema

testTableNotFound:
http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/no_such_table

testExtraPathComponentNotFound (trying to get a <column>):
http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/tables/table_name

The base resource complies with VOSI-tables-1.0 so 1.0 clients could use a VOSI-tables-1.1 resource. The only outstanding issue is how would a service force clients to use 1.1 and forbid them from getting the whole <tableset> document from the base resource? That is already an issue and right now I think the only compliant choice (for TAPvizier, for example) would be to respond with a 403 "Forbidden": only authorized callers can get the whole tableset in full detail. This is already how some services handle things like a GET to a UWS job-list. I think any other kind of response to indicate that some of the requests are not supported would mess with backwards compatibility. Making the default "detail" not be "all" would likewise tread on compatibility. Obviously, the TAP capabilities would express that the VOSI-tables resource implemented 1.0 or 1.1 and I don't see any obvious reason to require a specific one in TAP-1.1 (at least).

The change to the VOSI-tables XSD is to add <schema> and <table>
as valid document root elements with types taken from VODataService-1.1 xsd. -- Patrick Dowler Canadian Astronomy Data Centre National Research Council Canada 5071 West Saanich Road Victoria, BC V9E 2E7 250-363-0044 (office) 250-363-0045 (fax) From dave.morris at metagrid.co.uk Tue May 5 01:22:00 2015 From: dave.morris at metagrid.co.uk (Dave Morris) Date: Tue, 05 May 2015 00:22:00 +0100 Subject: prototype: scalable VOSI-tables-1.1 In-Reply-To: <5547DCE8.1020004@nrc-cnrc.gc.ca> References: <5547DCE8.1020004@nrc-cnrc.gc.ca> Message-ID: Brief look .. but it makes sense to me. Not sure about the 403 "Forbidden" response .. but I can't think of a better way either. Unless we register it as a different capability, and then deprecate the old one next time around ? Dave -------- Dave Morris Software Developer Wide Field Astronomy Unit Institute for Astronomy University of Edinburgh -------- On 2015-05-04 21:56, Patrick Dowler wrote: > I have implemented a prototype VOSI-tables-1.1 resource to deal with > the issues that came up in the TAP discussion in Banff: some services > have many tables and many columns and the top-level VOSI-tables > document can be very large. > > > The basic aproach is to define a RESTful resource tree following the > VODataService model: tableset, schema, table. All the example URLs > below work but note that the table names are fully qualified because > that is how it works in TAP right now. There are separate discussions > on that so it isn't really relevant here. > > The VOSI-tables-1.0 resource returns a document, so the > "tables" resource name really means "the tableset". This remains > unchanged: > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables > > Any schema name can be used as a child; this returns a > document: > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA > > Any table name found in a schema can be used as a child; this returns > a
document: > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables > > There is no REST binding to get a single within a
as > I don't see the use case for that. The abvoe by itself is nice but > doesn't solve the scalability problem. For that we need to ask for > less than "all the details" of VOSI-tables-1.0 using the "detail" > parameter (name taken from the VOSpace spec as it seems to be the same > sort of thing: ask for a certain amount of detail. The default (as the > above > URLs show) is to get all the details below the resource. The detail > parameter can taken two values: schema, table. > > To get a document with a list of only: > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables?detail=schema > > To get a docuemnt with and
(no ): > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables?detail=table > > Somewhat useless*: get a document without the
(s): > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA?detail=schema > > To get a doccument with only the list of
(no > ): > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA?detail=table > > To get a
document (a single table): > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables > > Somewhat useless*: get a
document without the (s): > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables?detail=table > > * it was simpler to implement this response than to come up with some > logic for denying such a request; it can check the existence of the > schema or table. > > Given the size of the documents when detail=schema or detail=table is > used (very small), I don't see an obvious need for a pagination > mechanism. VOSpace has such a feature so we could model something on > that but I'd like to see the practical need. > > These would all return a 404 response code: > > testSchemaNotFound: > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/no_such_schema > > testTableNotFound: > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/no_such_table > > testExtraPathComponentNotFound (trying to get a ): > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/tables/table_name > > The base resource complies to VOSI-tables-1.0 so 1.0 clients could use > a VOSI-tables-1.1 resource. The only outstanding issue is how would a > service force clients to use 1.1 and forbid them from getting the > whole document from the base resource? That is already an > issue and right now I think the only compliant choice (for TAPvizier, > for example) would be to respond with a 403 "Forbidden": only > authorized callers can get the whole tableset in full detail. This is > already how some services handle things like a GET to a UWS job-list. > I think any oter kind of response to indicate that some of the > requests are not supported would mess wit backwards compatibility. > Making the default "detail" not be "all" would likewise trod on > compatibility. Obviously, the tap capabilities would express that the > VOSI-tables resource implemented 1.0 or 1.1 and I don't see any > obvious reason to require a specific one in TAP-1.1 (at least). > > The change to the VOSI-tables XSD is to add and
as > valid document root elements with types taken from VODataService-1.1 > xsd. From msdemlei at ari.uni-heidelberg.de Tue May 5 09:39:51 2015 From: msdemlei at ari.uni-heidelberg.de (Markus Demleitner) Date: Tue, 5 May 2015 09:39:51 +0200 Subject: prototype: scalable VOSI-tables-1.1 In-Reply-To: <5547DCE8.1020004@nrc-cnrc.gc.ca> References: <5547DCE8.1020004@nrc-cnrc.gc.ca> Message-ID: <20150505073951.GA18110@ari.uni-heidelberg.de> Hi, On Mon, May 04, 2015 at 01:56:08PM -0700, Patrick Dowler wrote: > The basic aproach is to define a RESTful resource tree following the > VODataService model: tableset, schema, table. All the example URLs below > work but note that the table names are fully qualified because that is how > it works in TAP right now. There are separate discussions on that so it > isn't really relevant here. Well... depends. If you're using fully qualified names, there's no strong reason to talk about schemas in your URL scheme at all, as the table names are guaranteed unique. There is a good reason not to talk about schemas: ADQL table references have one to three components, and mapping those components to a constant number of two levels on the tables endpoint may be painful with empty or magic path elements. True, the problem of a mismatch between the VODataService model and ADQL table references already exists in VODataService itself. Given that valid table references can have one to three "segments" (and one could imagine query languages with even more exotic table references behind a TAP service), I now believe it was an error to require schema in VODataService 1.1 (1.0 didn't have it), and I propose to re-allow (and even recommend) table elements as direct children of tableset there. I've just added as much on http://wiki.ivoa.net/twiki/bin/view/IVOA/VODataServiceNext -- comments are welcome there. In short, I'd like to reduce explicit modeling of naming hierachies in our protocols, and I think there's little to be gained by it here, too. 
So, I'd say, instead of

> http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables

we should have

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA.tables

or, if trolls come to haunt us,

http://example.com/tap/tables/%22%2B%2Bdatabase%22/%22example%20schema%22/%22VI/23.4%22

for a table referenced by "++database"."example schema"."VI/23.4"

In consequence, detail=schema could go, too, and detail would essentially become a boolean (show columns or don't show them). I still don't find detail per se pretty, but as I have no better idea to offer, I'll shut up about it.

> Given the size of the documents when detail=schema or detail=table is used
> (very small), I don't see an obvious need for a pagination mechanism.
> VOSpace has such a feature so we could model something on that but I'd like
> to see the practical need.

I'm all for trying to avoid pagination as best we can -- pagination of potentially changing resources is *really* hard to get right over HTTP, as it implies transactionality over a stateless protocol.

> think the only compliant choice (for TAPvizier, for example) would be to
> respond with a 403 "Forbidden": only authorized callers can get the whole

I like the idea of telling clients: "If you get a 403, try again with less detail". Services doing that won't look good in 1.0 clients, but at least they'll still work.

> The change to the VOSI-tables XSD is to add <schema> and <table>
as valid > document root elements with types taken from VODataService-1.1 xsd. Or keep it as it is and just dump things in a tableset consisting of just one schema and one table? I don't think it's much uglier and as long as we keep the "schema change means namespace change" rule, I'd rather tread lightly there lest we unnecessarily break clients. On the other hand, if we change VODataService in parallel (and that might well be a good idea with TAP evolution), the schema would need to change anyway, and such a change would come for free. Cheers, Markus From m.b.taylor at bristol.ac.uk Tue May 5 14:15:47 2015 From: m.b.taylor at bristol.ac.uk (Mark Taylor) Date: Tue, 5 May 2015 13:15:47 +0100 (BST) Subject: prototype: scalable VOSI-tables-1.1 In-Reply-To: <20150505073951.GA18110@ari.uni-heidelberg.de> References: <5547DCE8.1020004@nrc-cnrc.gc.ca> <20150505073951.GA18110@ari.uni-heidelberg.de> Message-ID: On Tue, 5 May 2015, Markus Demleitner wrote: > Hi, > > On Mon, May 04, 2015 at 01:56:08PM -0700, Patrick Dowler wrote: > > The basic aproach is to define a RESTful resource tree following the > > VODataService model: tableset, schema, table. All the example URLs below > > work but note that the table names are fully qualified because that is how > > it works in TAP right now. There are separate discussions on that so it > > isn't really relevant here. > > Well... depends. If you're using fully qualified names, there's no > strong reason to talk about schemas in your URL scheme at all, as the > table names are guaranteed unique. > > There is a good reason not to talk about schemas: ADQL table > references have one to three components, and mapping those components > to a constant number of two levels on the tables endpoint may be > painful with empty or magic path elements. > > True, the problem of a mismatch between the VODataService model and > ADQL table references already exists in VODataService itself. 
> Given that valid table references can have one to three "segments" > (and one could imagine query languages with even more exotic table > references behind a TAP service), I now believe it was an error to > require schema in VODataService 1.1 (1.0 didn't have it), and I > propose to re-allow (and even recommend) table elements as direct > children of tableset there. > > I've just added as much on > http://wiki.ivoa.net/twiki/bin/view/IVOA/VODataServiceNext -- > comments are welcome there. > > In short, I'd like to reduce explicit modeling of naming hierachies > in our protocols, and I think there's little to be gained by it here, I follow the reasoning, but there is one advantage gained by the current schema/table hierarchy in tableset and TAP_SCHEMA, and that is an organisational grouping that's useful for presentation of database metadata to users. For instance the GAVO DC has 63 schemas and 138 tables; it may be more digestible for users to see a list of 63 separate data collections at top level from which they can drill down, rather than an undifferentiated bag of 138 tables (though in this case the difference of a factor of 2 is not a very strong argument). Mark -- Mark Taylor Astronomical Programmer Physics, Bristol University, UK m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ From m.b.taylor at bristol.ac.uk Thu May 7 15:21:29 2015 From: m.b.taylor at bristol.ac.uk (Mark Taylor) Date: Thu, 7 May 2015 14:21:29 +0100 (BST) Subject: prototype: scalable VOSI-tables-1.1 In-Reply-To: <5547DCE8.1020004@nrc-cnrc.gc.ca> References: <5547DCE8.1020004@nrc-cnrc.gc.ca> Message-ID: Hallo Pat. On Mon, 4 May 2015, Patrick Dowler wrote: > I have implemented a prototype VOSI-tables-1.1 resource to deal with the > issues that came up in the TAP discussion in Banff: some services have many > tables and many columns and the top-level VOSI-tables document can be very > large. 
I've taken a look at implementing a client for your proposed VOSI-tables-1.1 interface. The general idea looks OK, but I have a couple of comments. > The basic aproach is to define a RESTful resource tree following the > VODataService model: tableset, schema, table. All the example URLs below work ... > Any table name found in a schema can be used as a child; this returns a >
document: > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables The hierarchical // scheme used here means that you need to know the schema name for a table in order to query the details (e.g. columns) for that table. Since, as per the recent table_name discussion on the DAL list, the table_name must already be fully qualified, i.e. lives in a flat namespace, it's not clear that this is a good idea. If you're iteratively querying the /tables endpoint from the top down to reach a table of interest, that may not matter, since you presumably have the schema/table hierarchy already[*]. However, if for instance you're trying to parse and validate some ADQL from scratch, you may only have the table_name, and no indication of what schema it lives in (unless you're allowed to pull the table name apart in order to guess, which we have established elsewhere is not reliable), so you couldn't use this service to find out the table's columns. That would argue instead for something like ?schema= ?table= rather than / // (the detail= parameter can still get appended using an ampersand separator in the usual way). [*] As it happens in my implementation code, table metadata objects don't know their parent schemas so this is a practical issue for me, but that may be down to my poor design. > There is no REST binding to get a single within a
as I don't > see the use case for that. Probably you're right, though there are cases when I want the columns without the foreign keys or vice versa, which might argue for some more detail options. However, if I have to get both columns and fkeys in either case, it's not a big deal, so the additional complication may not be warranted. > The change to the VOSI-tables XSD is to add and
as valid
> document root elements with types taken from VODataService-1.1 xsd.

Not necessarily. You could just require that every response from this (modified) tables endpoint still has the tableset top-level element, but only contains the elements that have been requested (e.g. the ancestor schema and tableset of a requested table, but not the sibling schemas). This would (arguably) simplify the code required for parsing these responses, and have the advantage that schema information is provided for the table, which you may not otherwise have as per my previous point. It also means no changes required to the VOSI-tables XSD. It would mean slightly more output for table requests, but probably that schema metadata is not very bulky.

Finally: in topcat (not yet released, but I'll talk about it in Sesto, and working previews available if you're interested at ftp://andromeda.star.bris.ac.uk/pub/star/topcat/pre/topcat-full_tap.jar) I'm finally tackling the client-side issues that this is trying to address, i.e. acquiring and presenting to the user metadata for very large tablesets. Although I'm still experimenting, I currently use a hybrid metadata acquisition policy that uses the /tables endpoint for small services, and TAP_SCHEMA for large ones:

   ncol = (SELECT COUNT(*) FROM TAP_SCHEMA.columns)
   if ncol < 5000
      slurp entire VODataService doc from /tables endpoint
   else
      read all schemas and tables, but not columns, from TAP_SCHEMA in one go
      read per-table column/foreign key info from TAP_SCHEMA as required

It seems to work well for the services I've tested against, in particular it's OK for TAPVizier. So for my purposes, it doesn't look essential to have a scalable reworking of VOSI-tables as presented by this proposal. Of course that's not to say it's not a useful thing to have.
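
For illustration, the hybrid acquisition policy described above could be sketched in Python roughly as follows. This is a sketch under stated assumptions, not topcat's actual code: the function names, the use of SELECT *, and the exact per-table queries are hypothetical. Only the count query, the 5000-column threshold and the use of TAP_SCHEMA come from the message itself.

```python
# Illustrative sketch only; helper names and the SELECT * queries are assumptions.

COLUMN_THRESHOLD = 5000  # cut-off quoted in the message above
COUNT_QUERY = "SELECT COUNT(*) FROM TAP_SCHEMA.columns"


def choose_policy(ncol, threshold=COLUMN_THRESHOLD):
    """Return 'vosi' for small services and 'tap_schema' for large ones."""
    return "vosi" if ncol < threshold else "tap_schema"


def _adql_literal(text):
    """Quote a string as an ADQL literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def initial_queries(ncol):
    """ADQL queries to run up front; empty when /tables is read instead."""
    if choose_policy(ncol) == "vosi":
        return []
    return ["SELECT * FROM TAP_SCHEMA.schemas", "SELECT * FROM TAP_SCHEMA.tables"]


def per_table_queries(table_name):
    """ADQL queries for one table's columns and foreign keys, run on demand."""
    name = _adql_literal(table_name)
    return [
        "SELECT * FROM TAP_SCHEMA.columns WHERE table_name = " + name,
        "SELECT * FROM TAP_SCHEMA.keys WHERE from_table = " + name,
    ]
```

A client would first run COUNT_QUERY, pass the result to choose_policy, and only fall back to the per-table queries when the user actually opens a table.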
Mark -- Mark Taylor Astronomical Programmer Physics, Bristol University, UK m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ From patrick.dowler at nrc-cnrc.gc.ca Thu May 7 17:58:05 2015 From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler) Date: Thu, 7 May 2015 08:58:05 -0700 Subject: prototype: scalable VOSI-tables-1.1 In-Reply-To: References: <5547DCE8.1020004@nrc-cnrc.gc.ca> Message-ID: <554B8B8D.2070703@nrc-cnrc.gc.ca> Hi Mark, On 07/05/15 06:21 AM, Mark Taylor wrote: >> >The change to the VOSI-tables XSD is to add and
as valid
>> >document root elements with types taken from VODataService-1.1 xsd.
> Not necessarily. You could just require that every response from
> this (modified) tables endpoint still has the tableset top-level
> element, but only contains the elements that have been requested
> (e.g. the ancestor schema and tableset of a requested table, but
> not the sibling schemas).

I did consider this option (it is easy to implement) but something about having the same document with stuff just missing rubs me the wrong way. And writing integration tests (sort of a client) I found different document roots to make more sense...

Given that the VOSI-tables xsd does nothing but import VODataService and define which element(s) can be root, adding to it is very simple and safe.

> Although I'm still experimenting, I currently
> use a hybrid metadata acquisition policy that uses the /tables
> endpoint for small services, and TAP_SCHEMA for large ones:
>
>    ncol = (SELECT COUNT(*) FROM TAP_SCHEMA.columns)
>    if ncol < 5000
>       slurp entire VODataService doc from /tables endpoint
>    else
>       read all schemas and tables, but not columns, from TAP_SCHEMA in one go
>       read per-table column/foreign key info from TAP_SCHEMA as required

The count query seems a good way to gauge the amount of content. The "else" could be accomplished with VOSI-tables-1.1 with

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables?detail=table

which gives you all the schema and table names (easier than 2 queries to TAP_SCHEMA? I guess so) followed by calls like

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables

to get individual table metadata. Whether VOSI-tables documents are easier or harder to parse than TAP_SCHEMA query output... I guess that's up to you.
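
As a rough illustration of the two kinds of request just described, the URLs could be assembled as in the Python sketch below. The helper name tables_url, its parameters and the example.org base URL are assumptions made for the example; the tables/schema/table path layout and the two detail values (schema, table) are taken from the prototype described in this thread, which may still change.

```python
from urllib.parse import quote, urlencode

ALLOWED_DETAIL = ("schema", "table")  # values named in the original proposal


def tables_url(base, schema=None, table=None, detail=None):
    """Build a URL in the prototype's tables/schema/table layout (hypothetical helper)."""
    if table is not None and schema is None:
        raise ValueError("a table path needs its parent schema in this layout")
    if detail is not None and detail not in ALLOWED_DETAIL:
        raise ValueError("detail must be one of %s" % (ALLOWED_DETAIL,))
    parts = [base.rstrip("/"), "tables"]
    # Percent-encode each name so it stays a single path segment.
    parts += [quote(name, safe="") for name in (schema, table) if name is not None]
    url = "/".join(parts)
    if detail is not None:
        url += "?" + urlencode({"detail": detail})
    return url
```

With a base of http://example.org/tap, tables_url(base, detail="table") builds the listing request and tables_url(base, "TAP_SCHEMA", "TAP_SCHEMA.tables") builds a single-table request. Note that the helper needs the schema name to reach a table, which is the limitation Mark raises about this layout.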
-- Patrick Dowler Canadian Astronomy Data Centre National Research Council Canada 5071 West Saanich Road Victoria, BC V9E 2E7 250-363-0044 (office) 250-363-0045 (fax) From m.b.taylor at bristol.ac.uk Thu May 7 18:12:06 2015 From: m.b.taylor at bristol.ac.uk (Mark Taylor) Date: Thu, 7 May 2015 17:12:06 +0100 (BST) Subject: prototype: scalable VOSI-tables-1.1 In-Reply-To: <554B8B8D.2070703@nrc-cnrc.gc.ca> References: <5547DCE8.1020004@nrc-cnrc.gc.ca> <554B8B8D.2070703@nrc-cnrc.gc.ca> Message-ID: On Thu, 7 May 2015, Patrick Dowler wrote: > > Hi Mark, > > > On 07/05/15 06:21 AM, Mark Taylor wrote: > > > >The change to the VOSI-tables XSD is to add and
as valid > > > >document root elements with types taken from VODataService-1.1 xsd. > > Not necessarily. You could just require that every response from > > this (modified) tables endpoint still has the tableset top-level > > element, but only contains the elements that have been requested > > (e.g. the ancestor schema and tableset of a requested table, but > > not the sibling schemas). > > I did consider this option (it is easy to implement) but something about > having the same document with stuff just missing rubs me the wrong way. And > writing integration tests (sort of a client) I found different document roots > to make more sense... I think it rubs me the right way :-), and would probably require fewer changes to my existing code - but I don't have very strong feelings, could live with either. > Given that the VOSI-tables xsd does nothing but import VODataService and > define which element(s) can be root, adding to it is very simple and safe. Agreed. > > Although I'm still experimenting, I currently > > use a hybrid metadata acquisition policy that uses the /tables > > endpoint for small services, and TAP_SCHEMA for large ones: > > > > ncol = (SELECT COUNT(*) FROM TAP_SCHEMA.columns) > > if ncol < 5000 > > slurp entire VODataService doc from /tables endpoint > > else > > read all schemas and tables, but not columns, from TAP_SCHEMA in one go > > read per-table column/foreign key info from TAP_SCHEMA as required > > The count query seems a good way to gauge the amount of content. The "else" > could be accomlished with VOSI-tables-1.1 with > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables?detail=table > > which gives you all the schema and table names (easier than 2 queries to > TAP_SCHEMA? I guess so) followed by calls like > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables > > to get individual table metadata. Whether VOSI-tables documents are easier or > harder to parse than TAP_SCHEMA query output... 
I guess that's up to you. Yes, probably your extended VOSI-tables route is easier to write client code for than doing it with multiple TAP_SCHEMA TAP queries, as long as you can rely on it being present. And it may have different (better?) performance characteristics too. I just wanted to note that I have found it *possible* to write a working client metadata presentation GUI using only existing standards. -- Mark Taylor Astronomical Programmer Physics, Bristol University, UK m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ From majorb at nrc-cnrc.gc.ca Fri May 8 23:34:26 2015 From: majorb at nrc-cnrc.gc.ca (Brian Major) Date: Fri, 8 May 2015 14:34:26 -0700 Subject: VOSpace: Content-Length on transfer negotiation In-Reply-To: <396820928647f140da26d3d259151815@metagrid.co.uk> References: <396820928647f140da26d3d259151815@metagrid.co.uk> Message-ID: Hi Dave, Thanks for your feedback. I think I prefer your option #2 below. Content-length (or byteCount, to fit better with the existing keepBytes) seems independent of the protocol(s) of the transfer. Brian On Fri, Feb 6, 2015 at 7:16 AM, Dave Morris wrote: > I agree with the concept. > > I'd like to suggest a couple of alternative implementations. > > 1) Could this be passed as a property of the protocol? If so, then this > could be done now without any changes to the XML schema. Disadvantage is > that it would have to be repeated if multiple protocols were > offered/selected. > > 2) If we are going to change the XML schema, rather than adding a > specific element for this case, we add a generic element to > which would enable us to add more things like this in the > future without having to change the XML schema every time. > > Using the property model would allow us to define a byte-count property > for files now, and a row-count property for database tables later. 
> > Dave > > -------- > Dave Morris > Software Developer > Wide Field Astronomy Unit > Institute for Astronomy > University of Edinburgh > -------- > > On 2015-02-04 21:34, Brian Major wrote: > > Hello grid listeners, > > > > I'd like to propose the addition of an optional content-length field to > > the > > transfer document used to request endpoints for moving data to and from > > VOSpace. However, this would only be applicable when the requested > > transfer direction is either pushToVoSpace and pullToVoSpace. > > > > VOSpace implementations must produce a set of transfer endpoints. In a > > distributed storage system, the decision of creating endpoints for data > > storage would be more informed if the system knew the size of the file > > the > > client was intending to upload. For example, a large file may only fit > > in > > a certain physical location. > > > > This new field would not be meant to be used in the actual transfer > > process > > to confirm that the entire file has been received--that would remain > > the > > responsibility of the transfer protocol (HTTP, FTP, etc...). > > > > Regards, > > Brian > -- Brian Major Canadian Astronomy Data Centre Centre canadien de donn?es astronomiques National Research Council Canada Conseil national de recherches Canada -------------- next part -------------- An HTML attachment was scrubbed... URL: From andre.schaaff at astro.unistra.fr Fri May 15 07:55:54 2015 From: andre.schaaff at astro.unistra.fr (Andre Schaaff) Date: Fri, 15 May 2015 07:55:54 +0200 Subject: Second call for the GWS sessions in Sesto Message-ID: <55558A6A.6000701@astro.unistra.fr> Dear IVOA members, We are pleased to invite you to propose a talk during the Grid and Web Services working group sessions in Sesto. It is open to all the topics which are in the scope of the WG: clouds (storage & computing), workflows, proposal for new standards, etc. Send your proposal (a title and maybe a short abstract ) to Brian and me. 
During these sessions we will also discuss the status of VOSpace and SSO updates. (concerning UWS we hope to bring it to the PR status for Sesto) Regards, Andre Schaaff and Brian Major IVOA Grid and Web Services WG From gmantele at ari.uni-heidelberg.de Fri May 15 11:12:48 2015 From: gmantele at ari.uni-heidelberg.de (=?windows-1252?Q?Gr=E9gory_Mantelet?=) Date: Fri, 15 May 2015 11:12:48 +0200 Subject: New Release CDS/ARI Libraries: UWS, ADQL, TAP Message-ID: <5555B890.3010503@ari.uni-heidelberg.de> Hello Apps and Grid lists, New release of the CDS/ARI's TAP, UWS and ADQL libraries are finally available! They are respectively version 2.0, 4.1 and 1.3. You can download them here: - http://cdsportal.u-strasbg.fr/taptuto - http://cdsportal.u-strasbg.fr/uwstuto - http://cdsportal.u-strasbg.fr/adqltuto PS: Java >= 1.6 is required. About the TAP Library ------------------------------ It is now about a year that I am working to make it more stable and conform to the IVOA Recommendation. Thanks to feedbacks of several users, a lot of bugs have been fixed. Some parts, like UPLOAD, have been re-designed properly. And new features are also available, like: * TAP configuration file (it lets create a TAP service with just one text file ; no more need to write 3 classes + 1 servlet) * 2 new metadata definition methods (1/ an XML file (same content as VOSI-tables), 2/ import from the database schema TAP_SCHEMA) * Result formatting with STIL...and additional formats (e.g. FITS, HTML) are also provided * Better STC-S support * Easier declaration of UDFs * etc... 
A longer list is available at http://cdsportal.u-strasbg.fr/taptuto/news.html , but considering the amount of modifications, even this list is not complete ; if you really want to see all of them, I invite you to take a look to the GitHub project: http://github.com/gmantele/taplib The full documentation of this TAP Library is not yet complete (only the part about the TAP configuration file is there), but it will be updated little by little in the next months. A Getting Started page and the Javadoc are however available and will help you creating a TAP service in few steps. If you used the previous version, you should know that the new version is unfortunately not backward compatible ; it was indeed needed to get a stable and easier-to-use library. However, you will find a migration help at the following URL to help you performing the migration as easily as possible: http://cdsportal.u-strasbg.fr/taptuto/migration1to2.html About the UWS and ADQL Libraries ------------------------------------------------ In addition of this new TAP Library release, there are also a new sub-version for the UWS and ADQL libraries: respectively 4.1 and 1.3. They are included in the last release of the TAP Library. Both of them are backward compatible with respectively 4.x and 1.x versions. Modifications are mainly bug corrections and addition of some features. /!\ Warning about the UWS Library: No documentation nor Getting Started are yet available! As for the TAP Library, it will be completed little by little in the next months. While waiting you can take a look to the examples provided on the web-site. On the contrary, the ADQL Library still have its full and updated documentation. -------------------------------------------------- Voil?! I hope these new releases will help those who were waiting for them for a so long time and I hope it will be helpful for those who did not know them yet. 
In any case, if you have questions or need help with these three libraries, I will be glad to answer as fast as I can.

Cheers,
Grégory Mantelet
-------------------------------------------------------------------------
gmantele at ari.uni-heidelberg.de
Astronomisches Rechen Institut (ARI)
Mönchhofstr. 12-14 - 69120 Heidelberg - Germany

From majorb at nrc-cnrc.gc.ca Mon May 18 19:09:32 2015
From: majorb at nrc-cnrc.gc.ca (Brian Major)
Date: Mon, 18 May 2015 10:09:32 -0700
Subject: IVOA Schema evolution
Message-ID:

Hi André, grid,

Could we set aside some time for a discussion aimed at formalizing the IVOA's approach to schema evolution during our meetings in Sesto? This topic has come up a number of times in GWS over the last year. For example, see Document Versioning and XML schema:

http://mail.ivoa.net/pipermail/grid/2014-June/thread.html#2622

Paul, Dave and others have suggested this can be documented as an official IVOA note. Having some resolution on the topic would give us some much-needed guidance on the rules for producing major and minor document versions.

Cheers,
Brian

From paul.harrison at manchester.ac.uk Tue May 19 10:57:23 2015
From: paul.harrison at manchester.ac.uk (Paul Harrison)
Date: Tue, 19 May 2015 08:57:23 +0000
Subject: UWS 1.1
Message-ID: <3BCA40AA-0574-46EC-BDD7-5F7562118DF7@manchester.ac.uk>

Hi,

We are hoping to advance UWS 1.1 to PR status at (or perhaps before) the Sesto Interop Meeting. I believe that there are now several implementations that are being updated to adhere to the new version. It has been brought to my attention that the UWS 1.1 schema was not available in the text of the last WD that was published to the IVOA document store.
For those that need a copy, it is available in volute

https://code.google.com/p/volute/source/browse/trunk/projects/grid/uws/doc/UWS.xsd

In addition I have made some small edits to the standard document - in preparation for PR - again this version is as yet only available in volute

http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html

I believe that there is only one outstanding issue in this document in section 2.2.1.1 (http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html#blocking) highlighted in yellow. It concerns how other standards might signal which version of UWS that they are using. My current feeling is to remove this text entirely and leave it up to the other standard documents to signal the mechanism that they want to use. I think that part of the cause for this versioning issue to come up was the lack of clarity of the namespace for the UWS schema - which is now addressed by the release of a schema with a new 1.1 namespace - therefore it is safe to remove this issue from the UWS specification. Any objections?

As has recently been mentioned on the lists, we hope to have a session in Sesto on the topic of versioning of schema to form some guidelines for good practice.

Regards,
Paul.

From m.b.taylor at bristol.ac.uk Thu May 21 12:56:23 2015
From: m.b.taylor at bristol.ac.uk (Mark Taylor)
Date: Thu, 21 May 2015 11:56:23 +0100 (BST)
Subject: prototype: scalable VOSI-tables-1.1
In-Reply-To: <5547DCE8.1020004@nrc-cnrc.gc.ca>
References: <5547DCE8.1020004@nrc-cnrc.gc.ca>
Message-ID:

Pat et al.,

FYI, I've implemented a GUI client for this proposed protocol. It works, though I still have my reservations about the /tables// URL scheme for the resources.
The TOPCAT TAP client now has a pluggable metadata acquisition component that can pursue various strategies to get the table metadata, currently:
- standard /tables endpoint
- multiple TAP_SCHEMA queries
- VizieR-flavour non-standard multi-level /tables endpoint
- CADC-flavour non-standard multi-level /tables endpoint

By default it picks one it thinks will do the best job ("Auto"). Both the VizieR and CADC /tables endpoints pull all schemas and tables at first, and then get per-table content only as required.

If you want to see it working, you can find a pre-release version here:

ftp://andromeda.star.bris.ac.uk/pub/star/topcat/pre/topcat-full_tap.jar

To get the TAP window, use the "Table Access Protocol (TAP) Query" menu item in the top-level "VO" menu, or the equivalent toolbar button. The "TAP|Metadata Reading" menu item in that window allows you to choose which metadata acquisition strategy is in use (changing the value triggers a metadata reload). You can run with -verbose or -verbose -verbose to get a bit more information about what queries are going on under the hood.

Mark

On Mon, 4 May 2015, Patrick Dowler wrote:

>
> I have implemented a prototype VOSI-tables-1.1 resource to deal with the
> issues that came up in the TAP discussion in Banff: some services have many
> tables and many columns and the top-level VOSI-tables document can be very
> large.
>
>
> The basic approach is to define a RESTful resource tree following the
> VODataService model: tableset, schema, table. All the example URLs below work
> but note that the table names are fully qualified because that is how it works
> in TAP right now. There are separate discussions on that so it isn't really
> relevant here.
>
> The VOSI-tables-1.0 resource returns a document, so the "tables"
> resource name really means "the tableset".
This remains unchanged: > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables > > Any schema name can be used as a child; this returns a document: > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA > > Any table name found in a schema can be used as a child; this returns a >
document: > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables > > There is no REST binding to get a single within a
as I don't > see the use case for that. The above by itself is nice but doesn't solve the > scalability problem. For that we need to ask for less than "all the details" > of VOSI-tables-1.0 using the "detail" parameter (name taken from the VOSpace > spec as it seems to be the same sort of thing: ask for a certain amount of > detail). The default (as the above > URLs show) is to get all the details below the resource. The detail parameter > can take two values: schema, table. > > To get a document with a list of only: > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables?detail=schema > > To get a document with and
(no ): > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables?detail=table > > Somewhat useless*: get a document without the
(s): > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA?detail=schema > > To get a document with only the list of
(no ): > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA?detail=table > > To get a
document (a single table): > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables > > Somewhat useless*: get a
document without the (s): > > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/TAP_SCHEMA.tables?detail=table > > * it was simpler to implement this response than to come up with some logic > for denying such a request; it can check the existence of the schema or table. > > Given the size of the documents when detail=schema or detail=table is used > (very small), I don't see an obvious need for a pagination mechanism. VOSpace > has such a feature so we could model something on that but I'd like to see the > practical need. > > These would all return a 404 response code: > > testSchemaNotFound: > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/no_such_schema > > testTableNotFound: > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/no_such_table > > testExtraPathComponentNotFound (trying to get a ): > http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/tap/tables/TAP_SCHEMA/tables/table_name > > The base resource complies with VOSI-tables-1.0 so 1.0 clients could use a > VOSI-tables-1.1 resource. The only outstanding issue is how would a service > force clients to use 1.1 and forbid them from getting the whole > document from the base resource? That is already an issue and right now I > think the only compliant choice (for TAPvizier, for example) would be to > respond with a 403 "Forbidden": only authorized callers can get the whole > tableset in full detail. This is already how some services handle things like > a GET to a UWS job-list. I think any other kind of response to indicate that > some of the requests are not supported would mess with backwards compatibility. > Making the default "detail" not be "all" would likewise tread on compatibility. > Obviously, the tap capabilities would express that the VOSI-tables resource > implemented 1.0 or 1.1 and I don't see any obvious reason to require a > specific one in TAP-1.1 (at least). > > The change to the VOSI-tables XSD is to add and
as valid > document root elements with types taken from VODataService-1.1 xsd. > > > > -- > > Patrick Dowler > Canadian Astronomy Data Centre > National Research Council Canada > 5071 West Saanich Road > Victoria, BC V9E 2E7 > > 250-363-0044 (office) 250-363-0045 (fax) > -- Mark Taylor Astronomical Programmer Physics, Bristol University, UK m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ From m.b.taylor at bristol.ac.uk Thu May 21 14:50:14 2015 From: m.b.taylor at bristol.ac.uk (Mark Taylor) Date: Thu, 21 May 2015 13:50:14 +0100 (BST) Subject: UWS 1.1 In-Reply-To: <3BCA40AA-0574-46EC-BDD7-5F7562118DF7@manchester.ac.uk> References: <3BCA40AA-0574-46EC-BDD7-5F7562118DF7@manchester.ac.uk> Message-ID: Hi grid, I'm interested in implementing a UWS-1.1-aware client in order to improve the way that I do blocking to wait for an asynchronous TAP response (sec 2.2.1.1). Are there some deployed TAP services out there that implement UWS-1.1 blocking behaviour? Preferably using the UWS 1.1 namespace? I'd like to test client behaviour against them. Thanks Mark On Tue, 19 May 2015, Paul Harrison wrote: > Hi, > > We are hoping to advance UWS 1.1 to PR status at (or perhaps before) the Sesto Interop Meeting. I believe that there are now several implementations that are being updated to adhere to the new version. It has been brought to my attention that the UWS 1.1 schema was not available in the text of the last WD that was published to the IVOA document store. 
For those that need a copy, it is available in volute > > https://code.google.com/p/volute/source/browse/trunk/projects/grid/uws/doc/UWS.xsd > > In addition I have made some small edits to the standard document - in preparation for PR - again this version is as yet only available in volute > > http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html > > I believe that there is only one outstanding issue in this document in section 2.2.1.1 (http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html#blocking) highlighted in yellow. It concerns how other standards might signal which version of UWS that they are using. My current feeling is to remove this text entirely and leave it up to the other standard documents to signal the mechanism that they want to use. I think that part of the cause for this versioning issue to come up was the lack of clarity of the namespace for the UWS schema - which is now addressed by the release of a schema with a new 1.1 namespace - therefore it is safe to remove this issue from the UWS specification. Any objections? > > As has recently been mentioned on the lists, we hope to have a session in Sesto on the topic of versioning of schema to form some guidelines for good practice. > > > Regards, > Paul. > > > > -- Mark Taylor Astronomical Programmer Physics, Bristol University, UK m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ From patrick.dowler at nrc-cnrc.gc.ca Thu May 21 22:22:49 2015 From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler) Date: Thu, 21 May 2015 13:22:49 -0700 Subject: UWS 1.1 In-Reply-To: References: <3BCA40AA-0574-46EC-BDD7-5F7562118DF7@manchester.ac.uk> Message-ID: <555E3E99.4060603@nrc-cnrc.gc.ca> I have (crudely) implemented this and it should go live in a day or so... will announce it here when it is available. One thing that came to mind while implementing this is that it is really quite complex if you have load-balanced web servers. 
As such, I ended up using internal polling in case events don't get propagated(*) from the server running the job to the server someone is blocking on... but I don't know how responsive the client wants. It seems it would be nice if the client could say "if I had to do it, I would poll every 300ms but I'm being nice and WAIT=30 (sec)"... The best solution is to have a distributed event mechanism on the server side but I can see that not having a great cost:benefit ratio. Maybe if the client could give a hint about responsiveness then server-side polling could be tuned sufficiently... WAIT=30&THREATEN-TO-POLL=100ms ? :-) * meaning that right now I don't do this at all :-) Pat On 21/05/15 05:50 AM, Mark Taylor wrote: > Hi grid, > > I'm interested in implementing a UWS-1.1-aware client in order to > improve the way that I do blocking to wait for an asynchronous > TAP response (sec 2.2.1.1). > > Are there some deployed TAP services out there that implement > UWS-1.1 blocking behaviour? Preferably using the UWS 1.1 namespace? > I'd like to test client behaviour against them. > > Thanks > > Mark > > On Tue, 19 May 2015, Paul Harrison wrote: > >> Hi, >> >> We are hoping to advance UWS 1.1 to PR status at (or perhaps before) the Sesto Interop Meeting. I believe that there are now several implementations that are being updated to adhere to the new version. It has been brought to my attention that the UWS 1.1 schema was not available in the text of the last WD that was published to the IVOA document store. 
For those that need a copy, it is available in volute >> >> https://code.google.com/p/volute/source/browse/trunk/projects/grid/uws/doc/UWS.xsd >> >> In addition I have made some small edits to the standard document - in preparation for PR - again this version is as yet only available in volute >> >> http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html >> >> I believe that there is only one outstanding issue in this document in section 2.2.1.1 (http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html#blocking) highlighted in yellow. It concerns how other standards might signal which version of UWS that they are using. My current feeling is to remove this text entirely and leave it up to the other standard documents to signal the mechanism that they want to use. I think that part of the cause for this versioning issue to come up was the lack of clarity of the namespace for the UWS schema - which is now addressed by the release of a schema with a new 1.1 namespace - therefore it is safe to remove this issue from the UWS specification. Any objections? >> >> As has recently been mentioned on the lists, we hope to have a session in Sesto on the topic of versioning of schema to form some guidelines for good practice. >> >> >> Regards, >> Paul. 
>> >> >> >> > > -- > Mark Taylor Astronomical Programmer Physics, Bristol University, UK > m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ > -- Patrick Dowler Canadian Astronomy Data Centre National Research Council Canada 5071 West Saanich Road Victoria, BC V9E 2E7 250-363-0044 (office) 250-363-0045 (fax) From M.B.Taylor at bristol.ac.uk Thu May 21 23:10:05 2015 From: M.B.Taylor at bristol.ac.uk (Mark Taylor) Date: Thu, 21 May 2015 22:10:05 +0100 (BST) Subject: UWS 1.1 In-Reply-To: <555E3E99.4060603@nrc-cnrc.gc.ca> References: <3BCA40AA-0574-46EC-BDD7-5F7562118DF7@manchester.ac.uk> <555E3E99.4060603@nrc-cnrc.gc.ca> Message-ID: Not sure if this helps, but for my usage purposes I'd consider anything within a second to be pretty good service. I typically poll every 5 seconds for TAP jobs (should really back off after a while but I don't). On Thu, 21 May 2015, Patrick Dowler wrote: > > I have (crudely) implemented this and it should go live in a day or so... will > announce it here when it is available. > > One thing that came to mind while implementing this is that it is really quite > complex if you have load-balanced web servers. As such, I ended up using > internal polling in case events don't get propagated(*) from the server > running the job to the server someone is blocking on... but I don't know how > responsive the client wants. It seems it would be nice if the client could say > "if I had to do it, I would poll every 300ms but I'm being nice and WAIT=30 > (sec)"... The best solution is to have a distributed event mechanism on the > server side but I can see that not having a great cost:benefit ratio. Maybe if > the client could give a hint about responsiveness then server-side polling > could be tuned sufficiently... WAIT=30&THREATEN-TO-POLL=100ms ? 
:-) > > * meaning that right now I don't do this at all :-) > > Pat > > > On 21/05/15 05:50 AM, Mark Taylor wrote: > > Hi grid, > > > > I'm interested in implementing a UWS-1.1-aware client in order to > > improve the way that I do blocking to wait for an asynchronous > > TAP response (sec 2.2.1.1). > > > > Are there some deployed TAP services out there that implement > > UWS-1.1 blocking behaviour? Preferably using the UWS 1.1 namespace? > > I'd like to test client behaviour against them. > > > > Thanks > > > > Mark > > > > On Tue, 19 May 2015, Paul Harrison wrote: > > > > > Hi, > > > > > > We are hoping to advance UWS 1.1 to PR status at (or perhaps before) the > > > Sesto Interop Meeting. I believe that there are now several > > > implementations that are being updated to adhere to the new version. It > > > has been brought to my attention that the UWS 1.1 schema was not available > > > in the text of the last WD that was published to the IVOA document store. > > > For those that need a copy, it is available in volute > > > > > > https://code.google.com/p/volute/source/browse/trunk/projects/grid/uws/doc/UWS.xsd > > > > > > In addition I have made some small edits to the standard document - in > > > preparation for PR - again this version is as yet only available in volute > > > > > > http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html > > > > > > I believe that there is only one outstanding issue in this document in > > > section 2.2.1.1 > > > (http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html#blocking) > > > highlighted in yellow. It concerns how other standards might signal which > > > version of UWS that they are using. My current feeling is to remove this > > > text entirely and leave it up to the other standard documents to signal > > > the mechanism that they want to use. 
I think that part of the cause for > > > this versioning issue to come up was the lack of clarity of the namespace > > > for the UWS schema - which is now addressed by the release of a schema > > > with a new 1.1 namespace - therefore it is safe to remove this issue from > > > the UWS specification. Any objections? > > > > > > As has recently been mentioned on the lists, we hope to have a session in > > > Sesto on the topic of versioning of schema to form some guidelines for > > > good practice. > > > > > > > > > Regards, > > > Paul. > > > > > > > > > > > > > > > > -- > > Mark Taylor Astronomical Programmer Physics, Bristol University, UK > > m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ > > > > -- > > Patrick Dowler > Canadian Astronomy Data Centre > National Research Council Canada > 5071 West Saanich Road > Victoria, BC V9E 2E7 > > 250-363-0044 (office) 250-363-0045 (fax) > -- Mark Taylor Astronomical Programmer Physics, Bristol University, UK m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ From patrick.dowler at nrc-cnrc.gc.ca Fri May 22 23:14:57 2015 From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler) Date: Fri, 22 May 2015 14:14:57 -0700 Subject: UWS 1.1 In-Reply-To: References: <3BCA40AA-0574-46EC-BDD7-5F7562118DF7@manchester.ac.uk> <555E3E99.4060603@nrc-cnrc.gc.ca> Message-ID: <555F9C51.7010507@nrc-cnrc.gc.ca> Yes, that helps. I ended up implementing server-side polling (of the phase in the job db) with escalating interval: 1, 2, 4, 8... then staying at 8 (sec). I don't think I can feasibly make another choice since I have no idea how long the job will take. The 8 sec is quite large compared to the ~15ms to pull the current job state from the database so not much point in making it bigger. And it is progressively small compared to how long the job has been running so far. This seems pretty decent for most situations.... 
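[Illustration: a minimal sketch of the escalating server-side polling described just above, written in Python for brevity. It is not the CADC implementation. The names escalating_intervals, wait_for_phase_change, read_phase and max_wait are hypothetical: read_phase stands in for the ~15ms lookup of the job phase in the job db, and max_wait for the number of seconds the client asked for with WAIT.]

```python
import time
from typing import Callable, Iterator


def escalating_intervals(first: float = 1.0, cap: float = 8.0) -> Iterator[float]:
    """Yield 1, 2, 4, 8, 8, 8, ... seconds: double each time, then stay at the cap."""
    interval = first
    while True:
        yield min(interval, cap)
        interval = min(interval * 2, cap)


def wait_for_phase_change(
    read_phase: Callable[[], str],
    initial_phase: str,
    max_wait: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll read_phase() until it differs from initial_phase or max_wait elapses.

    Returns the last phase seen, so the caller can always answer with the
    current job state, changed or not.
    """
    deadline = clock() + max_wait
    for interval in escalating_intervals():
        phase = read_phase()
        if phase != initial_phase:
            return phase  # phase changed: stop blocking
        remaining = deadline - clock()
        if remaining <= 0:
            return phase  # wait expired: return the unchanged phase
        # Never sleep past the deadline.
        sleep(min(interval, remaining))
    raise AssertionError("unreachable: the interval generator is infinite")
```

The sleep and clock arguments exist only so the loop can be exercised without real delays; a real service would presumably leave them at their defaults.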
Except: Under heavy load, our apache-ajp-tomcat setup hits a limit and new connections are held by apache until some ajp resources are available. That means we generally do not want connections to hang around for a long time consuming that limited resource, so my current implementation has a maximum WAIT time. The normal result is to return the current state of the job after that time, which would force the client to check and possibly redo the get-and-wait. The max is 60 sec. Would you expect a normal return without a phase change (eventually) or should we just close the connection and let client error handling retry (like it would for a timeout)? I think just returning the job document in the current (unchanged) state seems right, and it seems to me the server must be able to place a limit since this does consume resources.

* side issue with long waits: as far as I know, there is no good way with http to determine if the client is still there waiting without writing some output. In theory, I could probably set the response headers and write a single white space every so often (without making the document invalid) to probe that output stream, but I'm not sure that will fail until I fill up at least one unflushable buffer somewhere... further argument that the server should be able to place a wait limit.

Pat

On 21/05/15 02:10 PM, Mark Taylor wrote:
> Not sure if this helps, but for my usage purposes I'd consider anything
> within a second to be pretty good service. I typically poll every 5
> seconds for TAP jobs (should really back off after a while but I don't).
>
> On Thu, 21 May 2015, Patrick Dowler wrote:
>
>>
>> I have (crudely) implemented this and it should go live in a day or so... will
>> announce it here when it is available.
>>
>> One thing that came to mind while implementing this is that it is really quite
>> complex if you have load-balanced web servers.
As such, I ended up using >> internal polling in case events don't get propagated(*) from the server >> running the job to the server someone is blocking on... but I don't know how >> responsive the client wants. It seems it would be nice if the client could say >> "if I had to do it, I would poll every 300ms but I'm being nice and WAIT=30 >> (sec)"... The best solution is to have a distributed event mechanism on the >> server side but I can see that not having a great cost:benefit ratio. Maybe if >> the client could give a hint about responsiveness then server-side polling >> could be tuned sufficiently... WAIT=30&THREATEN-TO-POLL=100ms ? :-) >> >> * meaning that right now I don't do this at all :-) >> >> Pat >> >> >> On 21/05/15 05:50 AM, Mark Taylor wrote: >>> Hi grid, >>> >>> I'm interested in implementing a UWS-1.1-aware client in order to >>> improve the way that I do blocking to wait for an asynchronous >>> TAP response (sec 2.2.1.1). >>> >>> Are there some deployed TAP services out there that implement >>> UWS-1.1 blocking behaviour? Preferably using the UWS 1.1 namespace? >>> I'd like to test client behaviour against them. >>> >>> Thanks >>> >>> Mark >>> >>> On Tue, 19 May 2015, Paul Harrison wrote: >>> >>>> Hi, >>>> >>>> We are hoping to advance UWS 1.1 to PR status at (or perhaps before) the >>>> Sesto Interop Meeting. I believe that there are now several >>>> implementations that are being updated to adhere to the new version. It >>>> has been brought to my attention that the UWS 1.1 schema was not available >>>> in the text of the last WD that was published to the IVOA document store. 
>>>> For those that need a copy, it is available in volute >>>> >>>> https://code.google.com/p/volute/source/browse/trunk/projects/grid/uws/doc/UWS.xsd >>>> >>>> In addition I have made some small edits to the standard document - in >>>> preparation for PR - again this version is as yet only available in volute >>>> >>>> http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html >>>> >>>> I believe that there is only one outstanding issue in this document in >>>> section 2.2.1.1 >>>> (http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html#blocking) >>>> highlighted in yellow. It concerns how other standards might signal which >>>> version of UWS that they are using. My current feeling is to remove this >>>> text entirely and leave it up to the other standard documents to signal >>>> the mechanism that they want to use. I think that part of the cause for >>>> this versioning issue to come up was the lack of clarity of the namespace >>>> for the UWS schema - which is now addressed by the release of a schema >>>> with a new 1.1 namespace - therefore it is safe to remove this issue from >>>> the UWS specification. Any objections? >>>> >>>> As has recently been mentioned on the lists, we hope to have a session in >>>> Sesto on the topic of versioning of schema to form some guidelines for >>>> good practice. >>>> >>>> >>>> Regards, >>>> Paul. >>>> >>>> >>>> >>>> >>> >>> -- >>> Mark Taylor Astronomical Programmer Physics, Bristol University, UK >>> m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ >>> >> >> -- >> >> Patrick Dowler >> Canadian Astronomy Data Centre >> National Research Council Canada >> 5071 West Saanich Road >> Victoria, BC V9E 2E7 >> >> 250-363-0044 (office) 250-363-0045 (fax) >> > > -- > Mark Taylor Astronomical Programmer Physics, Bristol University, UK > m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ > . 
>
--

Patrick Dowler
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9E 2E7

250-363-0044 (office) 250-363-0045 (fax)

From gmantele at ari.uni-heidelberg.de Tue May 26 10:58:54 2015
From: gmantele at ari.uni-heidelberg.de (Grégory Mantelet)
Date: Tue, 26 May 2015 10:58:54 +0200
Subject: UWS 1.1
In-Reply-To: <555F9C51.7010507@nrc-cnrc.gc.ca>
References: <3BCA40AA-0574-46EC-BDD7-5F7562118DF7@manchester.ac.uk> <555E3E99.4060603@nrc-cnrc.gc.ca> <555F9C51.7010507@nrc-cnrc.gc.ca>
Message-ID: <556435CE.7060206@ari.uni-heidelberg.de>

Hello Pat, Mark and Grid,

UWS 1.1 will also be implemented in the CDS/ARI UWS Library. I still have a few things to test or finish, but basically job list filtering and the blocking behaviour are working. Development is still in progress, but the code is available on GitHub in the branch "uws1.1": https://github.com/gmantele/taplib/tree/uws1.1

@Mark: if you want, I can provide you with a .war archive containing a very simple UWS that you could deploy locally in order to test Topcat against it.

About the blocking behaviour, I have two comments:

First, the UWS Lib. manages jobs completely in memory. This allows it to be notified immediately of a phase change thanks to a listener (which has been in place in the lib. for a while now). Thus, no polling is performed.

Second, and this is the most important point for me: client abortion. As Pat said, there is no way to detect client abortion (i.e. the client is no longer waiting for a server response)... at least using the Servlet API in Java. Such detection would be possible using sockets directly, but this part is hidden by the web application server (e.g. Tomcat). I have now spent several days trying to find a solution, but the only workaround I found was to send, after a while, a redirection to the same request (with the ?WAIT parameter; with a reduced waiting time if one was set).
In this way, if the client gave up, the redirection is not followed; otherwise the server receives a new waiting request and executes it. In both cases, the original waiting request is finished on the server side. Well, this is quite fine if the client is a browser - it is totally transparent - but another client (like Topcat) would need to know about that trick. Honestly, I am not keen on this idea, although it limits resource consumption on the server side by avoiding a truly unlimited wait while behaving as if it were one.

The other solution, as Pat said, is to set a timeout (= max server waiting time). I would certainly do the same, unless the "request redirection" solution is fine for you.

By the way, I also want to briefly point out that the same problem (client abortion detection) also exists for TAP synchronous requests.

To conclude this email, I would like to know whether I have understood the ARCHIVED phase correctly. A job goes into this phase ONLY IF the destruction time is reached. Is that right, or is there a manual way for a user to put a job in this phase? Then, I suppose that a normal ?action=delete request is enough to really delete this job anyway, isn't it?

Regards,
Grégory

On 22.05.2015 23:14, Patrick Dowler wrote:
>
> Yes, that helps. I ended up implementing server-side polling (of the
> phase in the job db) with escalating interval: 1, 2, 4, 8... then
> staying at 8 (sec). I don't think I can feasibly make another choice
> since I have no idea how long the job will take.
>
> The 8 sec is quite large compared to the ~15ms to pull the current job
> state from the database so not much point in making it bigger. And it
> is progressively small compared to how long the job has been running
> so far. This seems pretty decent for most situations....
>
> Except: Under heavy load, our apache-ajp-tomcat setup hits a limit and
> new connections are held by apache until some ajp resources are
> available.
That means we generally do not want connections to hang > around for a long time consuming that limited resource, so my current > implementation has a maximum WAIT time. The normal result is to return > the current state of the job after that time, which would force the > client to check and possibly redo the get-and-wait. The max is 60 sec. > Would you expect a normal return without a phase change (eventually) > or should we just close the connection and let client error handling > retry > (like it would for a timeout)? I think just returning the job document > in the current (unchanged) state seems right and it seems to me the > server must be able to place a limit since this does consume resources. > > * side issue with long waits: as far as I know, there is no good way > with http to determine if the client is still there waiting without > writing some output. In theory, I could probably set the response > headers and write a single white space every so often (without making > the document invalid) to probe that output stream, but I'm not sure > that will fail until I fill up at last one unflushable buffer > somewhere... further argument that the server should be able to place > a wait limit. > > > Pat > > > > On 21/05/15 02:10 PM, Mark Taylor wrote: >> Not sure if this helps, but for my usage purposes I'd consider anything >> within a second to be pretty good service. I typically poll every 5 >> seconds for TAP jobs (should really back off after a while but I don't). >> >> On Thu, 21 May 2015, Patrick Dowler wrote: >> >>> >>> I have (crudely) implemented this and it should go live in a day or >>> so... will >>> announce it here when it is available. >>> >>> One thing that came to mind while implementing this is that it is >>> really quite >>> complex if you have load-balanced web servers. As such, I ended up >>> using >>> internal polling in case events don't get propagated(*) from the server >>> running the job to the server someone is blocking on... 
but I don't >>> know how >>> responsive the client wants. It seems it would be nice if the client >>> could say >>> "if I had to do it, I would poll every 300ms but I'm being nice and >>> WAIT=30 >>> (sec)"... The best solution is to have a distributed event mechanism >>> on the >>> server side but I can see that not having a great cost:benefit >>> ratio. Maybe if >>> the client could give a hint about responsiveness then server-side >>> polling >>> could be tuned sufficiently... WAIT=30&THREATEN-TO-POLL=100ms ? :-) >>> >>> * meaning that right now I don't do this at all :-) >>> >>> Pat >>> >>> >>> On 21/05/15 05:50 AM, Mark Taylor wrote: >>>> Hi grid, >>>> >>>> I'm interested in implementing a UWS-1.1-aware client in order to >>>> improve the way that I do blocking to wait for an asynchronous >>>> TAP response (sec 2.2.1.1). >>>> >>>> Are there some deployed TAP services out there that implement >>>> UWS-1.1 blocking behaviour? Preferably using the UWS 1.1 namespace? >>>> I'd like to test client behaviour against them. >>>> >>>> Thanks >>>> >>>> Mark >>>> >>>> On Tue, 19 May 2015, Paul Harrison wrote: >>>> >>>>> Hi, >>>>> >>>>> We are hoping to advance UWS 1.1 to PR status at (or perhaps >>>>> before) the >>>>> Sesto Interop Meeting. I believe that there are now several >>>>> implementations that are being updated to adhere to the new >>>>> version. It >>>>> has been brought to my attention that the UWS 1.1 schema was not >>>>> available >>>>> in the text of the last WD that was published to the IVOA document >>>>> store. 
>>>>> For those that need a copy, it is available in volute >>>>> >>>>> https://code.google.com/p/volute/source/browse/trunk/projects/grid/uws/doc/UWS.xsd >>>>> >>>>> >>>>> In addition I have made some small edits to the standard document >>>>> - in >>>>> preparation for PR - again this version is as yet only available >>>>> in volute >>>>> >>>>> http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html >>>>> >>>>> I believe that there is only one outstanding issue in this >>>>> document in >>>>> section 2.2.1.1 >>>>> (http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html#blocking) >>>>> >>>>> highlighted in yellow. It concerns how other standards might >>>>> signal which >>>>> version of UWS that they are using. My current feeling is to >>>>> remove this >>>>> text entirely and leave it up to the other standard documents to >>>>> signal >>>>> the mechanism that they want to use. I think that part of the >>>>> cause for >>>>> this versioning issue to come up was the lack of clarity of the >>>>> namespace >>>>> for the UWS schema - which is now addressed by the release of a >>>>> schema >>>>> with a new 1.1 namespace - therefore it is safe to remove this >>>>> issue from >>>>> the UWS specification. Any objections? >>>>> >>>>> As has recently been mentioned on the lists, we hope to have a >>>>> session in >>>>> Sesto on the topic of versioning of schema to form some guidelines >>>>> for >>>>> good practice. >>>>> >>>>> >>>>> Regards, >>>>> Paul. 
>>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> Mark Taylor Astronomical Programmer Physics, Bristol >>>> University, UK >>>> m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ >>>> >>> >>> -- >>> >>> Patrick Dowler >>> Canadian Astronomy Data Centre >>> National Research Council Canada >>> 5071 West Saanich Road >>> Victoria, BC V9E 2E7 >>> >>> 250-363-0044 (office) 250-363-0045 (fax) >>> >> >> -- >> Mark Taylor Astronomical Programmer Physics, Bristol University, UK >> m.b.taylor at bris.ac.uk +44-117-9288776 http://www.star.bris.ac.uk/~mbt/ >> . >> > From paul.harrison at manchester.ac.uk Tue May 26 11:38:57 2015 From: paul.harrison at manchester.ac.uk (Paul Harrison) Date: Tue, 26 May 2015 09:38:57 +0000 Subject: UWS 1.1 In-Reply-To: <556435CE.7060206@ari.uni-heidelberg.de> References: <3BCA40AA-0574-46EC-BDD7-5F7562118DF7@manchester.ac.uk> <555E3E99.4060603@nrc-cnrc.gc.ca> <555F9C51.7010507@nrc-cnrc.gc.ca> <556435CE.7060206@ari.uni-heidelberg.de> Message-ID: Hi UWS 1.1 implementors! > On 2015-05-26, at 09:58, Grégory Mantelet wrote: > > Hello Pat, Mark and Grid, > > UWS 1.1 will also be implemented in the CDS/ARI UWS Library. I still have a few things to test or finish, but basically job list filtering and the blocking behaviour are working. Development is still in progress but available on GitHub in the branch "uws1.1": https://github.com/gmantele/taplib/tree/uws1.1 > > @Mark: if you want, I can provide you with a .war archive with a very simple UWS that you could deploy locally in order to test Topcat on it. > > > About the blocking behaviour, I have two comments: > > First, the UWS Lib. manages jobs completely in memory. This lets it be notified immediately of a phase change thanks to a listener (already in place in the lib. for a while now). Thus, no polling is performed. > > Second, and this is the most important point for me: client abortion. As Pat said, there is no way to detect client abortion (i.e. 
the client is no longer waiting for a server response)... at least using the Servlet API in Java. Such detection would be possible using sockets directly, but this part is hidden by the web application server (e.g. Tomcat). > I have spent several days looking for a solution, but the only workaround I found was to send, after a while, a redirect to the same request (with the ?WAIT parameter; with a reduced waiting time if one was set). In this way, if the client has given up, the redirect is never followed; otherwise the server receives a new waiting request and executes it. In both cases, the original waiting request is finished on the server side. > Well, this is quite fine if the client is a browser - it is totally transparent - but another client (like Topcat) would need to know about that trick. Honestly, I am not keen on this idea, although it limits resource consumption on the server side by avoiding a truly unlimited wait while still behaving like one. The other solution, as Pat said, is to set a timeout (= max server waiting time). I would certainly do the same, unless the "request redirection" solution is fine for you. One of the central design features of UWS is that the server is allowed to make decisions that override the requests from the client in order to save resources - e.g. if the client asks for a destruction time 10 years in the future, the server is allowed to reply with 10 days (or whatever it thinks it can support as a maximum); see http://www.ivoa.net/documents/UWS/20101010/REC-UWS-1.0-20101010.html#DestructionTime for details. Similar reasoning can be applied to WAIT times, and indeed the 3rd paragraph of the http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html#blocking section anticipates that the server might return earlier than the WAIT time if necessary, and the client should be able to deal with this situation - i.e. read the returned PHASE, and not just assume that it has changed because the server has returned. 
As an aside (I am not suggesting this for UWS 1.1) - our services are starting to look rather old-fashioned, as this sort of sophisticated "two-way" flow control is done with WebSockets nowadays - we should perhaps discuss this in Sesto too. > > By the way, I also want to briefly point out that the same problem (client abortion detection) also exists for TAP synchronous requests. > > > To conclude this email, I would like to know whether I have understood the ARCHIVED phase correctly. A job enters this phase ONLY IF the destruction time is reached. Is that right, or is there a manual way for a user to put a job in this phase? > Then, I suppose that a normal ?action=delete request is still enough to really delete the job, isn't it? > There is no direct manual way for the client to put a job into the ARCHIVED phase - it is again a server decision (at least for UWS 1.1 for backwards compatibility) as to whether to implement an ARCHIVED phase. I think that it would be allowable for a server to decide to put a job into the ARCHIVED phase even with an explicit ?action=delete. Regards, Paul. From patrick.dowler at nrc-cnrc.gc.ca Tue May 26 20:28:34 2015 From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler) Date: Tue, 26 May 2015 11:28:34 -0700 Subject: UWS 1.1 In-Reply-To: References: <3BCA40AA-0574-46EC-BDD7-5F7562118DF7@manchester.ac.uk> <555E3E99.4060603@nrc-cnrc.gc.ca> <555F9C51.7010507@nrc-cnrc.gc.ca> <556435CE.7060206@ari.uni-heidelberg.de> Message-ID: <5564BB52.1060302@nrc-cnrc.gc.ca> On 26/05/15 02:38 AM, Paul Harrison wrote: >> Well, this is quite fine if the client is a browser - it is totally transparent - but another client (like Topcat) would need to know about that trick. Honestly, I am not keen on this idea, although it limits resource consumption on the server side by avoiding a truly unlimited wait while still behaving like one. The other solution, as Pat said, is to set a timeout (= max server waiting time). 
I would certainly do the same, unless the "request redirection" solution is fine for you. > One of the central design features of UWS is that the server is allowed to make decisions that override the requests from the client in order to save resources - e.g. if the client asks for a destruction time 10 years in the future, the server is allowed to reply with 10 days (or whatever it thinks it can support as a maximum); see http://www.ivoa.net/documents/UWS/20101010/REC-UWS-1.0-20101010.html#DestructionTime for details. > > Similar reasoning can be applied to WAIT times, and indeed the 3rd paragraph of the http://volute.googlecode.com/svn/trunk/projects/grid/uws/doc/UWS.html#blocking section anticipates that the server might return earlier than the WAIT time if necessary, and the client should be able to deal with this situation - i.e. read the returned PHASE, and not just assume that it has changed because the server has returned. Agreed. Although I think the client should be expected to deal with the redirect correctly, they also have to be able to deal with a server that just lets the wait run until there is an HTTP timeout, and they have to deal with the server unblocking for a change they don't care about... in all of that, I find I prefer the max wait time on the server side as it is consistent with the rest of UWS and easiest to implement. But I think one could provide the convenience of the redirect without violating the spec. Another small fiddly bit: I did implement ?WAIT (with no value) to wait until the max wait time, but I see no value in having WAIT=0 be special and do the same thing... with HTTP timeouts there are no particularly useful wait times that are so large that a client can't come up with a normal value that expresses their actual interest, and no one is really interested in waiting forever :-) So, in my implementation WAIT=0 does exactly that: waits for 0 seconds then returns. 
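For illustration, the server-side behaviour discussed in this thread (poll interval escalating 1, 2, 4, 8 sec and then staying at 8, a 60 sec maximum wait, a bare ?WAIT meaning "up to the maximum", and WAIT=0 meaning "wait 0 seconds") could be sketched roughly as below. This is only a minimal Python sketch under those assumptions, not the actual implementation of any service mentioned here: the function names, the get_phase callback, the injectable sleep, and the clamping of negative values to 0 are all hypothetical.

```python
import time
from typing import Callable, Iterator, Optional

MAX_WAIT = 60.0      # server-side ceiling in seconds (the value quoted in the thread)
MAX_INTERVAL = 8.0   # the poll interval stops escalating at this value


def effective_wait(requested: Optional[float], max_wait: float = MAX_WAIT) -> float:
    """Clamp the client's WAIT value to what the server is willing to offer.

    None models a bare ?WAIT with no value: wait up to the maximum.
    WAIT=0 is not special in this sketch: it means "wait 0 seconds".
    Negative values are clamped to 0 (an assumption of this sketch).
    """
    if requested is None:
        return max_wait
    return max(0.0, min(requested, max_wait))


def poll_intervals(budget: float, max_interval: float = MAX_INTERVAL) -> Iterator[float]:
    """Yield sleep durations 1, 2, 4, 8, 8, ... cut off so they sum to the budget."""
    interval, remaining = 1.0, budget
    while remaining > 0:
        step = min(interval, remaining)
        yield step
        remaining -= step
        interval = min(interval * 2, max_interval)


def block_on_phase(
    get_phase: Callable[[], str],
    start_phase: str,
    requested: Optional[float],
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the phase differs from start_phase or the wait budget is spent.

    The current phase is returned either way, so the caller has to read it
    rather than assume that a return implies a phase change.
    """
    phase = get_phase()
    for step in poll_intervals(effective_wait(requested)):
        if phase != start_phase:
            break
        sleep(step)
        phase = get_phase()
    return phase
```

With this shape, a request for WAIT=600 is silently reduced to 60 sec, and a job that is still EXECUTING when the budget runs out is returned unchanged, which is the "normal return without a phase change" case the client has to handle.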
-- Patrick Dowler Canadian Astronomy Data Centre National Research Council Canada 5071 West Saanich Road Victoria, BC V9E 2E7 250-363-0044 (office) 250-363-0045 (fax) From patrick.dowler at nrc-cnrc.gc.ca Thu May 28 00:07:12 2015 From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler) Date: Wed, 27 May 2015 15:07:12 -0700 Subject: UWS 1.1 In-Reply-To: <5564BB52.1060302@nrc-cnrc.gc.ca> References: <3BCA40AA-0574-46EC-BDD7-5F7562118DF7@manchester.ac.uk> <555E3E99.4060603@nrc-cnrc.gc.ca> <555F9C51.7010507@nrc-cnrc.gc.ca> <556435CE.7060206@ari.uni-heidelberg.de> <5564BB52.1060302@nrc-cnrc.gc.ca> Message-ID: <55664010.5060803@nrc-cnrc.gc.ca> Our TAP and VOSpace services now support blocking and job listing, more or less as specified in WD-UWS-1.1, except that they don't treat WAIT=0 as special (they just wait for 0 seconds :-). More details in my TAP-1.1 post to the DAL mailing list. -- Patrick Dowler Canadian Astronomy Data Centre National Research Council Canada 5071 West Saanich Road Victoria, BC V9E 2E7 250-363-0044 (office) 250-363-0045 (fax)