DALI comments

Tom McGlynn (NASA/GSFC Code 660.1) tom.mcglynn at nasa.gov
Mon Feb 29 21:46:31 CET 2016


While I'm often agnostic on the merits of most of Markus' recent 
comments on the DALI specification, they have at least prompted me to 
read the document myself.  I've got a number of editorial comments on 
the document.  Given the central role of this document for accessing all 
VO data, its clarity is particularly important. These comments refer to 
version 1.1 of the document.

[When I started this was just going to be two or three points, but it 
grew in the writing.  Hope this is helpful.  The points are in the 
document order pretty much.]

     Tom


1. The introductory sentence is very nice: the document is going to 
describe resources, parameters and responses.  The next three sections 
deal with each of these in turn.   The rest of the intro is IVOA 
gobbledygook but that's OK.  This structure is simple and easy to 
follow.   I wish more of our standards had such a straightforward 
organization.

2. However I think that there is something missing in either this 
paragraph or in the beginning of each of the three following sections: a 
clear definition of what a resource, parameter or response is.

This is particularly acute for resource.    The intro in section 2 talks 
about REST and jobs, but this is jumping to the implementation before we 
set any context.  E.g., Section 2 might begin:

--
   DAL services are implemented as a set of resources on the web.  
DAL services use HTTP protocols to support their communications and each 
service must implement multiple URL endpoints to access these 
resources.  This section describes the structure and relationships 
between the required and optional endpoints that are provided by a 
service.  E.g., all services must implement a URL which allows a user to 
ask if a service is available and may implement an endpoint that allows 
for asynchronous access. The conventions described below allow a user to 
infer the URL for each endpoint from some base location.
--

Note that this makes it clear that DAL is an HTTP based protocol and 
introduces URLs.  The current words are very nebulous.  If someone is 
not already clear what's going on, I don't think the words help.
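
To make the idea of inferring endpoint URLs from a base location 
concrete, here is a minimal Python sketch.  The base URL and the 
endpoint names are assumptions chosen for illustration; they are not 
quoted from the DALI text.

```python
from urllib.parse import urljoin

# Hypothetical base URL and endpoint names, for illustration only.
BASE_URL = "http://example.org/dal/service/"
ENDPOINTS = ("availability", "capabilities", "sync", "async")


def endpoint_urls(base_url, names=ENDPOINTS):
    """Map each endpoint name to the URL obtained by appending it to base_url."""
    # urljoin replaces the last path segment unless the base ends with '/'.
    if not base_url.endswith("/"):
        base_url += "/"
    return {name: urljoin(base_url, name) for name in names}


urls = endpoint_urls(BASE_URL)
for name, url in urls.items():
    print(name, url)
```

A client that knows only the base location can then look up, say, 
urls["availability"] without further configuration.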

3.  I think the discussion of the resources available would be cleaner 
if it really were limited to a discussion of the resources. E.g., the 
vast bulk of sections 2.n is spent discussing the responses to 
invocations of resources.  This belongs in section 4.

4. I think there should be a very explicit definition of a job in terms 
of resources.  This is pretty subtle and should not be done en passant 
in the introductory paragraph of section 2.  E.g.,  I think something like:
--
   The invocation of some resources may define a job within that 
service.  Once a job is created new resources may be available to 
monitor or modify the actions of the service.  E.g., a service may allow 
users to create an asynchronous query as a job.  The response to the job 
creation resource will normally include an identifier for the job which 
is used in the specification of resources to monitor, cancel or get the 
results of the query.  The resources associated with a specific job will 
normally have the job identifier as part of the URL.  [I'd include a 
specific example of a job creation and then subsequent job resources 
that are available.]
--

Again the idea is to start with the three things we're going to talk 
about and build up from them.
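
A sketch of what such a job example might look like, in Python.  The 
asynchronous endpoint URL, the job identifier and the child resource 
names ("phase", "results") are all assumptions for illustration, not 
text taken from DALI.

```python
def job_resource_urls(async_url, job_id):
    """Build the URLs of the resources tied to one job.

    The job identifier becomes a path segment under the asynchronous
    endpoint; the child names used here are hypothetical.
    """
    job_url = f"{async_url.rstrip('/')}/{job_id}"
    return {
        "job": job_url,                    # the job itself
        "phase": f"{job_url}/phase",       # monitor or change execution state
        "results": f"{job_url}/results",   # retrieve the output
    }


# Hypothetical values: the identifier would come from the job-creation response.
job_urls = job_resource_urls("http://example.org/dal/service/async", "job42")
for name, url in job_urls.items():
    print(name, url)
```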

5.  Keeping this restricted to the actual calls would more clearly 
expose the structure of DALI than the current words which hide the 
structure in a clutter of details of the responses.

6. The discussion of parameters has some issues too...  E.g., why do we 
start with what purports to define a DALI job when what we need to 
define is a parameter?  We need to define things in terms of the three 
elements we're talking about in this section: resources, parameters, and 
responses.  So I'd start with something like:

--
   A parameter is a key-value pair that is passed to a resource to 
control the response of the resource.    When a user creates a job, the 
parameters for the job may be specified in either the initial 
job-creating resource invocation or in subsequent calls to job resources 
if this is supported by the DAL protocol for the service.
--

7. We should be specific about how we pass in parameters.  We are 
assuming the standard CGI formats when we use the '=' notation later 
on.  So just call it out.  [I don't know if there is a formal reference 
for this but it should be referenced if appropriate.]  If you don't
want to assume this structure for passing in parameters, at least note 
that this is one way we can send them in.

--
Most DAL services use standard web conventions for passing parameters.  
Parameters using the standard URL encoding use a
   key=value
syntax where the usual encoding rules for the key and value are observed 
when the value is used within a URL or a POST stream. Other encodings 
are possible and multipart-form encoding is mandatory in any resource 
invocation which involves a file upload. However in this document 
we conventionally display the parameters using the unencoded key=value 
string even though other encodings may be supported.
--
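
The difference between the displayed key=value form and what actually 
travels in a URL or POST body can be shown with a short Python sketch 
using the standard library.  The parameter names are the ones mentioned 
in these comments; the MAXREC value is made up.

```python
from urllib.parse import urlencode, parse_qs

# Parameters as a document would display them (unencoded key=value).
params = [("UPLOAD", "table3,param:t3"), ("MAXREC", "10")]

# Standard URL (application/x-www-form-urlencoded) encoding:
# reserved characters such as ',' and ':' are percent-encoded.
encoded = urlencode(params)
print(encoded)

# Decoding on the service side recovers the original values.
decoded = parse_qs(encoded)
print(decoded)
```

The same string can be appended to a URL after '?' or sent as the body 
of a POST.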

8.  Section 3.3 does not belong in section 3.  It very explicitly notes 
that it discusses values in both parameters and responses.  If so it 
needs to go in a new section "5. Literal values".   Section 5 should 
then be explicitly referenced in sections 3 and 4.


9.  I think that you do readers a disservice by not being more explicit 
in defining how integers and real numbers are to be represented and 
using an obscure reference.  Not even a link in the version I'm seeing.  
Reading this I've no idea if octal or hex numbers are supported.  Is 
exponential notation?  Can I use e and E in the exponent?  At least give 
the user a taste of valid formats. As always examples are good!  Examples 
of what's not supported too.
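
As an example of the kind of explicit statement that would help, here is 
a Python sketch of one possible grammar for real-number literals.  The 
grammar is an assumption for illustration only: it is not what DALI 
specifies, which is exactly the information that is hard to find.

```python
import re

# Assumed grammar, NOT taken from the DALI text: optional sign, decimal
# digits, optional fraction, optional exponent introduced by 'e' or 'E'.
_REAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def is_decimal_literal(text):
    """Return True if text is a plain decimal number under the assumed grammar."""
    return _REAL.fullmatch(text) is not None


examples = ["42", "-1.5", "1e10", "6.02E23", "0x1F", "1,5", "NaN"]
accepted = [s for s in examples if is_decimal_literal(s)]
rejected = [s for s in examples if not is_decimal_literal(s)]
print("accepted:", accepted)
print("rejected:", rejected)
```

Under this assumed grammar a string such as 017 is read as a decimal 
number rather than octal, which is the sort of question the text should 
settle one way or the other.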

10. Section 3.3.2 is self-contradictory.  First it states that all date 
and time values must be represented using the ISO-like FITS format.
Then later it has "where values may be expressed using Julian dates."   
But by the first sentence that is never.  I'm not quite sure what
is meant here.
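
For readers unfamiliar with the two representations at issue, here is a 
small Python sketch converting an ISO-like timestamp to a Modified 
Julian Date.  It assumes the timestamp is in UTC and ignores leap 
seconds; the sample value is arbitrary.

```python
from datetime import datetime

# MJD counts days from 1858-11-17T00:00:00; JD = MJD + 2400000.5.
MJD_EPOCH = datetime(1858, 11, 17)


def iso_to_mjd(timestamp):
    """Convert 'YYYY-MM-DDThh:mm:ss' (assumed UTC, no leap seconds) to MJD."""
    moment = datetime.fromisoformat(timestamp)
    return (moment - MJD_EPOCH).total_seconds() / 86400.0


mjd = iso_to_mjd("2000-01-01T12:00:00")
jd = mjd + 2400000.5
print(mjd, jd)
```

The two forms carry the same information, so the document needs to say 
plainly in which contexts each one is allowed.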

11. In 3.3.2 I think it would be helpful to clarify if the boolean 
values are case sensitive.   E.g., can I use TRUE or just true?

12.  I can't follow what 3.4.2 is saying.

13. I find the discussion of VOTable encodings a bit out of place but 
I'm not sure why.  I wonder if this structure is causing some 
contortions.  If we separate the parameter encoding and the VOtable 
encodings, then we've no problem allowing a single-valued numeric 
parameter to be interpreted as a range.  I.e., user specifies
     band=1
which gets interpreted as band=1 1 so that the current discussion (in 
other DAL email) of array=2 versus array=2* is moot.
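
A minimal Python sketch of that interpretation, assuming a range is 
written as two whitespace-separated numbers.  The function name and the 
error handling are hypothetical.

```python
def parse_interval(value):
    """Parse a parameter value into a (low, high) pair.

    A single number is treated as the degenerate range (x, x); two
    whitespace-separated numbers are treated as (low, high).
    """
    parts = [float(p) for p in value.split()]
    if len(parts) == 1:
        return (parts[0], parts[0])
    if len(parts) == 2:
        return (parts[0], parts[1])
    raise ValueError(f"expected one or two numbers, got {len(parts)}: {value!r}")


print(parse_interval("1"))
print(parse_interval("1 1"))
print(parse_interval("1 2.5"))
```

With this rule band=1 and band=1 1 are indistinguishable to the service.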

14.  Aaargh.  I do have a substantive comment.  The discussion of 3.4.4 
allows MAXREC to fail because there is no data matching the request (as 
I read it) or to succeed regardless.  I don't think this is right.  
Either it should be a way of getting the metadata regardless of whether 
there is matching data, or it should always check that there is matching 
data and use the overflow indicator to indicate whether any data was 
found (i.e., if there would be data, then the overflow is set to yes).
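
The second option can be sketched in a few lines of Python.  This is an 
illustration of the behaviour proposed here, not of what the current 
text requires; the function and its return shape are hypothetical.

```python
def limited_response(matching_rows, maxrec):
    """Return (rows, overflow) for a request limited to maxrec rows.

    The query is always evaluated; overflow is True whenever more rows
    matched than were returned.  With maxrec == 0 the overflow flag
    therefore reports whether any data matched at all.
    """
    if maxrec < 0:
        raise ValueError("maxrec must be non-negative")
    rows = list(matching_rows)
    return rows[:maxrec], len(rows) > maxrec


print(limited_response(["a", "b", "c"], 0))   # metadata-only request, data exists
print(limited_response([], 0))                # metadata-only request, no data
print(limited_response(["a", "b", "c"], 5))   # everything fits
```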

15.   The discussion in 3.4.5 is confusing.  We first say we would have 
this parameter
     UPLOAD=table3,param:t3
and this content where we then put multipart-form data in great detail.  
However the UPLOAD parameter will also have been
encoded this way, not as a simple key=value string.    So I'd add some 
caveat like:
--
  Note that in this case the UPLOAD parameter would also be encoded 
using the multipart/form-data encoding but we
have presented it as a simple key/value pair.
--

... or just note that you're using multipart form encoding and that 
given the UPLOAD parameter above the CGI parameter name
for the file upload should be t3.
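
To show that both the UPLOAD parameter and the file travel as parts of 
the same multipart body, here is a hand-rolled Python sketch.  The 
boundary string, file name, content type and file content are 
placeholders chosen for illustration.

```python
def multipart_body(fields, files, boundary):
    """Assemble a multipart/form-data body as a string.

    fields: {name: value} for ordinary parameters.
    files:  {name: (filename, content_type, content)} for uploads.
    """
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}",
            f'Content-Disposition: form-data; name="{name}"',
            "",
            value,
        ]
    for name, (filename, content_type, content) in files.items():
        lines += [
            f"--{boundary}",
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"',
            f"Content-Type: {content_type}",
            "",
            content,
        ]
    lines += [f"--{boundary}--", ""]
    return "\r\n".join(lines)


body = multipart_body(
    fields={"UPLOAD": "table3,param:t3"},
    # Placeholder file: the part name 't3' matches the 'param:t3' reference.
    files={"t3": ("table3.xml", "application/x-votable+xml", "<VOTABLE/>")},
    boundary="example-boundary",
)
print(body)
```

The UPLOAD value points at the part named t3, and neither appears as a 
plain key=value string on the wire.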

16. The first sentence in section 4 jumps a little too quickly to 
implementation.  I'd suggest something like:

--
The output of the resource invocation is returned as a response using 
the HTTP protocol.  The response indicates the status of the request.  
It either  directly provides the requested information or directs the 
user on how and where to find the desired data.
E.g., the response to an availability request will include whether the 
service is ready, while a request to initiate an asynchronous query 
typically returns a job ID that can be used to construct resource URLs 
to monitor the progress of the job and eventually retrieve the output.  
In HTTP terms, DAL service responses can be of three types...
--

17.  I think much of the discussion in section 2 discussing the 
responses to requests belongs in section 4.

18.  I'd suggest the Content-Type be mandatory.  Not sure I care about 
the other HTTP headers.

19.  The discussion of OVERFLOW makes handling it more complex since 
we have to handle two cases rather than
the one used in TAP.  Not sure how it helps.



