John S. Erickson, Ph.D. left a rather interesting question on my earlier post HTTP RFC paraphrased for the Web of Data, quoted here in full:
Here's a question: are GET and PUT symmetrical? Perhaps my question is a bit out-of-band, because the context of my question is really about the symmetry of content negotiation. Implicit in RFC 2616 section 14 is the existence of a (potentially) wide variety of alternate representations that may be accessed via the HTTP/1.1 request header fields; in the world of linked data these are useful for differentiating between text/html, application/rdf+xml, text/rdf+n3, etc.
So maybe the heart of my question is this: in a linked data world where conneg is alive and well (and assumed), what should be the correct interpretation of RFC 2616's "(if) the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server"? If I can GET a resource to the resolution of a MIME type, should I not expect to PUT to that same resolution as well? Or, if during a GET the server responds with a list of Alternates, shouldn't I expect to PUT using any of those choices?
Readers familiar with MIME-typed disseminators in the digital repository world (think: Fedora) might think of this as a question about MIME-typed "ingestors"; it is a similar issue: instead of requesting a particular manifestation of the object, this is about updating a specific manifestation (possibly updating the whole).
It's a very good question, and I'm perhaps a bit out of my rank in answering, but to me it seems easy (perhaps too easy) to answer... here goes:
You can request a GET on anything with an HTTP URI.
HTTP URIs can identify anything; however, for the purposes of this post let's limit URIs to identifying three distinct things:
- Stored Hypermedia (digital documents, audio, video, text files ... any "file" physically stored on the server and web accessible.)
- Web Services (a data-accepting process, a gateway to some other protocol, a separate entity that accepts annotations ... Nota bene: if you are still reading then you know what I'm referring to, so no need to be pedantic about my choice of terminology)
- Things (the sky, Dave's car, me ... and in this context not hypermedia or web application endpoints)
Let's look at what we can successfully PUT. RFC 2616 section 9.6 tells us:
The PUT method requests that the enclosed entity be stored under the supplied Request-URI ... the URI in a PUT request identifies the entity enclosed with the request ...
We can derive from the above that only "Stored Hypermedia" can successfully be PUT; and thus we cannot successfully PUT a "Web Service" or a "Thing".
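To make that reading of PUT concrete, here is a minimal in-memory sketch in Python. The DocumentStore class, its method names and the example URI are hypothetical and not part of any HTTP library; the sketch only models "store the enclosed entity under the supplied Request-URI" and assumes a server that answers 201 for a new document and 200 for a replaced one.

```python
class DocumentStore:
    """Toy model of a server that only holds stored documents (files)."""

    def __init__(self):
        # Maps a URI to the (media type, body) pair stored under it.
        self._documents = {}

    def put(self, uri, media_type, body):
        """Store the enclosed entity under the URI.

        Returns 201 when a new document is created and 200 when an
        existing one is replaced (an assumption about this toy server).
        """
        status = 200 if uri in self._documents else 201
        self._documents[uri] = (media_type, body)
        return status

    def get(self, uri):
        """Return (status, media type, body) for the stored document."""
        if uri not in self._documents:
            return 404, None, None
        media_type, body = self._documents[uri]
        return 200, media_type, body


store = DocumentStore()
uri = "http://example.org/docs/me.rdf"  # hypothetical URI
entity = b"<rdf:RDF/>"  # placeholder bytes, not a real description

first = store.put(uri, "application/rdf+xml", entity)
second = store.put(uri, "application/rdf+xml", entity)
fetched = store.get(uri)
```

In this model GET hands back exactly the bytes and media type that were PUT, which is the only case where the two methods look symmetrical.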
Nota bene: I'm assuming here that you (the reader) already know all about CONNEG, Linked Data and the multiple questions which have arisen on how to handle content negotiation, especially with regard to RDF and alternative representations.
In the Linked Data world we use the RDF data model extensively. We describe things with statements; each statement is a subject-predicate-object expression which we refer to as a triple. Strap enough of these triples together and you have a description of a thing. This is our RDF.
To pass these descriptions around we can serialize our RDF statements in a variety of machine-readable formats: application/rdf+xml and text/rdf+n3, amongst others.
An RDF Document is where we persist one of these serializations in a document, id est Stored Hypermedia.
So to store an RDF document we would logically use PUT, and then GET it later, or perhaps DELETE it. We certainly wouldn't POST to an RDF Document, any more than we would POST to a GIF image! When thinking about an RDF Document think of it as Hypermedia, exempli gratia an AVI, JPEG or WAV, and treat it as such: a single document, in a single format.
RDF is not one of its serializations, and it most certainly isn't an "RDF Document".
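The difference between the RDF and its serializations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the example.org identifiers are made up, the first format is N-Triples-style lines restricted to URIs (no literals or blank nodes), and the second is an ad hoc JSON layout invented here, not a standard RDF serialization.

```python
import json

# The RDF itself: a set of (subject, predicate, object) triples.
# All identifiers below are hypothetical.
TRIPLES = {
    ("http://example.org/dave", "http://example.org/owns", "http://example.org/daves-car"),
    ("http://example.org/me", "http://example.org/knows", "http://example.org/dave"),
}


def to_ntriples(triples):
    """Serialize as N-Triples-style lines (URIs only)."""
    return "".join(f"<{s}> <{p}> <{o}> .\n" for s, p, o in sorted(triples))


def from_ntriples(text):
    """Parse the lines written by to_ntriples back into a set of triples."""
    triples = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        s, p, o, _dot = line.split(" ")
        triples.add((s[1:-1], p[1:-1], o[1:-1]))  # strip the angle brackets
    return triples


def to_json(triples):
    """Serialize as an ad hoc JSON array of three-element arrays."""
    return json.dumps(sorted(triples))


def from_json(text):
    """Parse the ad hoc JSON layout back into a set of triples."""
    return {tuple(triple) for triple in json.loads(text)}


as_ntriples = to_ntriples(TRIPLES)
as_json = to_json(TRIPLES)
```

The two strings differ byte for byte, yet both parse back to the same set of triples; either string saved to a file would be a document, while the set of triples is the RDF.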
A major factor which compounds the confusion around content negotiation is the Web community's fixation on thinking about the response to a GET as if it were Stored Hypermedia, especially when the MIME type associated with the returned entity is something like text/html or application/rdf+xml.
In common language what you are reading is a "web page", implying that it is a Document and Stored Hypermedia.
In reality what you are viewing is an entity returned by a GET request to a Web Service, and that entity has a MIME type of text/html. What you are reading is actually just some data I input into a database via a POST to a CMS (WordPress); there are other things here too: presentation data, navigation on the right and so forth. Leave a comment and the data I entered will stay the same, but the entity returned when you next GET this URI will be different. This certainly is not Stored Hypermedia (as we defined it above).
Considering the conneg questions, ask yourself "are these Alternatives Stored Hypermedia or simply entities returned by a web service?". I'm willing to take a punt here and say that your answer will be the latter, which means that you'd be getting the data into your system via a POST to a web service, and not a PUT.
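One way to picture "entities returned by a web service" is a service that picks a media type from the Accept header and generates the entity on request. The Python sketch below is a deliberately simplified illustration: parse_accept and negotiate are hypothetical helper names, only q-values are honoured, ties keep header order, and the finer precedence rules of RFC 2616 section 14.1 are ignored.

```python
def parse_accept(header):
    """Return [(media_range, q)] sorted by descending q (simplified)."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        ranges.append((media_range, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def negotiate(accept_header, available):
    """Pick the first available media type matching the best-ranked range."""
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for candidate in available:
            if media_range in ("*/*", candidate):
                return candidate
            if media_range.endswith("/*") and candidate.startswith(media_range[:-1]):
                return candidate
    return None  # a real server would typically answer 406 Not Acceptable


# Media types this hypothetical service can generate from the same data.
AVAILABLE = ["text/html", "application/rdf+xml", "text/rdf+n3"]

chosen = negotiate("application/rdf+xml;q=0.9, text/html;q=0.5", AVAILABLE)
```

Nothing here is a file on disk; each alternative is produced when asked for, which is why PUT does not obviously apply to it.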
Further, if you think about the "RDF" you would no doubt be posting to create these alternatives, it may well make sense to think of it as POSTing RDF in a serialized form which the Web Service accepts. The Web Service will then (all going well) persist this RDF, or the triples, in something like a QuadStore. This serialized RDF is not an RDF Document; it's just serialized RDF in a request entity, like any other data. So think of serialized RDF like JSON, base64-encoded strings, or the egg in ?name=egg&val=shell; it's "just" data.
All of the above is driven home when you consider HTTP URIs that identify "Things". You can't send the Thing in an entity, thus if you get any response from a server when you HTTP GET the URI for a thing, then that URI must be serviced by a Web Service (as defined above).
And now that you know it's a Web Service, logic would dictate that you should POST back to it...
The POST method is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a uniform method to cover the following functions: - Annotation of existing resources; ...
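As a rough sketch of that idea, the toy Python service below answers for a URI that identifies a Thing: it accepts triples via POST, generates a description on GET, and refuses PUT. The DescriptionService class is hypothetical, and answering 405 with an Allow header is just one way a server could signal that PUT is not supported; it is an assumption of this sketch, not something the text above prescribes.

```python
class DescriptionService:
    """Toy web service answering for a URI that identifies a Thing."""

    ALLOWED = ("GET", "POST")

    def __init__(self):
        # Stand-in for a triple store or QuadStore.
        self.triples = set()

    def handle(self, method, triples=()):
        """Return (status, headers, body) for a request to the Thing's URI."""
        if method not in self.ALLOWED:
            # The Thing itself cannot be stored, so PUT is refused.
            return 405, {"Allow": ", ".join(self.ALLOWED)}, None
        if method == "POST":
            # Treat the posted triples as annotations of the Thing.
            self.triples.update(triples)
            return 204, {}, None
        # GET: generate a description from whatever has been persisted.
        return 200, {}, sorted(self.triples)


service = DescriptionService()
refused = service.handle("PUT")
posted = service.handle(
    "POST",
    [("http://example.org/daves-car", "http://example.org/colour", "http://example.org/red")],
)
described = service.handle("GET")
```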
Answering the questions!
After all of that, on to the questions.
- Are GET and PUT symmetrical?
Only in the context where the URI requested identifies Stored Hypermedia.
- If I can GET a resource to the resolution of a MIME type, should I not expect to PUT to that same resolution as well?
No, as by this point you know that the URI/MIME combo you requested definitely does not represent Stored Hypermedia (a Document).
- Or, if during a GET the server responds with a list of Alternates, shouldn't I expect to PUT using any of those choices?
You shouldn't expect to, as you have no way of knowing whether each URI returned identifies Stored Hypermedia, a Web Service or another Thing. With every new URI the cycle starts again.
Please don't take any of this as gospel; it's all IMO. I do at least hope it makes sense though, and if it doesn't, please do correct me ASAP :)

Nathan:
HTTP allows clients to share data representations. These can be data from a server or data from a client. Clients can request data representations (GET) and can send data representations (POST, PUT). They can also make requests for servers to DELETE something.
The representation formats for read operations (GET) and write operations (POST, PUT) need not be (and often are not) symmetrical. Representations sent by clients need not contain hypermedia links to be PUT-able.
You are correct in identifying that the semantics of POST indicate a "subordinate" aspect where PUT does not. This is an important aspect and has nothing to do with representation formats.
Mike:
I've completely used the wrong term here (Stored Hypermedia) in my naivety haven't I?! The second I read your doc on Hypermedia Types I realised.
What I'm trying to refer to by Hypermedia in this case is just a "file", but the term Document implies different things to different readers. Obviously "stored hypermedia" is grossly wrong; can you suggest a better (correct and unambiguous) term for it?
Thanks :)
Thanks for taking a stab at my question, Nathan! I dope-slapped myself after asking the question as I remembered that the Atom Publishing Protocol (APP) clarifies both the PUT and POST issues.
See RFC 5023, esp. Section 5: Protocol Operations and Section 9: Creating and Editing Resources. In the second case, a full example of a POST is given, showing the proper specification of the content-type.
Curious if, and how, the clarification that APP provides might impact your thinking?
http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-08#section-7.6
proposed update to 2616 in httpbis committee.
Personally, I think that using content negotiation as a way of getting both a document and its metadata is a bad idea, as it ties down the abstract notion of associating access to a resource and access to "information about the resource" too tightly to the HTTP protocol.
I wish the "Semantic Web" wouldn't tie itself to HTTP; e.g., peer-to-peer distribution doesn't have content negotiation.
Larry
Nathan:
To the HTTP app-protocol, the URI identifies a "Resource." In your post here you categorize these resources into three types (hypermedia/files, services, things).
Seems that what you are really saying is that some resources are WRITE-able (your file type), others are not (service, thing). While that may be true, there is nothing in the HTTP protocol that addresses this and, to my knowledge, nothing in common markup formats (XHTML, XML, HTML, RDF, etc.) that makes that distinction, either.
Thanks for your comments,
Mike: fully agree, and yes "there is nothing in the HTTP protocol that addresses this", for a good reason I feel. Also noted that a URI is just a URI; assuming what is at the end of the URI (should you look it up / request it) is frankly wrong.
Larry: fully agree, the semantic web shouldn't tie itself to HTTP. As far as I'm concerned all that matters is that we have triples in the form of ( identifier , identifier , identifier ) where each identifier can be looked up in some manner for more information about the thing it identifies.
John:
As for APP, it's fantastic and a great example of the way things should be done, but it doesn't really touch on what we're discussing above as far as I can see. To create resources you POST to the URI of a Collection and (on success) get back a URI for the Member that was created; from this point on you PUT/DELETE to the URI of a Member. It doesn't say anything (afaik?) about sending a PUT to an undefined URI, nor does it say anything about content negotiation (on the Accept header). And just to reaffirm what I personally think: this is the way things should be done. The Atom Publishing Protocol is clear on how you handle publishing Atom; everything else is out of scope.
Everyone:
Basically I want to defer this back to all of you, my peers. I've only got 10 years behind me, and not all at this level of discussion, so here are two statements which I'd like you to respond to in some way, even if it's just a "wrong" or "sounds good".
Statement One: the link between an RDF description and the thing it's describing should always be a one-way link, from RDF to thing. For Linked Data to work this must *always* be true; it's no good an HTML document linking to its description if "London" or "Bob's car" or even a GIF image can't.
Statement Two: With regard to HTTP Linked Data, content negotiation (on the Accept header) should only be used to accept the same RDF serialized as different MIME types.
Please do rip those statements to shreds; I'm very keen to have clarity on these matters.
note: all of these posts are just me learning and thinking out loud so I can get some feedback from the community as to where I'm going wrong, thus, thanks for the feedback :) it's very appreciated.
Nathan:
"...content negotiation (on the accept header) should only be used to accept the same RDF serialized as different mime/types."
This makes sense when the representation formats are very similar, but may not make sense for widely varying representation formats. For example, when negotiating for a graph image representation (image/png) of the same resource delivered via a text-based representation format (application/atom+xml) it is likely that not all the same data elements will be returned, but the same _meaning_ will be conveyed.
It's a subtle difference, but one that relieves developers from trying to figure out ways to send _all_ the same data in _all_ possible formats and, conversely, dismissing some representation formats as inappropriate if _all_ the data cannot be represented.
Let the "shredding" begin! ;)
Seriously, it's great for you to stand up and make assertions like these; it's helpful to everyone!
RE Statement 1: I think you are saying that we should not rely upon, and cannot count on, "things" making assertions about themselves. The bit I might challenge you on is the relationship of various representations to the "thing" and to their "siblings."
Let me explain: I believe that it is legitimate to have an HTTP URI for a thing, and for there not to be any representations of that "resource" itself. I further believe it is legitimate to have multiple HTTP URI-named resources "related" to that thing; these can be widely varied in nature, from text/html to text/pdf to application/rdf+xml to foo/barumph, and the only requirement should be that they are all "plausibly" related. By "plausibly," I simply mean that there exists some predicate; it could be foo:anythingButRelated.
I think this gets to your Statement 2: conneg should be used for whatever conneg can be used for. But yes, it is expected and SHOULD be the case that any representation of RDF returned (a) should refer to the same thing and (b) should be logically consistent, or "the same RDF" as you say. Similarly, one would expect text/html and text/pdf representations to be consistent. But it is important to note that these are semantics imposed by a particular application space and not by the HTTP 1.1 specification; providers are free to do what they want, as long as they don't break the protocol. They might soon lose adopters, however...
The caveat in the paragraph above is that HTTP is stateless. One way to interpret this is, technically it is not an error if I PUT an update for a URI, immediately GET it (same content-type) and have it be different. Also, technically, it is not an error for
Finally, I respectfully disagree with what you've said about APP; I do think it applies here. I think APP is a great example of a symmetrical application of GET and PUT for a given specific content-type. No, APP doesn't say anything specifically about conneg, but that is implied by the explicit inclusion of content-types in the request headers. But I might be stretching it a bit, suggesting (beyond APP) that simply because I can GET from (e.g.) a set of alternative types, that I should expect to PUT the same variety.
I'm not sure about your point regarding PUTs to undefined URIs; as the spec says, GETs and PUTs (and DELETEs) can only be requested for defined URIs, whereas POSTs are about creating resources. Perhaps I'm missing your point there.