<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>webr3.org &#187; linked data</title>
	<atom:link href="http://webr3.org/blog/category/linked-data/feed/" rel="self" type="application/rss+xml" />
	<link>http://webr3.org/blog</link>
	<description>brain&#039;s on fire!</description>
	<lastBuildDate>Tue, 19 Jul 2011 15:38:29 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>The simplest view possible of httpRange-14.</title>
		<link>http://webr3.org/blog/linked-data/the-simplest-view-possible-of-httprange-14/</link>
		<comments>http://webr3.org/blog/linked-data/the-simplest-view-possible-of-httprange-14/#comments</comments>
		<pubDate>Thu, 03 Mar 2011 20:46:19 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[linked data]]></category>
		<category><![CDATA[Uniform Resource Identifier]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=480</guid>
		<description><![CDATA[Here's an even simpler way of looking at this..

a URI is associated with a thing by a group of agents/people as a name for that thing.
some URIs are also associated with a set of representations over time by the dereferencing process.

Why were the representations made available for that URI?

because somebody made a web page and [...]]]></description>
			<content:encoded><![CDATA[<p>Here's an even simpler way of looking at this..</p>
<ul>
<li>a URI is associated with a thing by a group of agents/people as a name for that thing.</li>
<li>some URIs are also associated with a set of representations over time by the dereferencing process.</li>
</ul>
<p>Why were the representations made available for that URI?</p>
<ul>
<li>because somebody made a web page <em>and then</em> needed a uri to refer to it.</li>
<li>because somebody named something with a uri <em>and then</em> wanted to provide information about it.</li>
<li>because somebody made a web page about one specific thing <em>and then</em> needed a uri to refer to it <em>and then</em> the uri became commonly used to refer to the thing named.</li>
</ul>
<p>That's a really minimal set of the different ways of looking at it,  without getting in to any technical details at all, all three are really  common cases of how people use URIs, the in-fighting is just people  who've picked one of the three as being gospel, or technically required  to make things work.</p>
<p>It's a social problem.</p>
<p>The httpRange-14 resolution picked the first of the above reasons as being the norm, and as requiring the least technical trade-offs. The resolution also accounted for the second case, with precedence given to the importance of having distinct names, rather than network performance or ease of implementation, again simply a design trade-off, one which prioritizes humans over machines.</p>
<p>I definitely cannot explain it any simpler than that.</p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/linked-data/the-simplest-view-possible-of-httprange-14/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Opening Linked Data</title>
		<link>http://webr3.org/blog/linked-data/opening-linked-data/</link>
		<comments>http://webr3.org/blog/linked-data/opening-linked-data/#comments</comments>
		<pubDate>Fri, 26 Nov 2010 11:51:29 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[linked data]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[Blank node]]></category>
		<category><![CDATA[Computer file formats]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[Data management]]></category>
		<category><![CDATA[dom]]></category>
		<category><![CDATA[FOAF]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[JavaScript]]></category>
		<category><![CDATA[JSON]]></category>
		<category><![CDATA[King]]></category>
		<category><![CDATA[Knowledge representation]]></category>
		<category><![CDATA[Markup languages]]></category>
		<category><![CDATA[Metadata]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Procedural programming languages]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[RDF Schema]]></category>
		<category><![CDATA[RDF/XML]]></category>
		<category><![CDATA[RDFa]]></category>
		<category><![CDATA[registered media]]></category>
		<category><![CDATA[Resource Description Framework]]></category>
		<category><![CDATA[sem web community]]></category>
		<category><![CDATA[Semantic HTML]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[Technology/Internet]]></category>
		<category><![CDATA[the king]]></category>
		<category><![CDATA[Turtle]]></category>
		<category><![CDATA[Twitter Inc]]></category>
		<category><![CDATA[typical web developer]]></category>
		<category><![CDATA[Uniform Resource Identifier]]></category>
		<category><![CDATA[URI scheme]]></category>
		<category><![CDATA[web developer]]></category>
		<category><![CDATA[Web Developers]]></category>
		<category><![CDATA[Web services]]></category>
		<category><![CDATA[web standard format]]></category>
		<category><![CDATA[Web standards]]></category>
		<category><![CDATA[World Wide Web]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=390</guid>
		<description><![CDATA[Linked Data has done fantastically well so far, but, compared to how well it could be doing, given the calibre and amount of data that's been opened up, it's not doing too well at all.
Why? well the sem web community is packed full of the most technically skilled and decent people I've come across so [...]]]></description>
			<content:encoded><![CDATA[<p>Linked Data has done fantastically well so far, but, compared to how well it could be doing, given the calibre and amount of data that's been opened up, it's not doing too well at all.</p>
<p>Why? well the sem web community is packed full of the most technically skilled and decent people I've come across so far, so it can't be that, the tooling is pretty damn good, there's loads of data, most of the data's of a high quality, wanted by developers, and certainly more than usable. The concepts, theory and technical aspects are all solid as a rock. In short it's all good, apart from one rather important detail.</p>
<p>Our Linked Open Data isn't really open data, not in the eyes of the common web developer at least. To most web developers on the planet, open data is something they can get access to and use easily.</p>
<p>A good example of 'open data' for most developers, is the Twitter API.</p>
<p>Here's how a developer accesses it in PHP:</p>
<pre><code>  $uri = 'http://api.twitter.com/1/statuses/show/7907258268647424.json';
  $tweet = json_decode(file_get_contents($uri));
  echo $tweet->user->description;
</code></pre>
<p>Here's how they access it in Javascript:</p>
<pre><code>  uri = 'http://api.twitter.com/1/statuses/show/7907258268647424.json';
  $.getJSON(uri, function(tweet) {
    write( tweet.user.description );
  });
</code></pre>
<p>The reality of the matter is that you can't do this with Linked Open Data, and that's because you can't do it with RDF - and really, honestly, if it's not that simple, the masses won't use it, because, if it's not that simple, they <i>can't</i> use it.</p>
<h3>The problems with the RDF formats</h3>
<p>They're not perfect, and they are a very mixed bunch, for a change, let's look at the negatives.</p>
<p><strong>RDF/XML</strong><br />
 - Requires full XML tooling<br />
 - Can't read or write by hand<br />
 - butt ugly</p>
<p>Let's be honest, unless you have a full XML and RDF stack and you know what you're doing, RDF/XML is simply a no go zone.</p>
<p><strong>RDFa</strong><br />
 - Requires HTML/DOM/XML tooling<br />
 - An extension to a markup language, designed to augment annotated documents.</p>
<p>RDFa is great, but not for the general <i>data</i> use case, it's not a simple data interchange format like JSON, and you can't publish or consume it without specialist tooling, in fact it requires an even bigger, more complicated, stack than RDF/XML.</p>
<p><strong>Turtle</strong><br />
 - Requires a custom parser<br />
 - It's not <i>yet</i> seen as a data format by the masses.<br />
 - Doesn't have a registered media type.</p>
<p>If we're all honest, Turtle is the king of RDF serializations, it's small, powerful, easy to write and read, requires minimal tooling to parse. In fact, I'd quite happily say that Turtle is the best all round data format, period.</p>
<p><strong>RDF/JSON and JSON-LD</strong><br />
 - Square peg, round hole</p>
<p>You can't shoehorn RDF into JSON, no matter how hard you try - well, you <i>can</i>, but you lose all the benefits of JSON in the first place, because the data is RDF, triples and not objects, rdf nodes and not simple values - I fear I'd better explain, quickly:</p>
<p>The benefit of JSON is that you can do the following:</p>
<pre><code>  var u = tweet.user;                                // nested, simple objects
  write(u.message);                                  // simply "a string of text"
  if(u.geo_enabled) {                                // a boolean true
    var d = u.statuses_count * u.favourites_count;   // numbers..
  }
</code></pre>
<p>Anything more complicated than that and you've lost 95% of the benefits of JSON.</p>
<p>Here's the code to do that <code>if(u.geo_enabled)</code> with JSON/RDF:</p>
<pre><code>  var tweet = rdf['http://example.org/tweet/12343'];
  var user = rdf[tweet['http://example.org/property/userid']];
  var geoenabled = user['http://example.org/property/geo_enabled'];
  if( Boolean(geoenabled.value) ) {
</code></pre>
<p>and JSON-LD:</p>
<pre><code>  var tweet, user;
  for(o in jsonld) {
    if(jsonld[o]["@"] == 'http://example.org/tweet/12343') tweet = jsonld[o];
  }
  for(o in jsonld) {
    if(jsonld[o]["@"] == tweet['twit:userid']) user = jsonld[o];
  }
  if(user['twit:geo_enabled']) {
</code></pre>
<p>Remember, those two examples are only for the simple if line from the benefits of JSON example, can you even imagine all four lines?</p>
<p>Clearly, RDF in JSON is of little to no use to anybody, you can see plainly yourself, 95% of the benefits are lost and it's just another RDF serialization that's pretty much unusable without tooling. The only benefit JSON serializations of RDF have is that you don't require an XML stack, which is quite a large benefit tbh.</p>
<h3>The problem with RDF</h3>
<p>I've been a little unfair there, you see the problem isn't with the serializations, we can't make a "better" serialization of RDF for general web developers, because the <i>real problem</i> is that the data's RDF, it's triples not simple objects, URIs rather than simple terms, RDF Nodes with a language or type, and not just simple values.</p>
<p>An array of RDF triples, or a structure of RDF, just simply isn't usable in most (all?) programming languages, by a typical web developer, without specialist tooling and libraries or APIs.</p>
<p>Am I saying RDF is bad? no of course not, it's awesomely brilliant in every way, it powers a paradigm shift and will have huge positive effects on the web and the human race. You know the score :)</p>
<p>What I am saying, is that we're not backwards compatible, we're not making our data open in formats which are usable by normal developers, developers who need and want the data, want the links, but not the semwebbery. Hell even most of us who are heavy sem web users only consider the ontology+reasoning side enabled by properties-with-uris some of the time, most of the time &lt;http://xmlns.com/foaf/0.1/name> might as well just be "name".</p>
<h3>Opening Linked Data</h3>
<p>So, here's what we need to do, we need to just accept that although we publish linked data as RDF, we also need to publish the data as simple objects so the world can use the data.</p>
<p>Given the linked data:</p>
<pre><code>  @prefix : &lt;http://webr3.org/nathan#> .
  @prefix foaf: &lt;http://xmlns.com/foaf/0.1/> .
  @prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#> .

  :me a foaf:Person;
    foaf:age 29;
    foaf:holdsAccount [ foaf:accountName "webr3";
        foaf:homepage &lt;http://twitter.com/webr3>;
        rdfs:label "Nathan's twitter account"@en ];
    foaf:homepage &lt;http://webr3.org>;
    foaf:knows &lt;http://example.com/bob#me>;
    foaf:name "Nathan";
    foaf:nick "webr3", "nath" .

  &lt;http://example.com/bob#me> a foaf:Person;
    foaf:name "Bob" .</code></pre>
<p>We <strong>also</strong> need to publish it like this:</p>
<pre><code>{
  "http://webr3.org/nathan#me": {
    "a": "http://xmlns.com/foaf/0.1/Person",
    "age": 29,
    "holdsAccount": {
      "accountName": "webr3",
      "homepage": "http://twitter.com/webr3",
      "label": "Nathan's twitter account"
    },
    "homepage": "http://webr3.org",
    "knows": "http://example.com/bob#me",
    "name": "Nathan",
    "nick": [ "webr3", "nath" ]
  },
  "http://example.com/bob#me": {
    "a": "http://xmlns.com/foaf/0.1/Person",
    "name": "Bob"
  }
}</code></pre>
<p>and now one can do: <code>me.holdsAccount.label</code> and get back a string, in any language.</p>
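<p>As a rough illustration of how a publisher might produce that simple-object view, here's a minimal sketch. The helper names (<code>simplify</code>, <code>localName</code>), the triple-as-array layout and the <code>_:acct</code> blank node id are all assumptions made for the example, not part of any spec or library, and it deliberately drops languages and datatypes just as described.</p>

```javascript
// Hypothetical helper: collapse RDF-style triples into plain nested objects.
// Property URIs are shortened to their local name; rdf:type becomes "a".
const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';

function localName(uri) {
  if (uri === RDF_TYPE) return 'a';
  // Take whatever follows the last '#' or '/'.
  return uri.slice(Math.max(uri.lastIndexOf('#'), uri.lastIndexOf('/')) + 1);
}

function simplify(triples) {
  const nodes = {};
  // First pass: one object per subject, repeated properties become arrays.
  for (const [s, p, o] of triples) {
    const node = nodes[s] || (nodes[s] = {});
    const key = localName(p);
    if (!(key in node)) node[key] = o;
    else if (Array.isArray(node[key])) node[key].push(o);
    else node[key] = [node[key], o];
  }
  // Second pass: nest blank nodes ("_:" ids) inside the objects that point at them.
  const inline = (v) =>
    (typeof v === 'string' && v.startsWith('_:') && nodes[v]) ? nodes[v] : v;
  const out = {};
  for (const [id, node] of Object.entries(nodes)) {
    for (const key of Object.keys(node)) {
      node[key] = Array.isArray(node[key]) ? node[key].map(inline) : inline(node[key]);
    }
    if (!id.startsWith('_:')) out[id] = node;
  }
  return out;
}

const FOAF = 'http://xmlns.com/foaf/0.1/';
const me = 'http://webr3.org/nathan#me';
const data = simplify([
  [me, RDF_TYPE, FOAF + 'Person'],
  [me, FOAF + 'age', 29],
  [me, FOAF + 'nick', 'webr3'],
  [me, FOAF + 'nick', 'nath'],
  [me, FOAF + 'holdsAccount', '_:acct'],
  ['_:acct', FOAF + 'accountName', 'webr3'],
  ['_:acct', 'http://www.w3.org/2000/01/rdf-schema#label', "Nathan's twitter account"],
]);

console.log(data[me].holdsAccount.label); // Nathan's twitter account
```

<p>Two properties with the same local name would collide in this sketch; a real publisher would have to pick a rule for that.</p>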
<p><strong>What have we lost?</strong><br />
Well nothing in the grand scheme of things because we're still publishing the RDF in other formats via conneg, however in this specific serialization of the data: we've lost the properties, the .language and the .type, although basic types are still there, dates are detectable, numbers are supported, booleans are supported, everything else is just a PlainLiteral, a string.</p>
<p><strong>What's still there?</strong><br />
The data, <code>http</code> names for things, follow your nose, rdf types ++usable-accessible-data in a web standard format. It's 3.5 if not 4.5 star data!</p>
<p><strong>Other considerations</strong><br />
Probably be wise to allow direct access to a .json URI too, so people can simple-GET the data (in addition to exposing via conneg).</p>
<p>Should be an easy hit, any good reasons why not?</p>
<h3>The other RDF Serializations</h3>
<p>While we're here, there are a few other things that need tidied up, properly.</p>
<p><strong>Turtle</strong><br />
Let's make it <i>the standard</i> RDF serialization, with properly registered media types of application/turtle and text/turtle, fix any bugs in the spec (if any), and possibly allow an optional comma in those lists (1,2,3).</p>
<p>Let's just accept that RDFa is great, but sometimes you just want to embed a chunk of RDF in an HTML document, you know we all want to on occasion, people are shouting for it, it's easy to do, to deploy, doesn't break anything, and it's a really easy hit - so, in the Turtle standard spec pop a note that shows how to include it in an HTML document.</p>
<pre><code>&lt;script type="text/turtle">
  # turtle in here, as-is, no special encoding
&lt;/script></code></pre>
<p><strong>RDF/XML</strong><br />
Leave it as is, let it run its course naturally, and in many respects just forget it moving forwards, everybody supports it already, that won't change, but there's approximately zero need to keep pushing it.</p>
<p><strong>RDFa</strong><br />
As awesome as it is, just see it for what it is, RDF in attributes, it's like the missing markup features of HTML, for annotating documents and describing the things described in the documents with annotations. It's not a simple data or RDF format, and it's not <i>really</i> suited to just dropping chunks of machine readable RDF in to an HTML document, Turtle in HTML will do that far better.</p>
<p><strong>RDF/JSON and JSON-LD</strong><br />
Standardize a JSON serialization as a replacement / alternative to XML - but, admit and stipulate beforehand that it's not usable as-is, not really writable, and only arguably human readable. It simply needs to be a fast, unambiguous, optimized for the machine, RDF serialization - no bells and whistles for humans, no "12^^xsd:type" in a string - just something that you can JSON.parse and run circa 10-20 standardized lines of code over to get back an RDFGraph of RDFTriples.</p>
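<p>To give a feel for what those "10-20 lines" could look like, here's a sketch that assumes a Talis-style RDF/JSON layout (subject, then predicate, then an array of object descriptions with <code>type</code> and <code>value</code>). The function name <code>toTriples</code> and the shape of the returned triple objects are made up for illustration; nothing here is standardized.</p>

```javascript
// Sketch: turn a Talis-style RDF/JSON document into a flat list of triples.
// Assumed layout: { subject: { predicate: [ { type, value, lang?, datatype? } ] } }
function toTriples(jsonText) {
  const doc = JSON.parse(jsonText);
  const triples = [];
  for (const [subject, predicates] of Object.entries(doc)) {
    for (const [predicate, objects] of Object.entries(predicates)) {
      for (const o of objects) {
        triples.push({
          subject,
          predicate,
          object: {
            type: o.type,                 // "uri", "literal" or "bnode"
            value: o.value,
            lang: o.lang || null,         // language tag, if any
            datatype: o.datatype || null, // datatype URI, if any
          },
        });
      }
    }
  }
  return triples;
}

const text = JSON.stringify({
  'http://webr3.org/nathan#me': {
    'http://xmlns.com/foaf/0.1/name': [{ type: 'literal', value: 'Nathan' }],
    'http://xmlns.com/foaf/0.1/nick': [
      { type: 'literal', value: 'webr3' },
      { type: 'literal', value: 'nath' },
    ],
  },
});

const triples = toTriples(text);
console.log(triples.length); // 3
```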
<p>Fin &#038; end.</p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/linked-data/opening-linked-data/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
		<item>
		<title>Design Issue Updates</title>
		<link>http://webr3.org/blog/linked-data/design-issue-updates/</link>
		<comments>http://webr3.org/blog/linked-data/design-issue-updates/#comments</comments>
		<pubDate>Fri, 18 Jun 2010 15:58:44 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[linked data]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social web]]></category>
		<category><![CDATA[Technology/Internet]]></category>
		<category><![CDATA[Tim Berners-Lee]]></category>
		<category><![CDATA[World Wide Web]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=369</guid>
		<description><![CDATA[Just a quick note to let you all know that some of the crucial design issues related to social web, cloud storage, linked data, read write web of data and related have been updated by Tim Berners-Lee.
The specific issues are:

Read-Write Linked Data
Socially Aware Cloud Storage
Levels of Abstraction: Net, Web, Graph

I'm yet to disseminate all that's [...]]]></description>
			<content:encoded><![CDATA[<p>Just a quick note to let you all know that some of the crucial design issues related to social web, cloud storage, linked data, read write web of data and related have been updated by Tim Berners-Lee.</p>
<p>The specific issues are:</p>
<ul>
<li><a href="http://www.w3.org/DesignIssues/ReadWriteLinkedData.html">Read-Write Linked Data</a></li>
<li><a href="http://www.w3.org/DesignIssues/CloudStorage.html">Socially Aware Cloud Storage</a></li>
<li><a href="http://www.w3.org/DesignIssues/Abstractions.html">Levels of Abstraction: Net, Web, Graph</a></li>
</ul>
<p>I've yet to digest all that's changed, but they certainly are filled out and refined, remember folks the devil's in the details!</p>
<p>Quite sure that I'll follow up with a bunch of notes, as will a few others - but for now, there's the heads up that it's time to do a bit of reading.</p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/linked-data/design-issue-updates/feed/</wfw:commentRss>
		<slash:comments>21</slash:comments>
		</item>
		<item>
		<title>Something&#039;s missing in the Web of Linked Data</title>
		<link>http://webr3.org/blog/linked-data/somethings-missing-in-the-web-of-linked-data/</link>
		<comments>http://webr3.org/blog/linked-data/somethings-missing-in-the-web-of-linked-data/#comments</comments>
		<pubDate>Sat, 12 Jun 2010 15:34:59 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[linked data]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=360</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p><a href="http://webr3.org/blog/wp-content/uploads/2010/06/elephant.jpg"><img src="http://webr3.org/blog/wp-content/uploads/2010/06/elephant.jpg" alt="" title="elephant" width="738" height="2230" class="alignnone size-full wp-image-359" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/linked-data/somethings-missing-in-the-web-of-linked-data/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Maybe we don&#039;t need Named Graphs</title>
		<link>http://webr3.org/blog/semantic-web/maybe-we-dont-need-named-graphs/</link>
		<comments>http://webr3.org/blog/semantic-web/maybe-we-dont-need-named-graphs/#comments</comments>
		<pubDate>Sun, 16 May 2010 15:48:27 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[linked data]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[ACL]]></category>
		<category><![CDATA[ACL processor]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[Graph]]></category>
		<category><![CDATA[Graph theory]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[Online social networking]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[RDFLib]]></category>
		<category><![CDATA[Reference]]></category>
		<category><![CDATA[Resource]]></category>
		<category><![CDATA[Resource Description Framework]]></category>
		<category><![CDATA[Semantically-Interlinked Online Communities]]></category>
		<category><![CDATA[SPARQL]]></category>
		<category><![CDATA[Tim Berners-Lee]]></category>
		<category><![CDATA[Uniform Resource Identifier]]></category>
		<category><![CDATA[web server]]></category>
		<category><![CDATA[web server administrators]]></category>
		<category><![CDATA[web servers]]></category>
		<category><![CDATA[Web services]]></category>
		<category><![CDATA[Web standards]]></category>
		<category><![CDATA[World Wide Web]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=332</guid>
		<description><![CDATA[In this post I'll put forward an argument that perhaps the "web of linked data", and thus RDF(2)/OWL(2), doesn't need any concept of Named Graphs.
This is quite a dry subject, and I could be wrong (in fact in some ways I want to be proved wrong, this is how we learn), but do read on [...]]]></description>
			<content:encoded><![CDATA[<p>In this post I'll put forward an argument that perhaps the "web of linked data", and thus RDF(2)/OWL(2), doesn't need any concept of Named Graphs.</p>
<p>This is quite a dry subject, and I could be wrong (in fact in some ways I want to be proved wrong, this is how we learn), but do read on if you're interested.</p>
<h3>Example</h3>
<p>Over the past few months I've hit on a number of occasions where I was convinced I needed Named Graphs in order to address the task at hand.</p>
<p>A notable example is the scenario where using WebAccessControl and the ACL ontology, a system would have to figure out just who should be given access to a resource, and who should be denied.</p>
<p>In this example I'll cover the notion of ACL for "groups" in a linked data world.</p>
<p>The task at hand is to allow access if:<br />
<code>the graph serialized within the document obtained by dereferencing the URI of the group states that &lt;webid#me> is a member.</code></p>
<p>Otherwise written as:<br />
<code>if we dereference &lt;groups#admin> does the graph returned include the following { &lt;groups#admin> sioc:has_member &lt;webid#me> }</code></p>
<p>Or in SPARQL:</p>
<pre>PREFIX sioc: &lt;http://rdfs.org/sioc/ns#>
ASK {
  GRAPH &lt;groups> {
    &lt;groups#admin> sioc:has_member &lt;webid#me>
  }
}</pre>
<p>In this example we *do not* want to dereference the user's webid to see if the graph returned specifies that { &lt;webid#me> sioc:member_of &lt;groups#admin> }, or indeed consider the open world possibility that another, as yet unknown, graph could assert that the user is a member of our admin group, as that would breach security.</p>
<h4>The ACL</h4>
<p>To proceed with the example, consider the following ACL:</p>
<pre>[] a acl:Authorization ;
	acl:accessTo &lt;https://example.org/sensitive> ;
 	acl:agentClass :mygroup ;
 	acl:mode acl:Read .

:mygroup owl:equivalentClass [
 	a owl:Restriction ;
 	owl:hasValue &lt;groups#admin> ;
 	owl:onProperty [ owl:inverseOf sioc:has_member ];
 	] .
</pre>
<h4>The Problem</h4>
<p>The problem posed by this ACL is that any of the following four sets of triples would let a reasoner infer that &lt;webid#me> qualifies as an instance of :mygroup (or a member of &lt;groups#admin> if you prefer).</p>
<ul>
<li>
<pre>&lt;webid#me> sioc:member_of &lt;groups#admin> .</pre>
</li>
<li>
<pre>&lt;webid#me> _:x &lt;groups#admin> .
_:x owl:inverseOf sioc:has_member .</pre>
</li>
<li>
<pre>&lt;groups#admin> sioc:has_member &lt;webid#me> .</pre>
</li>
<li>
<pre>&lt;groups#admin> _:y &lt;webid#me> .
_:y owl:inverseOf sioc:member_of .</pre>
</li>
</ul>
<p>In other words, the ACL does not specify a "Named Graph" to query, and at the moment, no way exists to specify with (OWL or RDF) which "Named Graph" to query / trust.</p>
<p>This, case in point, is one example where I saw the need for Named Graphs in RDF and OWL.</p>
<h4>Another way of looking at it</h4>
<p>You will have noticed the notion of "Named Graphs" creeping in above, seems like a logical thing to say, especially when you consider that to process this ACL and grant access you'd probably use SPARQL, and specify a Named Graph to query over. However, much of what follows arose because I'd decided not to use SPARQL, and rather to code an ACL processor in my preferred language.</p>
<p>If you consider the situation, the ACL processor which decides if access should be granted or not, must implicitly "trust" the document which contains the serialized ACL graph. That is to say, that it must by extension trust any resources pointed to by said ACL, and if it doesn't then the ACL isn't fit for the purpose.</p>
<p>It's also important to note that "trust" is context specific, in this case we trust the resources pointed to by the ACL for the purpose of WebAccessControl.</p>
<p>One could then pretty quickly conclude that in this scenario the ACL processor already knows how to process the ACL, it must only use resources it trusts, therefore it must only allow access if <code>the graph serialized within the document obtained by dereferencing the URI of the group states that &lt;webid#me> is a member.</code> </p>
<p>(because &lt;groups#admin> is specified in the ACL, and thus by extension, trusted)</p>
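<p>Here's roughly what that check could look like in an ACL processor. It's only a sketch: <code>fetchGraph</code> is a hypothetical stand-in for "dereference the document and parse it into triples" (backed by an in-memory map so it runs without a network), and the example.org / evil.example URIs are invented for the demonstration.</p>

```javascript
// Sketch of the group check. `fetchGraph` is a hypothetical stand-in for
// "dereference this document and parse it into triples"; here it reads from
// an in-memory map so the example runs without a network.
const SIOC_HAS_MEMBER = 'http://rdfs.org/sioc/ns#has_member';
const SIOC_MEMBER_OF = 'http://rdfs.org/sioc/ns#member_of';

const documents = {
  'https://example.org/groups': [
    ['https://example.org/groups#admin', SIOC_HAS_MEMBER, 'https://example.org/webid#me'],
  ],
  // An untrusted document making the inverse claim about itself.
  'https://evil.example/webid': [
    ['https://evil.example/webid#me', SIOC_MEMBER_OF, 'https://example.org/groups#admin'],
  ],
};

function fetchGraph(documentUri) {
  return documents[documentUri] || [];
}

// The document to dereference is the group URI minus its fragment.
function documentOf(uri) {
  return uri.split('#')[0];
}

// Only the graph obtained from the group's own document is consulted.
function isMember(groupUri, webId) {
  return fetchGraph(documentOf(groupUri)).some(
    ([s, p, o]) => s === groupUri && p === SIOC_HAS_MEMBER && o === webId
  );
}

console.log(isMember('https://example.org/groups#admin', 'https://example.org/webid#me'));  // true
console.log(isMember('https://example.org/groups#admin', 'https://evil.example/webid#me')); // false
```

<p>The second call comes back false because the self-asserted membership lives in a document the ACL never pointed at, so it is never read.</p>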
<h4>Named Graphs in SPARQL</h4>
<p>The aforementioned logic would also apply if I was using SPARQL to process the ACL, it would equate to the ACL processor asking:</p>
<pre>PREFIX sioc: &lt;http://rdfs.org/sioc/ns#>
ASK {
  GRAPH &lt;groups> {
    &lt;groups#admin> sioc:has_member &lt;webid#me>
  }
}</pre>
<p>But again this is very context specific to the example, let's consider for a moment that the URI for the group could have been a non-fragment URI, &lt;groups/admin> for example.</p>
<p>This leads us to an important problem, when we dereference &lt;groups/admin> it would have to 303 See Other through to a different URI, let's say &lt;data/groups/admin> - which would then mean that the Named Graph to be used was &lt;data/groups/admin> - this URI, you may note, we do not know when we are writing our ACL; so if we ASKed the above SPARQL, the results would always come back negative, since there is no GRAPH &lt;groups>.</p>
<p>The URI of the Named Graph issue is compounded by modern web servers and publishing practices, because &lt;data/groups/admin> could easily be content negotiated (or rewritten), thus giving various final URIs of &lt;data/groups/admin> or &lt;data/groups/admin.rdf> or &lt;data/groups/admin.ttl> or &lt;data/groups/admin.n3> and so forth. One could quite easily (and often does) end up with the same Graph repeated multiple times within a quad store, all under "different" "Named Graphs".</p>
<p>I'll expand on a possible way of addressing this problem further on.</p>
<h4>Directionality</h4>
<p>Previously I mentioned that the ACL processor didn't have a problem with the above ACL, because it by nature trusted all resources which were mentioned in the ACL graph. However, again this is very context specific.</p>
<p>Let's consider for a moment an inverted ACL, where we want to allow access if:<br />
<code>the graph serialized within the document obtained by dereferencing the URI of the user's <strong>webid</strong> states that &lt;webid#me> is a sioc:member_of &lt;groups#admin>.</code></p>
<p>We don't know the user's webid ahead of time when we write the ACL, so again we have no way of writing how to trust a resource - it is critical to note that even if RDF(2) did support the concept of Named Graphs, it still wouldn't address the situation because we wouldn't know the Named Graph ahead of time, in order to trust it!</p>
<p>If we now consider the following ACL:</p>
<pre>[] a acl:Authorization ;
	acl:accessTo &lt;https://example.org/sensitive> ;
 	acl:agentClass :mygroup ;
 	acl:mode acl:Read .

:mygroup owl:equivalentClass [
 	a owl:Restriction ;
 	owl:hasValue &lt;groups#admin> ;
 	owl:onProperty sioc:member_of;
 	] .
</pre>
<p>The outcome of our previous logic concludes that again we should be querying the "trusted" resource &lt;groups#admin>, which gives us another problem, that's not the resource we want to be asking in this scenario.</p>
<p>The only thing that remains, and I'll later argue the only thing that ever matters in a web of linked data, is direction.</p>
<p>If we analyse the first ACL closer, we can see that we ultimately used the direction inferred by the presence of owl:inverseOf to place &lt;groups#admin> in the subject position, rather than the value/object position it could have been in, indicated by the presence of owl:hasValue. (bear with me).</p>
<p>In this example, we can use the strong semantics of owl:hasValue (and lack of owl:inverseOf) to place &lt;groups#admin> in the value/object position, and thus our ACL processor can come to the outcome we want, which is to look for a triple with the meaning { &lt;webid#me> sioc:member_of &lt;groups#admin> }, and that means dereferencing the URI in the subject position, in other words asking the graph serialized in the document returned by GETting &lt;webid> if it contains such a triple.</p>
<p>I've applied some understanding to OWL that quite simply isn't there though, as I earlier stated both ACL examples could easily equate to looking for any one of those four sets of triples.</p>
<p>However, this is the point - machine understanding of data is in the domain of the machine, the application doing the processing. And "truth" or "trust" is entirely context specific.</p>
<p>I'm increasingly convinced that the combined context of the data in a graph and the context under which that graph is being queried, specifies or infers in which direction you want to be reading, and directionality can be determined with linked data by dereferencing whichever uri you place on the left / in the subject position.</p>
<p>I recently found that Tim Berners-Lee wrote about this in a blog post entitled <a href="http://dig.csail.mit.edu/breadcrumbs/node/72">Backward and Forward links in RDF just as important</a>:</p>
<blockquote><p>One meme of RDF ethos is that the direction one choses for a given property is arbitrary: it doesn't matter whether one defines "parent" or "child"; "employee" or "employer". This philosophy (from the Enquire design of 1980) is that one should not favor one way over another. One day, you may be interested in following the link one way, another day, or somene else, the other way.</p></blockquote>
<p>Key here is the sentence "One day, you may be interested in following the link one way, another day, or somene else, the other way.", and that is exactly what all these examples are doing, following a link one way, or the other way.</p>
<p>To conclude this part, in every scenario thus far where I've thought I needed Named Graphs, it turns out that I in fact needed directionality - and because I'm dealing with Linked Data, whatever I place in the subject position defines the URI which I need to dereference, and ultimately the Graph(s) which are considered when resolving the answer to the question being ASKed.</p>
<p>I'd thus suggest that "Named Graphs" do not exist in a web of data, they are needed in N3 and when using rules, because all data is often in a single file, however that is not the case for Linked Data, where we dereference.</p>
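<p>The directionality idea can be sketched in a few lines. This is the application-level reading described above, not something OWL itself mandates, and the flattened restriction shape (<code>property</code>, <code>inverse</code>, <code>value</code>), the <code>planLookup</code> name and the example.org URIs are all assumptions for the example.</p>

```javascript
// Sketch: decide which triple to look for, and which document to dereference,
// from a simplified (hypothetical) view of the owl:Restriction in the ACL.
const SIOC = 'http://rdfs.org/sioc/ns#';

function planLookup(restriction, webId) {
  // With owl:inverseOf the group ends up as the subject;
  // without it the WebID is the subject.
  const pattern = restriction.inverse
    ? { s: restriction.value, p: restriction.property, o: webId }
    : { s: webId, p: restriction.property, o: restriction.value };
  // Whatever sits in the subject position names the document to GET.
  return { pattern, dereference: pattern.s.split('#')[0] };
}

const group = 'https://example.org/groups#admin';
const webId = 'https://example.org/webid#me';

// First ACL: owl:onProperty [ owl:inverseOf sioc:has_member ]
const first = planLookup({ property: SIOC + 'has_member', inverse: true, value: group }, webId);
// Second ACL: owl:onProperty sioc:member_of
const second = planLookup({ property: SIOC + 'member_of', inverse: false, value: group }, webId);

console.log(first.dereference);  // https://example.org/groups
console.log(second.dereference); // https://example.org/webid
```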
<h3>Back to SPARQL and Named Graphs</h3>
<p>Previously I mentioned the complications with the way we currently use named graphs in SPARQL and in our quad stores, where the URI we end up using could be literally anything, and where we often get duplicate data under different graphs.</p>
<p>To address this, I'd suggest that what we should be storing as the graph ?g value is not some made-up "named graph" but rather: <code>the dereferenced URI which we initially requested</code>.</p>
<ul>
<li>in the case of &lt;group#admins> this would be &lt;group></li>
<li>in the case of &lt;group/admins> this would be &lt;group/admins></li>
</ul>
<p>To clarify, *never* the URI that a GET request finally resolves to, and *always* the initial dereferenced URI we requested.</p>
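<p>As a minimal sketch of that rule (Python, standard library only; the function name <code>graph_name</code> and the example.org URIs are purely illustrative assumptions, not part of any quad store's API), the ?g value is simply the URI we asked for with the fragment removed, regardless of where any redirects lead:</p>

```python
from urllib.parse import urldefrag


def graph_name(requested_uri: str) -> str:
    """Return the value to store as ?g: the URI we initially dereferenced.

    The fragment is dropped because it is never sent to the server; any
    redirects followed afterwards are deliberately ignored.
    """
    return urldefrag(requested_uri).url


print(graph_name("http://example.org/group#admins"))  # http://example.org/group
print(graph_name("http://example.org/group/admins"))  # http://example.org/group/admins
```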
<p>The above would ensure that we never have duplicate data in our quad stores again, that SPARQL queries including a FROM clause are always dereferenced, and that publishers and web server administrators are free to relocate and restructure their data; ultimately it would make for a much nicer, healthier web of data.</p>
<p>Cool URIs don't change, and they wouldn't: just because the final document serializing a graph may move to a different URI doesn't mean the original URI has to change.</p>
<h3>Conclusion</h3>
<p>Apologies for the length of the post, but I figured everything needed to be covered, in context. Simply put, we need to focus less on Named Graphs (which IMHO aren't needed) and more on directionality. Every problem I've encountered thus far is covered by what Tim said years ago: "One day, you may be interested in following the link one way, another day, or somene else, the other way."</p>
<p>Comments?</p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/semantic-web/maybe-we-dont-need-named-graphs/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>linked data extractor prototype details</title>
		<link>http://webr3.org/blog/experiments/linked-data-extractor-prototype-details/</link>
		<comments>http://webr3.org/blog/experiments/linked-data-extractor-prototype-details/#comments</comments>
		<pubDate>Tue, 13 Apr 2010 18:53:43 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[experiments]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[linked data]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[virtuoso]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[DBpedia]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[extractor]]></category>
		<category><![CDATA[Open access]]></category>
		<category><![CDATA[World Wide Web]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=308</guid>
		<description><![CDATA[I recently released a prototype linked data semantic extraction demo which combines OpenCalais, Zemanta and OpenLink Virtuoso to effectively categorize and work out what a given piece of text / document is about.
OpenCalais and Zemanta usage details and service comparison.
The demo leverages OpenCalais in order to pick up references to things, which are returned in [...]]]></description>
			<content:encoded><![CDATA[<p>I recently released a <a href="http://extractor.data.fm/?test">prototype linked data semantic extraction</a> demo which combines <a href="http://www.opencalais.com/">OpenCalais</a>, <a href="http://developer.zemanta.com/">Zemanta</a> and <a href="http://virtuoso.openlinksw.com/">OpenLink Virtuoso</a> to effectively categorize and work out what a given piece of text / document is about.</p>
<h3>OpenCalais and Zemanta usage details and service comparison.</h3>
<p>The demo leverages OpenCalais to pick up references to things, which are returned in most cases as string literals; OpenCalais can also be configured to return SocialTags, which give a broad-strokes idea of what the document is about, again as string literal "tags". With regard to the references (semantic metadata: Entities, Facts, Events etc.) which OpenCalais returns, whilst they are generally string literals, it also returns vital Type and Relevance information, so in the case of "London" it will also assert that London is a City. Even where it doesn't previously know what a thing is, it can work out that, say, "Frank Neverbeenheardofbefore" is a Person.</p>
<p>Zemanta is also leveraged; the primary difference between Zemanta and OpenCalais (and thus the need for both services) is that Zemanta focuses more on accurate tagging of text. Zemanta tags (again string literals) are meaningful, commonly known tags which are linked to existing Linked Data identifiers such as http://dbpedia.org/resource/London and to further information about the tag (or thing); in the case of the aforementioned London, it will often also provide links to the Wikipedia page for London, the official homepage of the city of London and a link showing the position of London on Google Maps.</p>
<p>I should point out that OpenCalais increasingly returns Linked Data too; for instance, in the case of London they have given it an HTTP URI which can be dereferenced to retrieve more information about London. At a very crude estimate I would suggest that (depending on the subject matter) OpenCalais returns Linked Data URIs for about 15% of all references it finds to well known "things".</p>
<p>Weighing up the two services, I couldn't say that one is better than the other; both have advantages and disadvantages, and the only way to get a decent overall picture is to use both. For the benefit of feedback to both of these great services, though, here is a general comparison:</p>
<p>note: none of these figures are from exact tests, they are from extensive developer usage of both services as I've used them both since they were made public.</p>
<p>Zemanta is generally 2x as fast for average texts (the size of this post for instance) and as much as 5x as fast for longer texts. Average for Zemanta being 0.7 to 2 seconds. Average for OpenCalais being 1.5 to 10 seconds. It may also be worth noting that the availability of Zemanta is somewhat higher than that of OpenCalais, perhaps 1 in 250 calls to OpenCalais will fail.</p>
<p>OpenCalais does a lot more heavy work than Zemanta though, and *really* semantically analyzes the text to figure out a wealth of information. In this respect the tables are completely turned: Zemanta consistently provides a few high quality known tags, whereas OpenCalais often provides at least 10x as much information about a given text, including relevance and type as mentioned before. OpenCalais also extracts Facts / Events, and further it can figure out that "Jim" is also "Jim Bob", and that Jim said X about Y on date D.</p>
<p>Generally you can trust the data from Zemanta 99%, as it deals with "known" things; however, because of this, in some cases very new topics (such as the iPad for the first few days after its announcement) remain unknown. Due to the nature of OpenCalais and its dealing with the unknown, you need to take more time to verify what it has returned; however, when OpenCalais assigns a Linked Data identifier to something or provides more information, you can 99.99% trust that it is entirely accurate.</p>
<p>It's worth noting that these services do different things, and both do them extremely well: Zemanta "tags" and OpenCalais "semantically extracts information". In some respects I was hesitant about comparing the two, as in the context of what I'm doing both are needed and both are equal; in other contexts, however, they do different jobs and people will need to select one over the other.</p>
<p>Out of all the competition though, I would highly recommend both Zemanta and OpenCalais over their respective competitors, and I do hope that neither of these great services ever decides to target the other's market (i.e. they complement each other well, and both do so well because they stick to what they are good at).</p>
<h3>extractor.data.fm details</h3>
<p>This demo deals primarily with figuring out what a document is about; in that it aims to provide back a list of:</p>
<ul>
<li><strong>Categories</strong><br />A list of 1-5 dbpedia (and therefore wikipedia) categories which the provided document would be categorized under if it were a wikipedia article and had been categorized by a human who was knowledgeable in the subject domain(s) of the text.</li>
<li><strong>General Topics</strong><br />A short list of the general and broad-strokes Subjects covered by the document; these are distinct from the primary specific subjects covered and from the categories, and in many ways can be seen as the most common intersections between the primary specific subjects discussed.</li>
<li><strong>Primary Subjects</strong><br />These are the specific subjects covered in the document, not just the things mentioned, but the things actively discussed within the document, the primary subject matter as it were.</li>
<li><strong>Related or Mentioned Subjects</strong><br />Whilst I've termed them "related" as in dcterms:related, these are simply things which have been detected in the document or text and which are not primary subjects; in many ways "mentions" may be a more appropriate term.</li>
</ul>
<p>Out of the above list, the two services do the heavy lifting to give the demo its Primary Subjects and Related Subjects. In short, OpenCalais' SocialTags and Zemanta's Tags give us back our Primary Subjects, whilst OpenCalais, by way of the semantic extraction, provides us with the Related Subjects: namely all those extracted semantics which have the Type of a real thing (not an IndustryTerm or Event) and which are not already a Primary Subject. Additionally, those extracted semantics which are not tags but have a relevance higher than a certain score are boosted up to be Primary Subjects too.</p>
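<p>To make that split concrete, here is a rough Python sketch of the rule as described. It is not the demo's actual code: the <code>Entity</code> shape, the threshold value of 0.6 and the assumption that only "real thing" types can be boosted are all my own placeholders for illustration.</p>

```python
from dataclasses import dataclass

# Hypothetical values: the real relevance threshold is not stated in this
# post, and the excluded types are just the two examples mentioned above.
RELEVANCE_THRESHOLD = 0.6
NOT_REAL_THINGS = {"IndustryTerm", "Event"}


@dataclass(frozen=True)
class Entity:
    """One extracted semantic: a name, a type and a relevance score."""
    name: str
    type: str
    relevance: float


def split_subjects(tags, entities):
    """Return (primary, related) sets of subject names.

    tags:     names from OpenCalais SocialTags and Zemanta tags.
    entities: Entity items from the OpenCalais semantic extraction.
    """
    primary = set(tags)
    related = set()
    for entity in entities:
        # Skip types that are not real things, and anything already primary.
        if entity.type in NOT_REAL_THINGS or entity.name in primary:
            continue
        if entity.relevance > RELEVANCE_THRESHOLD:
            primary.add(entity.name)   # boosted up to a Primary Subject
        else:
            related.add(entity.name)   # merely mentioned
    return primary, related


primary, related = split_subjects(
    ["London"],
    [
        Entity("London", "City", 0.9),
        Entity("Frank Neverbeenheardofbefore", "Person", 0.8),
        Entity("Thames", "NaturalFeature", 0.2),
        Entity("web services", "IndustryTerm", 0.7),
    ],
)
print(sorted(primary))
print(sorted(related))
```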
<p>A primary and initial function of the demo is to associate the tags returned by both services together, and figure out when each is talking about the same thing. This is covered first by dealing with the linked data they return: where both services are talking about the same thing, you simply know this unambiguously due to the nature of HTTP URIs and them both being the "sameAs" each other. After this, two chunks of unhandled data remain: Zemanta tags which have not been determined to be the sameAs OpenCalais ones, and OpenCalais semantics for which we have a string literal name and a type.</p>
<p>In steps <a href="http://virtuoso.openlinksw.com/">OpenLink Virtuoso</a> 6.1 (<a href="http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/VOSIndex">open source edition</a>!), with most of dbpedia 3.4 loaded, to do the heavy lifting from here on. Virtuoso is a really powerful bit of kit and has replaced mysql/sql server/postgres, rdf store and web dav server in my typical server stack. The public lod and dbpedia endpoints really do no justice to just how powerful and fast Virtuoso is; queries which take a few seconds on the public endpoint return in hundredths of a second on my local (low spec) server, and the comparative performance to the aforementioned RDBMS solutions is not to be sniffed at.</p>
<p>To handle the typed string literals from OpenCalais, I built a custom dbpedia lookup service (using SPARQL over the aforementioned Virtuoso + dbpedia setup) which tries to unambiguously determine the identifier for a string literal, if it is known; the results are pretty good and I'd safely say that it gets it right in 98% of cases. This essentially turns the remaining unknown string literals into known Linked Data URIs, and as a side benefit gives the correct full Name for the thing identified, along with the correct casing and obviously much more linked data.</p>
<p>The demo now has a few remaining OpenCalais semantics which are still unknown, but for which we know the Type and have a name; and as URIs are given to things that can be Named, I simply mint my own URIs for these and specify the OpenCalais identifier as a "seeAlso" (to be future compatible with a time when they do associate their own hash URIs through to linked data).</p>
<p>At this point the demo has all of the Primary Subjects and Related Subjects determined and where possible linked through to additional LinkedData and human readable web documents about the subjects.</p>
<h4>Categorization</h4>
<p>This is where the script comes into its own and really leverages Virtuoso; up till this point it's all been about cleaning, validating, looking up, associating and suchlike.</p>
<p>Given that we now have linked data HTTP URIs for all the subjects we are dealing with, and in all Primary Subject cases we also have dbpedia.org URIs, the demo can start to use some of Virtuoso's more powerful features. First port of call is to get the Category intersection of all primary subjects (including the inferred categories!) via a slightly complex transitive SPARQL query over the dbpedia dataset. From here the demo calculates a set of primary categories which the text is related to, then it finds the general category intersection (again including inferred categories) between the primary categories and the primary subjects. With the results returned comes a wealth of numerical information, which the demo duly considers in order to infer which are the General Subjects and the Categories for the text.</p>
<p>At some point I'll cover this part of the script in more detail and give some Virtuoso-specific transitive SPARQL queries for you to use in your own such creations, but for now the above will have to do.</p>
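<p>In the meantime, the idea of a "category intersection including inferred categories" can be illustrated without SPARQL. The Python below is only a toy stand-in, not the Virtuoso query the demo runs: the subject and category names are made-up sample data, and the <code>BROADER</code> map plays the role of the links a transitive query would walk.</p>

```python
# Hypothetical toy data standing in for dbpedia:
# subject -> direct categories, category -> broader categories.
SUBJECT_CATEGORIES = {
    "London": {"Capitals_in_Europe"},
    "Paris": {"Capitals_in_Europe", "Cities_in_France"},
}
BROADER = {
    "Capitals_in_Europe": {"Capitals", "Cities_in_Europe"},
    "Cities_in_France": {"Cities_in_Europe"},
    "Cities_in_Europe": {"Cities"},
    "Capitals": {"Cities"},
}


def with_inferred(categories):
    """Return the given categories plus every category reachable via BROADER."""
    seen = set()
    stack = list(categories)
    while stack:
        category = stack.pop()
        if category not in seen:
            seen.add(category)
            stack.extend(BROADER.get(category, ()))
    return seen


def category_intersection(subjects):
    """Categories (direct or inferred) shared by every one of the subjects."""
    closures = [with_inferred(SUBJECT_CATEGORIES.get(s, ())) for s in subjects]
    return set.intersection(*closures) if closures else set()


print(sorted(category_intersection(["London", "Paris"])))
```

<p>The numerical weighting that turns these intersections into General Subjects and Categories is not shown here, as the details are not covered in this post.</p>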
<h3>Conclusion</h3>
<p>This extractor demo is something I've been working on and trying to achieve for about 5 years, and whilst it is still early days it's the first time the technologies have been available to both make it possible, and to utilize the results correctly to achieve what I'm aiming for overall.</p>
<p>The overall goal is to create a system which allows users to simply drop in content, and the system "files" it in the correct categories, lists it under the correct subjects and interlinks it with other resources via typed links such as "related resources" and looser resource lists of "also mentioned here". Further benefits of such a system are that you can accurately figure out what readers are interested in and promote new content to them, and you can give users the option of content streams where they can watch specific subjects or combinations of subjects to be notified of their "ideal" reading. On the flip side you can also identify users' and contributors' interests and expertise, and correlate these together (with geo-location) to suggest other users who they may wish to collaborate with, other organisations doing the same work in the same fields, and many such uses.</p>
<p>In reality I have much of this implemented in a site I've been working on for the last year, which is just being rolled out again, and the system works extremely well with huge benefits to all involved. The site, you see, deals with climate adaptation: it provides a service to the general adaptation community where they can share and find knowledge, and more importantly serves organisations working on critical issues by letting them see which people / organisations / projects are doing what and where, allowing them to co-ordinate efforts and, perhaps more importantly, not duplicate efforts and waste resources where it counts most. This has a positive impact on the world's poorest nations and on the suffering people whom these organisations are trying to work with and help.</p>
<p>Back to the demo, and with the context described: the extractor.data.fm demo is a quick UI around an API which is in many ways the backbone of the aforementioned system. The API is used in a semi-automated way, where the data it returns is verified in a UI by the content author / admins, who remove any ambiguous data and then hit save; from there everything is automated again and the system functions as above.</p>
<p>I'm unsure whether this kind of system will ever be able to be fully automated (or whether it's wise to allow this), as certain scenarios just can't be covered yet; a real life example of this is an initiative called "TEA". Ambiguity at this level, with entities which are unknown to systems or even to the web of data, will always be an issue at some point. As things progress it may be that they are only ambiguous once, on their first discovery, but that is still once; hence why this may always have to be a semi-automated process.</p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/experiments/linked-data-extractor-prototype-details/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>A lighter way to configure Apache for FOAF+SSL</title>
		<link>http://webr3.org/blog/optimization/a-lighter-way-to-configure-apache-for-foafssl/</link>
		<comments>http://webr3.org/blog/optimization/a-lighter-way-to-configure-apache-for-foafssl/#comments</comments>
		<pubDate>Fri, 02 Apr 2010 17:31:00 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[apache]]></category>
		<category><![CDATA[linked data]]></category>
		<category><![CDATA[optimization]]></category>
		<category><![CDATA[Apache Corporation]]></category>
		<category><![CDATA[Computer networking]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[Cryptographic protocols]]></category>
		<category><![CDATA[Electronic commerce]]></category>
		<category><![CDATA[FOAF]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[Internet protocols]]></category>
		<category><![CDATA[Secure communication]]></category>
		<category><![CDATA[SSL]]></category>
		<category><![CDATA[Technology/Internet]]></category>
		<category><![CDATA[Transport Layer Security]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=303</guid>
		<description><![CDATA[Just a snippet post to say that I've found a lighter (and imho preferable) way to configure Apache to accept client side SSL certificates (with regards to FOAF+SSL).
The Standard Way
This way essentially exports all SSL data and certs, client and server side, which (if you read the notes) has a performance penalty.

   SSLVerifyClient optional_no_ca
   [...]]]></description>
			<content:encoded><![CDATA[<p>Just a snippet post to say that I've found a lighter (and imho preferable) way to configure Apache to accept client side SSL certificates (with regards to FOAF+SSL).</p>
<p><strong>The Standard Way</strong><br />
This way essentially exports all SSL data and certs, client and server side, which (if you read the notes) has a performance penalty.<br />
<code><br />
   SSLVerifyClient optional_no_ca<br />
   SSLVerifyDepth 1<br />
   SSLOptions +StdEnvVars<br />
   SSLOptions +ExportCertData<br />
</code></p>
<p><strong>The Lighter Way</strong><br />
This way simply passes SSL_CLIENT_CERT into the REMOTE_USER environment variable and skips the rest, which you don't use (for FOAF+SSL).<br />
<code><br />
   SSLVerifyClient optional_no_ca<br />
   SSLVerifyDepth 1<br />
   SSLUserName SSL_CLIENT_CERT<br />
</code></p>
<p>Tested and works very nicely (again, imho).</p>
<p>note: Enabling SSLOptions +FakeBasicAuth will overwrite this with the Subject from the client side certificate.</p>
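<p>On the application side, picking the certificate back up might look something like the Python sketch below. It assumes a WSGI/CGI style <code>environ</code> mapping in which REMOTE_USER holds the PEM text set by the lighter config above; the function name <code>client_cert_der</code> is my own, and how the variable actually reaches your code depends on your deployment.</p>

```python
import ssl

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def client_cert_der(environ):
    """Return the client certificate as DER bytes, or None if none was sent.

    Assumes environ["REMOTE_USER"] carries the PEM text that
    SSLUserName SSL_CLIENT_CERT places there. A value that starts like a
    PEM certificate but lacks the END line raises ValueError.
    """
    pem = environ.get("REMOTE_USER", "").strip()
    if not pem.startswith(PEM_HEADER):
        return None
    return ssl.PEM_cert_to_DER_cert(pem)
```

<p>Pulling the WebID out of the resulting DER bytes (the URI in the certificate's subjectAltName) needs an X.509 parser and is not shown here.</p>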
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/optimization/a-lighter-way-to-configure-apache-for-foafssl/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Restarting Linked Data from scratch, part 2</title>
		<link>http://webr3.org/blog/linked-data/restarting-linked-data-from-scratch-part-2/</link>
		<comments>http://webr3.org/blog/linked-data/restarting-linked-data-from-scratch-part-2/#comments</comments>
		<pubDate>Fri, 12 Mar 2010 02:02:22 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[linked data]]></category>
		<category><![CDATA[atompub style protocol]]></category>
		<category><![CDATA[author]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[Content negotiation]]></category>
		<category><![CDATA[DTD]]></category>
		<category><![CDATA[FOAF]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[HTTP]]></category>
		<category><![CDATA[Los Angeles]]></category>
		<category><![CDATA[manager]]></category>
		<category><![CDATA[Markup languages]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[RDFa]]></category>
		<category><![CDATA[representation chain]]></category>
		<category><![CDATA[Representational State Transfer]]></category>
		<category><![CDATA[Resource]]></category>
		<category><![CDATA[Roy T. Fielding]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[Technology/Internet]]></category>
		<category><![CDATA[web browsers]]></category>
		<category><![CDATA[Web standards]]></category>
		<category><![CDATA[World Wide Web]]></category>
		<category><![CDATA[XML]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=278</guid>
		<description><![CDATA[This post is part of a series, following on from my earlier post Restarting Linked Data from scratch, part 1. In this post I'm going to take the first step by trying to approach publishing and exposing linked data RESTfully.
I'm assuming that if you are reading this, you know what linked data is, and REST [...]]]></description>
			<content:encoded><![CDATA[<p>This post is part of a series, following on from my earlier post <a href="http://webr3.org/blog/linked-data/restarting-linked-data-from-scratch-part-1/">Restarting Linked Data from scratch, part 1</a>. In this post I'm going to take the first step by trying to approach publishing and exposing linked data RESTfully.</p>
<p>I'm assuming that if you are reading this, you know what linked data is, and REST as per the dissertation of Roy T. Fielding. If not go do some reading :)</p>
<h3>Interface Constraints</h3>
<p>REST is defined by four interface constraints:</p>
<ol>
<li>identification of resources</li>
<li>manipulation of resources through representations</li>
<li>self-descriptive messages</li>
<li>hypermedia as the engine of application state.</li>
</ol>
<p>From here I'll look at each of these four constraints and build up the approach as I go.</p>
<h3>What a resource is</h3>
<p>Quoting extensively from <a href="http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_1">REST 5.2.1.1 Resources and Resource Identifiers</a>:</p>
<blockquote><p>The key abstraction of information in REST is a resource. Any information that can be named can be a resource: a document or image, a temporal service (e.g. "today's weather in Los Angeles"), a collection of other resources, a non-virtual object (e.g. a person), and so on. In other words, any concept that might be the target of an author's hypertext reference must fit within the definition of a resource...</p>
<p>A resource is a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time...</p>
<p>The values in the set may be resource representations and/or resource identifiers...</p>
<p>A resource can map to the empty set, which allows references to be made to a concept before any realization of that concept exists...</p>
<p>The only thing that is required to be static for a resource is the semantics of the mapping, since the semantics is what distinguishes one resource from another...
</p></blockquote>
<h3>What a representation is</h3>
<p>Again, quoting extensively from <a href="http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_1_2">REST 5.2.1.2 Representations</a>:</p>
<blockquote><p>
... A representation is a sequence of bytes, plus representation metadata to describe those bytes. Other commonly used but less precise names for a representation include: document, file, and HTTP message entity, instance, or variant...</p>
<p>If the value set of a resource at a given time consists of multiple representations, content negotiation may be used to select the best representation for inclusion in a given message...</p>
<p>The data format of a representation is known as a media type
</p></blockquote>
<h2>Identification of resources</h2>
<p>To do this properly I need to identify some resources, so for this I'm going to work with "Something" :)</p>
<ul>
<li><strong>"Something"</strong> - a resource, a non-virtual object</li>
</ul>
<p>At any point in time I have a description of Something which has multiple representations in different mediatypes, all semantically matching or equivalent:</p>
<ul>
<li><strong>"something.rdf"</strong> - representation of Something with mediatype RDF+XML</li>
<li><strong>"something.n3"</strong> - representation of Something with mediatype RDF+N3</li>
<li><strong>"something.en.html"</strong> - representation of Something, in English, with mediatype text/html</li>
<li><strong>"something.de.html"</strong> - representation of Something, in German, with mediatype text/html</li>
</ul>
<p>Each one of those representations is also a resource because they can be the target of a hyperlink. Of course by resource I mean a conceptual mapping to each of the things listed, and I haven't assigned URIs yet, but will.</p>
<p>To be able to make this set of representations manageable and to indicate they are in a set, I'm going to add in another resource which is a collection of resources, which can be considered a set of these equivalent representations of Something at a fixed point in time. For the purpose of this exercise, that point in time is today.</p>
<ul>
<li><strong>"Something-20100311"</strong> - a resource which is a collection of equivalent representations of Something on the 11th March 2010.</li>
</ul>
<p>Additionally, for the sake of argument, I'm going to say that a new set of representations (or version) is added every day - to handle this I then need one more resource, a collection of resources, where each resource in the collection is itself a collection of resources (<em>one of the aforementioned and including the example "Something-20100311"</em>). This will give me a conceptual mapping which covers time, and therefore everything I could need.</p>
<ul>
<li><strong>"Somethings"</strong> - a resource which is a collection of resources, see above for full description!</li>
</ul>
<p>Finally, I'm going to add in two shortcut resources which have no representation and are simply conceptual maps to the first and most current sets of representations.</p>
<ul>
<li><strong>"first"</strong> - a resource which always maps to the first collection of representations of Something.</li>
<li><strong>"latest"</strong> - a resource which maps to the most recent collection of representations of Something.</li>
</ul>
<h4>Giving the resources URIs</h4>
<p>Now to assign some URIs for this use case. There is no set structure and I'm not going to define one, because it is up to each server (or its manager) to control its own URI space, but for the sake of this exercise I'll define mine as follows:</p>
<p><code><br />
base: http://data.webr3.org<br />
  ...<br />
  /d/Something<br />
  /rg/Somethings<br />
  /rg/Somethings/first<br />
  /rg/Somethings/latest<br />
  /rg/Somethings/Something-20100311<br />
  /rg/Somethings/Something-20100311/something.rdf<br />
  /rg/Somethings/Something-20100311/something.n3<br />
  /rg/Somethings/Something-20100311/something.en.html<br />
  /rg/Somethings/Something-20100311/something.de.html<br />
  ...<br />
  /rg/Somethings/Something-20100305<br />
  /rg/Somethings/Something-20100305/something.rdf<br />
  /rg/Somethings/Something-20100305/something.n3<br />
  /rg/Somethings/Something-20100305/something.en.html<br />
  /rg/Somethings/Something-20100305/something.de.html<br />
  ...<br />
</code></p>
<p>From the above you can see that every possible representation has its own URI; in addition, every collection of equivalent representations has its own URI, as does the collection of all those collections, and so does "Something", our non-virtual object.</p>
<p>Also we've exposed multiple resources which could also be RESTful CRUD access points operating on an atompub style protocol. Small sentence, big potential, will cover approaches and protocols in later posts.</p>
<h2>The Key resource</h2>
<p>The most important thing, which I haven't yet covered, is that we've exposed a key resource, namely <code>/rg/Somethings</code>. This is a resource at the top of the representation chain which can be used to expose content negotiation, be it server or agent driven (or a mix of both), and regardless of the mappings and levels of collection further down the line this can always be a single point of entry to get representations.</p>
<p>I'll cover just how in a moment, but for now something important.</p>
<h3>Important</h3>
<p>I've had to give a fixed example just to make some progress, but we have to remember that every system has different needs: in some cases there may be only a single fixed representation for a resource, whilst in others each strand of representation (like something.de.html) may take its own versioning / temporal path. This could indicate that a structure such as the following may be in order:<br />
<code><br />
  ...<br />
  /d/Something<br />
  /rg/Somethings<br />
  /rg/Somethings/first<br />
  /rg/Somethings/latest<br />
  /rg/Somethings/Something-20100311<br />
  /rg/Somethings/Something-20100305<br />
  /rg/Somethings/Something-rdf<br />
  /rg/Somethings/Something-rdf/20100311.rdf<br />
  /rg/Somethings/Something-rdf/20100305.rdf<br />
  /rg/Somethings/Something-html-en<br />
  /rg/Somethings/Something-html-en/20100311.html<br />
  /rg/Somethings/Something-html-en/20100305.html<br />
  /rg/Somethings/Something-html-de<br />
  /rg/Somethings/Something-html-de/20100308.html<br />
  /rg/Somethings/Something-html-de/20100303.html<br />
</code></p>
<p>The above highlights that whilst we may have added more resources, the core resources are still the same. Remember that they are "conceptual maps", meaning that Something-20100311 may "map" to the version of en-html on the 11th and de-html on the 8th, because the de version was written first, then translated to English, and from there to rdf and so forth; they are all semantically equivalent, containing the same information even though they were created at different times.</p>
<p>The Conceptual Maps are as follows; from what I can tell this should always cover any use-case, no matter how complex.</p>
<p><code><br />
Thing 1-1 CollectionOfCollections<br />
CollectionOfCollections 1-* CollectionOfEquivalentRepresentations<br />
CollectionOfEquivalentRepresentations 1-* Representation<br />
</code></p>
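<p>For those who find code easier to read than cardinalities, the same three maps can be sketched as Python dataclasses. This is only one possible reading: the class and field names are my own, and I've assumed the versions are kept oldest first so that "first" and "latest" fall out naturally.</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Representation:
    uri: str
    media_type: str
    language: Optional[str] = None


@dataclass
class EquivalentRepresentations:
    """CollectionOfEquivalentRepresentations 1-* Representation."""
    uri: str
    representations: List[Representation] = field(default_factory=list)


@dataclass
class CollectionOfCollections:
    """CollectionOfCollections 1-* CollectionOfEquivalentRepresentations."""
    uri: str
    versions: List[EquivalentRepresentations] = field(default_factory=list)

    def first(self) -> Optional[EquivalentRepresentations]:
        # Assumes versions are stored oldest first.
        return self.versions[0] if self.versions else None

    def latest(self) -> Optional[EquivalentRepresentations]:
        return self.versions[-1] if self.versions else None


@dataclass
class Thing:
    """Thing 1-1 CollectionOfCollections."""
    uri: str
    collection: CollectionOfCollections


base = "http://data.webr3.org"
somethings = CollectionOfCollections(
    uri=base + "/rg/Somethings",
    versions=[
        EquivalentRepresentations(base + "/rg/Somethings/Something-20100305"),
        EquivalentRepresentations(
            base + "/rg/Somethings/Something-20100311",
            [
                Representation(
                    base + "/rg/Somethings/Something-20100311/something.rdf",
                    "application/rdf+xml",
                )
            ],
        ),
    ],
)
something = Thing(base + "/d/Something", somethings)
print(something.collection.latest().uri)
# http://data.webr3.org/rg/Somethings/Something-20100311
```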
<p>aside:<em>At times like this I wish I'd had a chance to study computer science so that I could express these things formally, so you'll have to make sense of it as best you can :( sorry.</em></p>
<h2>Exposing via Content Negotiation</h2>
<p>In my research so far, I've been able to figure out how to expose all of the aforementioned via HTTP, RESTfully, using content negotiation in a manner which seems to be transparent to existing web browsers but exposes all the information needed in a manner that is visible to machines, without using any additional extension headers. As follows:</p>
<p><strong>1</strong> The client does a normal GET request on our "Something", notice that no content negotiation is happening yet, we are simply asserting via a 303 "that the requested resource does not have a representation of its own that can be transferred by the server over HTTP."<br />
<code><br />
#Request<br />
GET /d/Something HTTP/1.1<br />
Host: data.webr3.org<br />
Accept: text/html;q=0.5, application/rdf+xml<br />
<br/><br />
#Response<br />
HTTP/1.1 303 See Other<br />
Location: http://data.webr3.org/rg/Somethings<br />
</code></p>
<p><strong>2</strong> The client does a GET on the URI we specified in the Location field, namely our key resource that can be used for content negotiation over all the representations.<br />
<code><br />
#Request<br />
GET /rg/Somethings HTTP/1.1<br />
Host: data.webr3.org<br />
Accept: text/html;q=0.5, application/rdf+xml<br />
<br/><br />
#Response<br />
HTTP/1.1 300 Multiple Choices<br />
Location: http://data.webr3.org/rg/Somethings/latest<br />
Content-Type: application/xhtml+xml<br />
Content-Length: 17400<br />
<br/><br />
&lt;?xml version="1.0" encoding="UTF-8"?><br />
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"<br />
    "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd"><br />
&lt;html xmlns="http://www.w3.org/1999/xhtml"<br />
    xmlns:foaf="http://xmlns.com/foaf/0.1/"<br />
    xmlns:dc="http://purl.org/dc/elements/1.1/"<br />
    version="XHTML+RDFa 1.0" xml:lang="en"><br />
...<br />
</code></p>
<p>Here's where it gets interesting and clients can take different routes; first the route of the typical user agent:</p>
<h3>User Agent Route</h3>
<p><code><br />
#Request<br />
GET /rg/Somethings/latest HTTP/1.1<br />
Host: data.webr3.org<br />
Accept: text/html;q=0.5, application/rdf+xml<br />
<br/><br />
#Response<br />
HTTP/1.1 307 Temporary Redirect<br />
Location: http://data.webr3.org/rg/Somethings/Something-20100311<br />
<br/><br />
#Request<br />
GET /rg/Somethings/Something-20100311 HTTP/1.1<br />
Host: data.webr3.org<br />
Accept: text/html;q=0.5, application/rdf+xml<br />
<br/><br />
#Response<br />
HTTP/1.1 200 OK<br />
Vary: Accept<br />
ETag: W/"xyzzy"<br />
Last-Modified: Thu, 11 Mar 2010 12:45:26 GMT<br />
Content-Type: application/xhtml+xml<br />
Content-Length: 17400<br />
Content-Language: en<br />
Content-Location: http://data.webr3.org/rg/Somethings/Something-20100311/something.en.html<br />
<br/><br />
&lt;!DOCTYPE html...<br />
</code></p>
<p>First, you can see that the user agent simply goes straight through to the most recent content, which is what they expect to see, with server-driven content negotiation along the way.</p>
<p>Further, we can see that full cache control is in there which as we know speeds up the net, and further still we have a rather nifty "weak" entity tag; this entity tag is shared by all representations which are semantically equal, and asserts they are equal via the entity tag. It's also worth noting that you could add this entity tag to your RDF graphs and further assert provenance which could come in very handy down the line for POST and PUT implementations.</p>
<p>To recap, common user agents just go straight through to the expected resource via server driven content negotiation and can take full advantage of cache / control data.</p>
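<p>To make the server-driven part a little more concrete, here's a minimal sketch of how a server might pick a representation from the <code>Accept</code> header used in the requests above. It is an assumption-laden simplification: it only matches exact media types (no wildcards), and the function names and the <code>something.rdf</code> URI are made up for the example.</p>

```python
from typing import Dict, List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs; q defaults to 1.0."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((media_type, q))
    return prefs


def choose(header: str, available: Dict[str, str]) -> Optional[str]:
    """Return the URI of the available representation with the highest q value."""
    best_uri, best_q = None, 0.0
    for media_type, q in parse_accept(header):
        # Exact media type matches only; wildcards are ignored in this sketch.
        if media_type in available and q > best_q:
            best_uri, best_q = available[media_type], q
    return best_uri


# Hypothetical set of representations for one version of "Something".
AVAILABLE = {
    "text/html": "http://data.webr3.org/rg/Somethings/Something-20100311/something.en.html",
    "application/rdf+xml": "http://data.webr3.org/rg/Somethings/Something-20100311/something.rdf",
}

chosen = choose("text/html;q=0.5, application/rdf+xml", AVAILABLE)
```

<p>With that header the RDF/XML representation wins, because it carries the default quality of 1.0 against 0.5 for HTML; a client that only sends <code>text/html</code> gets the HTML one.</p>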
<h3>The Machine Route</h3>
<p>Back at <strong>2</strong> the server returned a <code>300 Multiple Choices</code> as soon as <code>/rg/Somethings</code> was requested. Importantly, the entity returned was XHTML+RDFa (<em>although this could have been Atom or similar..</em>), which means we can give both a human- and machine-readable list of all our various representations; the "machine" can then select whichever one it finds most fitting. The choices could be expressed using any suitable ontology, and both <code>Alternates</code> and <code>Link</code> headers could be added if publishers wished.</p>
<p>I <em>think</em> that covers it all, if there are any errors or things I've missed please do let me know asap; but for now that'll do me - it's verbose, but I like verbose - prove it works then optimise it later :)</p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/linked-data/restarting-linked-data-from-scratch-part-2/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Restarting Linked Data from scratch, part 1</title>
		<link>http://webr3.org/blog/linked-data/restarting-linked-data-from-scratch-part-1/</link>
		<comments>http://webr3.org/blog/linked-data/restarting-linked-data-from-scratch-part-1/#comments</comments>
		<pubDate>Thu, 11 Mar 2010 07:41:51 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[linked data]]></category>
		<category><![CDATA[ajax]]></category>
		<category><![CDATA[average web page]]></category>
		<category><![CDATA[Cache]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[client / server]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[Control Data Systems Inc]]></category>
		<category><![CDATA[HTTP]]></category>
		<category><![CDATA[important web caching]]></category>
		<category><![CDATA[Metadata]]></category>
		<category><![CDATA[Query languages]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[real web]]></category>
		<category><![CDATA[Representational State Transfer]]></category>
		<category><![CDATA[Resource]]></category>
		<category><![CDATA[RESTful protocol]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[SPARQL]]></category>
		<category><![CDATA[SPARUL]]></category>
		<category><![CDATA[Technology/Internet]]></category>
		<category><![CDATA[traffic site]]></category>
		<category><![CDATA[web application super fast]]></category>
		<category><![CDATA[web applications using desktop clients]]></category>
		<category><![CDATA[Web browser]]></category>
		<category><![CDATA[web server]]></category>
		<category><![CDATA[web servers]]></category>
		<category><![CDATA[Web services]]></category>
		<category><![CDATA[World Wide Web]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=269</guid>
		<description><![CDATA[I'm going out on a limb and starting my whole journey through Linked Data and "Web 3.0" again - in order to address the challenges many in the community are facing, and which are "blocking" me. I'm going to take everything I've learned so far and go right back to grass roots with linked data.
I'm [...]]]></description>
			<content:encoded><![CDATA[<p>I'm going out on a limb and starting my whole journey through Linked Data and "Web 3.0" again - in order to address the challenges many in the community are facing, and which are "blocking" me. I'm going to take everything I've learned so far and go right back to grass roots with linked data.</p>
<p>I'm primarily documenting this journey for my own benefit, for reference and to unload it from my brain; but hopefully it'll be of use to the wider community and any feedback will be massively appreciated.</p>
<p>Here goes, I'll start by analysing the web thus far:</p>
<h2>The Web till now</h2>
<p>The power and the success of the web so far, <em>in my opinion</em>, has come from four main things:</p>
<ul>
<li>The URI</li>
<li>Hyperlink</li>
<li>The Resource</li>
<li>HTTP and its RESTful design.</li>
</ul>
<p>More importantly, it's the combination of the four working together that makes the web so great, because they let you cut out everything else and go straight to the resource you want. This is a point that we need to concentrate on for a minute.</p>
<h3>Going straight to the resource</h3>
<p>Back at the start of the web, this allowed people to (for the first time) jump from a resource in the bowels of one company's hierarchy straight to another resource in a different company's hierarchy - very much like reading a book, looking at the references and suddenly having the book or paper it references right there in your lap - amazing to say the least.</p>
<p>Skip forwards a few years and we have the search engines; suddenly, simply by typing a few keywords, you can jump right to a resource (page/image/...) anywhere on the web. Fast forward some more and we get to web 2.0, where again people are amazed every time direct access to a resource is exposed. Yup.. all your web 2.0 is just this simple principle..</p>
<p>An RSS feed, well, it lets you read a resource (a post) outside the context of the website and inside, let's say, Google Reader. You can rip a resource (video) out of YouTube and embed it anywhere you like. You can interact with a web application super fast thanks to targeting a resource directly with, say, Ajax and only updating that resource rather than updating the entire view (page). You can interact with web applications using desktop clients because they let you access one resource at a time; and so on and so forth. Virtually every improvement you see on the web comes down to that one thing: directly accessing a resource (<em>and creating resources at a more granular level</em>).</p>
<h3>How we made the web <em>faster</em></h3>
<p>Setting aside the rather obvious upgrades in technology over the years, there is one primary thing that speeds up the web and virtually everything computer oriented: the cache.</p>
<p>Before all the web 2.0 stuff, resource caching was at an all time high and was making the web faster for all of us; caching at the resource level is enabled by HTTP and its RESTful design. Control data allows us to limit how much information needs to be transferred over the web: request an image once and it gets transferred; request it a second time and, thanks to caching and HTTP, odds are very high that it won't get transferred again. When you consider that the average web page can easily have 30, 50, 300 images and static files embedded in it, this is a huge speed increase, and frankly one we could not live without.</p>
<p>Skipping forwards to web 2.0 and the present day again, we've gone wild with caching; anybody who's been involved with a high traffic site will tell you that the only way to do it is to cache everything you can, from data in memory through to code and op code caches. But this is only half the story.. a strange thing has happened..</p>
<h3>How we made the web <em>slower</em></h3>
<p>Simply, we forgot HTTP and a RESTful web somehow - that all important web caching whether it be at intermediate servers or in a web browser, it's forgotten.</p>
<p>To illustrate, if you view an image and then view it again, it'll be there instantly. Why? Because Last-Modified, ETag and other control data is sent by great web servers like Apache for static files; until you force a refresh or the file changes on the server, you'll simply get a 304 response telling whatever cache is down the line to use its own copy instead. Now, try jumping on to a web page, even this one, and you'll find the whole thing is reloaded, every time. I'd estimate that circa 80% of all pages you visit are fully reloaded every single time you see them, if not more.</p>
<p>Here's the reason - most pages are generated by scripts now, and something that goes unnoticed by most developers is that the web server (like Apache) hands over *full* control to the language runtimes, and in turn to the developer. In other words, unless developers are calculating, receiving and sending control data for each response, and validating every HTTP message coming into their scripts, most of the benefits of HTTP and RESTful design are completely lost; <em>especially caching</em>.</p>
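<p>For a feel of what "calculating, receiving and sending control data" means inside a script, here's a small sketch of a conditional GET. It's only an illustration under my own assumptions: the function names are invented, the entity tag is simply a hash of the body, and the 60 second <code>max-age</code> is an arbitrary value.</p>

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Derive a strong entity tag from the response body (assumed scheme)."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def respond(body: bytes, request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body); answer 304 when the client's copy is current."""
    etag = make_etag(body)
    # max-age=60 is an arbitrary value chosen for this sketch.
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    candidates = [t.strip() for t in request_headers.get("If-None-Match", "").split(",")]
    if etag in candidates or "*" in candidates:
        # The cache already holds this exact entity, so no body is sent.
        return 304, headers, b""
    return 200, headers, body


page = b"generated page"
first = respond(page, {})
second = respond(page, {"If-None-Match": first[1]["ETag"]})
third = respond(b"generated page, edited", {"If-None-Match": first[1]["ETag"]})
```

<p>The first request gets the full page, the second (sending back the entity tag it was given) gets a bodiless 304, and the third gets a fresh 200 because the content changed.</p>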
<p>Here's a fact, whether you agree with it or not, to me it is a fact: <em>the web has to be RESTful for it to work properly</em>, whether it's a web of documents, or a web of data, or both.</p>
<h2>Looking at the current state of Linked Data</h2>
<p>Linked Data is amazing because it takes the big four I mentioned earlier (URI, Hyperlink, Resource, HTTP) to a new level; we create resources at the most granular level possible, assign them URIs, link them together with <em>typed</em> hyperlinks then expose them via HTTP.</p>
<p>Notice I didn't mention REST in there? That's because I (and I'm not the only one) don't feel that Linked Data is currently RESTful. And as we can learn from web 2.0, unless this is addressed we'll face major problems down the line. In addition, because of this lack of RESTful-ness I feel like the data isn't linked; simply using URIs from different datasets on both sides of a triple does not link those datasets, well, not from a client perspective anyway.</p>
<p>To expand on and refine the issues:</p>
<h3>SPARQL Silos</h3>
<p>Issue one is that SPARQL and the servers with RDF stores which power it are positioned at the wrong side of the client / server relationship, imho. Because each major dataset effectively has its own server and access point (<em>SPARQL interface</em>), when you query it, it can only return the Linked Data which it stores. This leaves us three options at the minute:</p>
<ul>
<li>let that server pull in remote Linked Data and store it too (which makes the server fill up and slow down, and turns it in to a silo).</li>
<li>use one great big server that tries to store <em>all</em> the linked data (which feels like a silo all over again to me, not distributed at all).</li>
<li>Run our own server and only store limited data in it (limited.. and again a silo I guess)</li>
</ul>
<p>If we moved SPARQL to the client side however, then all it would need is a starting point from which it could traverse the web of data, only pulling in what it needed for a query. This may sound slow but if all data was exposed as resources like it should be, and with control data so it could be cached, this slow down would soon disappear; lesson from web 1 and 2!</p>
<p>With regard to caching, this could happen at traditional intermediate caches within the internet and at ISPs, locally in client side triple stores (like a browser's cache), or the existing big servers that attempt to store all the linked data could be repositioned as linked data caches.</p>
<p>For example a small RDF document could simply delegate seeAlso http://datacac.he/http://subject.uri and that linked data cache could return back all the information it knows about the subject by returning the RDF results of a SPARQL describe. This alone would be a HUGE speed up, prevent silos and create a real web of data.</p>
<p>In addition, this would keep all linked data transferred through the web in RDF format, and thus machine readable and typed. At the minute we have lots and lots of SPARQL queries, which essentially are just untyped junk data that a machine couldn't possibly understand - SPARQL results remove all the goodness from RDF and give us something that is domain and developer specific, not re-usable. Think about that for a moment..</p>
<p>Clarification: I'm not saying SPARQL + RDF stores shouldn't be on the server side, they should as they are needed in most cases, I'm simply saying that the primary interface to linked data shouldn't be SPARQL over HTTP to a remote SPARQL endpoint. Rather we should be accessing RDF documents, or entities if you like via HTTP.</p>
<h3>RESTful RDF</h3>
<p>Issue two, the focus has been on getting data on the web, finding ways to link it, access it, store and query these vast datasets; and the work done thus far is amazing! But now that's handled it's time to go back to basics and find ways of both getting and publishing Linked Data RESTfully, at a granular per resource level.</p>
<p>This means handling RDF like Atom, and essentially making AtomPub all over again for RDF (as many are thinking and working on). I feel that regardless of what's implemented behind the interface, and whether triple stores and SPARUL are used, we still need to manage RDF / Linked Data in terms of documents and entities for it to be RESTful.</p>
<p>An additional issue raised by this is losing the notion of a quad (g s p o). Named Graphs are vastly important to linked data, but we need to get named graphs into triples and out of quads so we are always working with RDF through HTTP.</p>
<p>Also worth noting that temporal, provenance, multi-language, multiple representations etc. will all need to be handled too, without using any HTTP extensions; no point half-baking it or making the solution dependent on drafts - it needs to work with the Universal Interface!</p>
<h2>First Step</h2>
<p>The above means one of the first challenges and things I'll try to tackle, is to find a way to fit a RESTful RDF publishing and exposing protocol in to the shared http space on a web server; taking in to account things like content negotiation, multiple representations, versions / time varying representations, and backwards compatibility with the current web.</p>
<p>note: I'm not going to define the protocols, plenty of more intelligent people than me are working on these things, just leverage a space where a full RESTful protocol can work in unison with the way we currently do things so that it's transparent to browsers and visible to linked data clients. This should then allow a stable test environment to try out different ways of doing things and test that the current web doesn't break.</p>
<p>To be continued.. often and frequently. <em>I'm blocked on my current, v important project, and need to address these things</em>.</p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/linked-data/restarting-linked-data-from-scratch-part-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HTTP RFC paraphrased for the Web of Data</title>
		<link>http://webr3.org/blog/semantic-web/http-rfc-paraphrased-for-the-web-of-data/</link>
		<comments>http://webr3.org/blog/semantic-web/http-rfc-paraphrased-for-the-web-of-data/#comments</comments>
		<pubDate>Tue, 02 Mar 2010 05:16:44 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[linked data]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[application-level protocol]]></category>
		<category><![CDATA[Computer networking]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[Dereferenceable Uniform Resource Identifier]]></category>
		<category><![CDATA[diverse applications]]></category>
		<category><![CDATA[generic protocol]]></category>
		<category><![CDATA[HTTP]]></category>
		<category><![CDATA[HTTP Protocol]]></category>
		<category><![CDATA[hypermedia information systems]]></category>
		<category><![CDATA[Hypertext Transfer Protocol]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[Internet protocols]]></category>
		<category><![CDATA[Internet systems]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[Representational State Transfer]]></category>
		<category><![CDATA[Resource]]></category>
		<category><![CDATA[Technology/Internet]]></category>
		<category><![CDATA[Uniform Resource Identifier]]></category>
		<category><![CDATA[Uniform Resource Locator]]></category>
		<category><![CDATA[URI scheme]]></category>
		<category><![CDATA[web of data]]></category>
		<category><![CDATA[World Wide Web]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=213</guid>
		<description><![CDATA[This post is all about gleaning as much useful information as possible from the HTTP Protocol RFC 2616 in order to answer simple and complex Web of Data related questions.
I've chosen the rather old RFC 2616 (1999!) at this time rather than the upcoming HTTPbis because I feel it's important to know where you are [...]]]></description>
			<content:encoded><![CDATA[<p>This post is all about gleaning as much useful information as possible from the <a href="http://www.w3.org/Protocols/rfc2616/rfc2616.html">HTTP Protocol RFC 2616</a> in order to answer simple and complex Web of Data related questions.</p>
<p>I've chosen the rather old <a href="http://www.w3.org/Protocols/rfc2616/rfc2616.html">RFC 2616</a> (1999!) at this time rather than the upcoming <a href="http://tools.ietf.org/wg/httpbis/">HTTPbis</a> because I feel it's important to know where you are coming from, and whilst many things about the Web of Data feel new, they are really age-old principles and technologies which have never been used to their full potential. Further, you won't be able to appreciate the refinements in HTTPbis if you don't know what it's refining.</p>
<p>Virtually everything from here on is just a snippet/quote or paraphrase of the RFC. Let's start with a simple one:</p>
<p><strong>Why use HTTP?</strong></p>
<blockquote><p>HTTP is an application-level protocol for distributed, collaborative, hypermedia information systems. ... HTTP allows an open-ended set of methods and headers that indicate the purpose of a request. ... HTTP is also used as a generic protocol for communication between user agents and proxies/gateways to other Internet systems ... HTTP allows basic hypermedia access to resources available from diverse applications. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec1.html#sec1.1">source</a></p></blockquote>
<p>I do fully recommend reading the entire RFC and the new <a href="http://tools.ietf.org/wg/httpbis/">HTTPbis</a>, most questions can be answered by returning to these documents and reading what they say (it's all in the detail); here's some more info gleaned from the RFC:</p>
<p><strong>The difference between POST and PUT, URIs as Identifiers, and URIs to identify more than just documents.</strong></p>
<blockquote><p>The fundamental difference between the POST and PUT requests is reflected in the different meaning of the Request-URI. The URI in a POST request identifies the resource that will handle the enclosed entity. That resource might be a data-accepting process, a gateway to some other protocol, or a separate entity that accepts annotations. In contrast, the URI in a PUT request identifies the entity enclosed with the request -- the user agent knows what URI is intended .. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.6">source</a></p></blockquote>
<p><strong>Using POST RESTfully for more than just form data</strong>.</p>
<blockquote><p>The POST method is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line. POST is designed to allow a uniform method to cover the following functions: Annotation of existing resources; ... Extending a database through an append operation. The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI. The posted entity is subordinate to that URI in the same way that ... a record is subordinate to a database. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.5">source</a></p></blockquote>
<p><strong>What to do if something is created as a result of a POST request</strong>.</p>
<blockquote><p>If a resource has been created on the origin server, the response SHOULD be 201 (Created) and contain an entity which describes the status of the request and refers to the new resource, and a Location header. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.5">source</a></p></blockquote>
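<p>As a tiny sketch of that behaviour (not a real server; the in-memory dictionary, the function name and the numbering scheme for new URIs are all assumptions of mine), a POST handler that appends a subordinate resource and answers 201 with a Location header might look like this:</p>

```python
from typing import Dict, Tuple


def handle_post(collection: Dict[str, bytes], collection_uri: str,
                entity: bytes) -> Tuple[int, Dict[str, str]]:
    """Append the entity as a new subordinate of collection_uri.

    Returns the status code and response headers: 201 (Created) plus a
    Location header that refers to the newly created resource.
    """
    # Hypothetical naming scheme: number subordinates in creation order.
    new_uri = collection_uri + "/" + str(len(collection) + 1)
    collection[new_uri] = entity
    return 201, {"Location": new_uri}


annotations: Dict[str, bytes] = {}
status, headers = handle_post(
    annotations, "http://data.webr3.org/annotations", b"first annotation"
)
```

<p>The client never chooses the new URI here; the server does, and tells the client where the resource ended up via Location.</p>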
<p><strong>When to use a PUT request?</strong></p>
<blockquote><p>The PUT method requests that the enclosed entity be stored under the supplied Request-URI. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.6">source</a></p></blockquote>
<p><strong>How to handle a PUT</strong></p>
<blockquote><p>If the Request-URI refers to <strong>an already existing resource</strong>, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server.</p>
<p>If the Request-<strong>URI does not point to an existing resource</strong>, and that URI is capable of being defined as a new resource by the requesting user agent, the origin server can create the resource with that URI. If a new resource is created, the origin server MUST inform the user agent via the 201 (Created) response. If an existing resource is modified, either the 200 (OK) or 204 (No Content) response codes SHOULD be sent. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.6">source</a></p></blockquote>
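<p>The two cases quoted above boil down to very little code. Here's a minimal sketch, assuming an in-memory dictionary as the "origin server" and an invented <code>handle_put</code> function; it picks 204 for the modified case, although 200 with a body would be equally valid per the quote.</p>

```python
from typing import Dict


def handle_put(store: Dict[str, bytes], request_uri: str, entity: bytes) -> int:
    """Store the entity under the Request-URI and return the status code.

    201 (Created) when the URI did not exist before,
    204 (No Content) when an existing resource was replaced.
    """
    existed = request_uri in store
    store[request_uri] = entity
    return 204 if existed else 201


store: Dict[str, bytes] = {}
created = handle_put(store, "/d/Something", b"version one")
modified = handle_put(store, "/d/Something", b"version two")
```

<p>Unlike POST, the client names the URI itself, so repeating the same PUT leaves the store in the same state.</p>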
<p><strong>if extra headers were sent?</strong></p>
<blockquote><p>Unless otherwise specified for a particular entity-header, the entity-headers in the PUT request SHOULD be applied to the resource created or modified by the PUT. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.6">source</a></p></blockquote>
<p><strong>and what if I want to save it somewhere other than the URI specified by the client?</strong></p>
<blockquote><p> If the server desires that the request be applied to a different URI, it MUST send a 301 (Moved Permanently) response. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.6">source</a></p></blockquote>
<p><strong>and if the PUT can't be done..</strong></p>
<blockquote><p>If the resource could not be created or modified with the Request-URI, an appropriate error response SHOULD be given that reflects the nature of the problem. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.6">source</a></p></blockquote>
<p><strong>can I use PUT with server side versioning?</strong></p>
<blockquote><p>A single resource MAY be identified by many different URIs. For example, an article might have a URI for identifying "the current version" which is separate from the URI identifying each particular version ... a PUT request on a general URI might result in several other URIs being defined by the origin server. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.6">source</a></p></blockquote>
<p><strong>how would I let a client know I implement server side versioning when they PUT?</strong></p>
<blockquote><p>If an existing resource is modified, either the 200 (OK) or 204 (No Content) response codes SHOULD be sent.. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.6">source</a></p></blockquote>
<p>200 indicates a message body in the response ;)</p>
<p><strong>and DELETE?</strong><br />
well it's short so you may as well read it all..</p>
<blockquote><p> The DELETE method requests that the origin server delete the resource identified by the Request-URI. This method MAY be overridden by human intervention (or other means) on the origin server. The client cannot be guaranteed that the operation has been carried out, even if the status code returned from the origin server indicates that the action has been completed successfully. However, the server SHOULD NOT indicate success unless, at the time the response is given, it intends to delete the resource or move it to an inaccessible location.</p>
<p>A successful response SHOULD be 200 (OK) if the response includes an entity describing the status, 202 (Accepted) if the action has not yet been enacted, or 204 (No Content) if the action has been enacted but the response does not include an entity. <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.7">source</a></p></blockquote>
<p>Note that a response can be 200, meaning you can return a response message (like "I have X other versions here [list], or delete them all by clicking here [form input which POSTs to a service]"), or an RDF response that can be interpreted by a client to do the aforementioned :)</p>
<p><strong>but can't I tunnel all actions through GET?</strong></p>
<blockquote><p>Safe Method .. GET .. SHOULD NOT have the significance of taking an action other than retrieval! <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.1.1">source</a></p></blockquote>
<p><em><strong>edit</strong>: removed small section about URI vs URL! Do however see the <a href="http://webr3.org/blog/semantic-web/http-rfc-paraphrased-for-the-web-of-data/comment-page-1/#comment-231">comment</a> from <a href="http://sw-app.org/about.html">Michael</a> which links to more information on the subject.</em></p>
<p><strong>Thanks for reading :)</strong><br />
There is much more information in the RFC, but those were some nicer points I found useful and relevant to current Web of Data topics &#038; discussions.</p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/semantic-web/http-rfc-paraphrased-for-the-web-of-data/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Virtuoso 6, SPARQL + GEO, Sample Queries</title>
		<link>http://webr3.org/blog/linked-data/virtuoso-6-sparqlgeo-and-linked-data/</link>
		<comments>http://webr3.org/blog/linked-data/virtuoso-6-sparqlgeo-and-linked-data/#comments</comments>
		<pubDate>Wed, 03 Feb 2010 23:24:40 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[linked data]]></category>
		<category><![CDATA[virtuoso]]></category>
		<category><![CDATA[Computing]]></category>
		<category><![CDATA[Edinburgh]]></category>
		<category><![CDATA[Filter]]></category>
		<category><![CDATA[FOAF]]></category>
		<category><![CDATA[Group action]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[London]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[New York City]]></category>
		<category><![CDATA[Oxford]]></category>
		<category><![CDATA[RDBMS]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[RDF Schema]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[SPARQL]]></category>
		<category><![CDATA[text search]]></category>
		<category><![CDATA[United Kingdom]]></category>
		<category><![CDATA[World Wide Web]]></category>
		<category><![CDATA[XML]]></category>
		<category><![CDATA[York]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=183</guid>
		<description><![CDATA[Alongside a whole host of improvements, the latest version of Virtuoso (Virtuoso 6) has added support for Geo data! One small sentence, one huge leap for mankind; it's vastly important IMHO because it brings a new kind of link to Linked Data; a location based one.
Very brief intro: SPARQL is a fantastic query language [...]]]></description>
			<content:encoded><![CDATA[<p>Alongside a whole host of improvements, the latest version of Virtuoso (<a href="http://bit.ly/dgbAXS">Virtuoso 6</a>) has added support for Geo data! One small sentence, one huge leap for mankind; it's vastly important IMHO because it brings a new kind of link to Linked Data; a location based one.</p>
<p>Very brief intro: SPARQL is a fantastic query language which works over RDF and thus Linked Data. Virtuoso, amongst other things, has a powerful QuadStore which can be queried via SPARQL, and Virtuoso's implementation of SPARQL plus the extensive suite of extensions they have implemented makes it the most usable and powerful query language available (again, in my honest opinion). In short, this combination was enough to make me drop normal RDBMS systems and never look back.</p>
<p>Rather than rambling on about how fantastic it is though, here are some Virtuoso specific sample SPARQL (+GEO) queries, which should hopefully whet your appetite and give you some indication of what can be done.</p>
<h2>Basic Geo Lookups</h2>
<p><strong>Things within 20km of New York City : <a href="http://bit.ly/9IBiVW" target="_blank">RESULTS</a></strong><br />
<code>  SELECT DISTINCT ?resource ?label ?location<br />
  WHERE<br />
  {<br />
    &lt;http://dbpedia.org/resource/New_York_City> geo:geometry ?sourceGEO .<br />
    ?resource geo:geometry ?location ; rdfs:label ?label .<br />
    FILTER( bif:st_intersects( ?location, ?sourceGEO, 20 ) ) .<br />
    FILTER( lang(?label) = "en" )<br />
  }</code></p>
<p><strong>Distance between New York City and London, England : <a href="http://bit.ly/bYNfWO" target="_blank">RESULTS</a></strong><br />
<code>  SELECT (bif:st_distance(?nyl,?ll)) as ?distanceBetweenNewYorkCityAndLondon<br />
  WHERE<br />
  {<br />
    &lt;http://dbpedia.org/resource/New_York_City> geo:geometry ?nyl .<br />
    &lt;http://dbpedia.org/resource/London> geo:geometry ?ll .<br />
  }<br />
 </code></p>
<h2>Querying Time and Space</h2>
<p><strong>All Educational Institutions within 10km of Oxford, UK; ordered by date of establishment : <a href="http://bit.ly/biZEHA" target="_blank">RESULTS</a></strong><br />
<code>SELECT DISTINCT ?thing as ?uri ?thingLabel as ?name ?date as ?established ?matchGEO as ?location<br />
WHERE<br />
{<br />
&lt;http://dbpedia.org/resource/Oxford&gt; geo:geometry ?sourceGEO .<br />
?resource geo:geometry ?matchGEO .<br />
FILTER( bif:st_intersects( ?matchGEO, ?sourceGEO, 5 ) ) .<br />
?thing ?somelink ?resource ; &lt;http://dbpedia.org/ontology/established&gt; ?date ; rdfs:label ?thingLabel . FILTER( lang(?thingLabel) = "en" )<br />
} ORDER BY asc( ?date )<br />
</code><br />
<strong>Historical cross section of events related to Edinburgh and the surrounding area (within 30km) during the 19th century : <a href="http://bit.ly/dfZU43" target="_blank">RESULTS</a></strong><br />
<code>SELECT DISTINCT ?thing ?thingLabel ?dateMeaningLabel ?date ?matchGEO WHERE {<br />
{<br />
SELECT DISTINCT ?thing ?matchGEO<br />
WHERE<br />
{<br />
&lt;http://dbpedia.org/resource/Edinburgh&gt; geo:geometry ?sourceGEO .<br />
?resource geo:geometry ?matchGEO .<br />
FILTER( bif:st_intersects( ?matchGEO, ?sourceGEO, 30 ) ) .<br />
?thing ?somelink ?resource<br />
}<br />
}<br />
{?property rdf:type owl:DatatypeProperty ; rdfs:range xsd:date } .<br />
?thing ?dateMeaning ?date . FILTER( ?dateMeaning in( ?property ) ) . FILTER( ?date &gt;= xsd:gYear("1800") &amp;&amp; ?date &lt;= xsd:gYear("1900") )<br />
?dateMeaning rdfs:label ?dateMeaningLabel . FILTER( lang(?dateMeaningLabel) = "en" ) .<br />
?thing rdfs:label ?thingLabel . FILTER( lang(?thingLabel) = "en" )<br />
} ORDER BY asc( ?date )</code></p>
<h2>Transitivity and Inference (v5 compatible)</h2>
<p><strong>Finding the shortest route between two "things" (HTML and XML in the example) : <a href="http://bit.ly/cJjsBL" target="_blank">RESULTS</a></strong><br />
<code>SELECT ?route ?jump WHERE<br />
{<br />
 { SELECT ?x ?y WHERE { ?x foaf:page ?xpage ; ?predicate ?y . filter( isURI(?y) ) } }<br />
 OPTION ( TRANSITIVE, T_DISTINCT, T_SHORTEST_ONLY, t_in(?x), t_out(?y), t_max(10), t_step('path_id') as ?path, t_step(?x) as ?route, t_step('step_no') AS ?jump )<br />
 . FILTER ( ?y = &lt;http://dbpedia.org/resource/HTML&gt; &amp;&amp; ?x = &lt;http://dbpedia.org/resource/XML&gt; )<br />
}<br />
</code></p>
<p><strong>..and all routes between the two "things" : <a href="http://bit.ly/cQV4AW" target="_blank">RESULTS</a></strong><br />
<code>SELECT ?route ?path ?jump WHERE<br />
{<br />
 { SELECT ?x ?y WHERE { ?x foaf:page ?xpage ; ?predicate ?y . filter( isURI(?y) ) } }<br />
 OPTION ( TRANSITIVE, T_NO_CYCLES, t_in(?x), t_out(?y), t_max(5), t_step('path_id') as ?path, t_step(?x) as ?route, t_step('step_no') AS ?jump )<br />
 . FILTER ( ?y = &lt;http://dbpedia.org/resource/HTML&gt; &amp;&amp; ?x = &lt;http://dbpedia.org/resource/XML&gt; )<br />
}</code></p>
<p><strong>Traversing Ontologies and (Sub)Classes; all subclasses of Person down the hierarchy  : <a href="http://bit.ly/aZ0oOM">RESULTS</a></strong><br />
<code>SELECT DISTINCT ?x WHERE<br />
{<br />
 { SELECT ?x ?y WHERE { ?x rdfs:subClassOf ?y } }<br />
 OPTION ( TRANSITIVE, T_DISTINCT, t_in(?x), t_out(?y), t_step('path_id') as ?path, t_step(?x) as ?route, t_step('step_no') AS ?jump, T_DIRECTION 2 )<br />
 FILTER ( ?y = &lt;http://dbpedia.org/ontology/Person> )<br />
}</code></p>
<h2>Free text search, scores and IRI Ranks (v5 compatible)</h2>
<p><strong>Searching over labels, with text match scores and additional ranks for each iri / resource  : <a href="http://bit.ly/bMNweO">RESULTS</a></strong><br />
<code>SELECT ?s ?page ?label ?textScore (&lt;LONG::IRI_RANK&gt;(?s)) as ?iriRank WHERE {<br />
  ?s foaf:page ?page ; rdfs:label ?label . FILTER( lang(?label) = "en" ) .<br />
  ?label bif:contains 'adobe and flash' option (score ?textScore ) .<br />
}</code></p>
<p><strong>Virtuoso 6.1 (Open Source Edition) released. For features &#038; bug fix details see: <a href="http://bit.ly/dgbAXS">link</a></strong></p>
<p><img src="http://webr3.org/blog/wp-content/uploads/2010/02/spo.jpg" alt="spo" title="spo" width="600" height="250" class="alignnone size-full wp-image-226" /></p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/linked-data/virtuoso-6-sparqlgeo-and-linked-data/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Preparing Yourself for Web 3.0, LOD and 2010+</title>
		<link>http://webr3.org/blog/featured/preparing-yourself-for-web-3-0-lod-and-2010/</link>
		<comments>http://webr3.org/blog/featured/preparing-yourself-for-web-3-0-lod-and-2010/#comments</comments>
		<pubDate>Fri, 30 Oct 2009 00:08:51 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[RDFa]]></category>
		<category><![CDATA[featured]]></category>
		<category><![CDATA[general]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[linked data]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[author]]></category>
		<category><![CDATA[everyday Web Developer]]></category>
		<category><![CDATA[FOAF]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[London]]></category>
		<category><![CDATA[mentioned search engine traffic]]></category>
		<category><![CDATA[Open Data]]></category>
		<category><![CDATA[public facing web pages]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[same author]]></category>
		<category><![CDATA[search engine]]></category>
		<category><![CDATA[Technology/Internet]]></category>
		<category><![CDATA[United Kingdom]]></category>
		<category><![CDATA[Web 2.0]]></category>
		<category><![CDATA[Web Designer and SEO Specialist]]></category>
		<category><![CDATA[Web Developers]]></category>
		<category><![CDATA[XHTML]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=172</guid>
		<description><![CDATA[If you work on the net then you'll have probably heard of the "semantic web", it's nice, you can ignore it and get along just fine though; however "Linked Open Data" (LOD) is now upon us and it's one of these things that can't be ignored, no matter which sector of the internet you work [...]]]></description>
			<content:encoded><![CDATA[<p>If you work on the net then you'll probably have heard of the "semantic web"; it's nice, but you can ignore it and get along just fine. However, "Linked Open Data" (LOD) is now upon us and it's one of those things that can't be ignored, no matter which sector of the internet you work in; if you do ignore it you'll probably become extinct (career-wise) pretty soon.</p>
<p>Sounds melodramatic, but the whole point of this text is to explain in real terms the effect it'll have on the everyday web worker: the web developer, web designer, SEO expert, internet marketer etc. I'm writing it so that you, my current or future friends and associates, still have a job in a couple of years; and I researched it so that I would still have a job in a few years (+ because I love this stuff!)</p>
<h2><strong>A bit about Linked Open Data (LOD).</strong></h2>
<p>LOD can easily seem a huge, scary and new thing, overwhelming in so many ways with all this talk of a cloud and billions of bits of information in some part of the web that is separate from "us". Take one look at the diagrams of the linked open data cloud and you'll see the academic acronyms of scientific organisations and future-thinking global entities publishing their specialised data; nothing about you and me with our little blogs, and moreover nothing about our clients' websites.</p>
<p>Sure it's about getting massive amounts of data on the web, linked and open for use, but it's different from what you'd expect :)</p>
<p>Linked Open Data is simply about making the info we already put on the net (like this post) machine readable as well as human readable.</p>
<p>It *IS NOT* about creating some system to dump everything from our database in some weird format for a machine to read somewhere.</p>
<p>It *IS* about wrapping the data on a normal page in a bit of markup so that a computer knows what it is.</p>
<p>If you're writing about London you simply add a tiny bit of markup that says 'about="http://en.wikipedia.org/wiki/London"' - honestly that's it in real terms. The user reading your page knows it's about London, England - and now a system like Google knows that it's definitely about London, England too. In most cases though it's simpler; it's a case of saying this article is titled "x" and made by person "y" - that alone makes a huge difference to the net.</p>
<h2><strong>How LOD will change the web.</strong></h2>
<p>More Links! We're currently in the age of search, if you want something you search for it, to get more info you search again, and again and so on.</p>
<p>Link trust was at an all time low a few years ago, sure you'd click a navigation link on a site but not a link in a document, because it was probably to a popup, an advert, something you didn't want. Not now though, mainly thanks to bloggers with their in text links to other pages, the world has grown to trust the link again.</p>
<p>Linked Open Data will spawn a massive increase in related data on page, related resources, articles, images, videos and more. And thus many, many more links.</p>
<p>This means that people will search less, and explore more; ever increasingly.</p>
<p>It's unavoidable, and even if a website isn't enhanced with all this extra linked data, odds are the user will have a browser extension or app running that will show all the related information anyway - these technologies are already here and used - adoption *will* grow, no way out of it, change happens.</p>
<h2><strong>Info for specific sectors</strong></h2>
<p>This isn't meant to be a full or all-encompassing introduction, in fact nowhere near it - if you want the ins and outs of LOD then look elsewhere. This info is for the everyday Web Developer, Web Designer and SEO Specialist.</p>
<p><strong>Web Designers (+ those who work with html)</strong></p>
<p>To be honest I think this change might hit you guys hardest; you see, XHTML+RDFa is already here and it'll be massive soon (and don't go thinking HTML5 will get you out of it, RDFa will be in there too). In short, XHTML+RDFa is XHTML as you know it, but with support for embedded RDF information; really it means a few new properties on elements that let you say what they are, in place, using FOAF, Dublin Core (DC) and the like. Any further description is outside the scope of this document ;)</p>
<p>What this means for you is that as well as having potentially a lot more to display on page (linked data) and lots of UI challenges, you also now have to cater for this RDFa data in your templates. It's not like other W3C stuff which you can ignore, and it's different to cross-browser compatibility: if you leave it out or skip the RDFa stuff then the site will potentially be outside of the LOD network, traffic will drop and ultimately the site may as well not be "in" the web (might be a few years before that though) - so in many ways it's the end of ignorance and excuses.</p>
<p>You can currently slap out some HTML4, change the doctype, stick on jQuery and make it "look" web 2.0 - and people will think it's web 2.0. With web 3.0 (the linked open data based net) you can't do that; it either is web 3.0 or it isn't. There isn't a "web 3.0" look, just web 3.0 source.</p>
<p>Drupal 7 has RDFa support out of the box; within the year I bet every CMS &amp; Blog will too; and if you make a new template with the RDFa cut out because you "don't know it", then I'm pretty sure it won't be long before your clients or employers cut you out; and we don't want that.</p>
<p>Further, if you don't - developers will be on your back big time &amp; changing your source; or worse the SEO guys will be ;)</p>
<p><strong>Web Developers</strong><br />
All of you need to know what triples are (subject-predicate-object), and URIs and CURIEs (not your normal URIs, URIs as Identifiers).</p>
<p>If you're going to be exposing data in your systems then you need to get used to mapping database properties through to RDF triples: that a user is a foaf:Person with a foaf:name; that tags are ctags and dc:subjects; and that articles have a dc:title (keeping it simple for this).</p>
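<p>As a rough sketch of that mapping (the example.org URIs and the values are made up for illustration, and SPARQL 1.1 Update is just one of several ways to write triples down), a user row and an article row might end up as:</p>
<p><code>PREFIX foaf: &lt;http://xmlns.com/foaf/0.1/&gt;<br />
PREFIX dc: &lt;http://purl.org/dc/elements/1.1/&gt;<br />
INSERT DATA<br />
{<br />
  &lt;http://example.org/users/42&gt; a foaf:Person ; foaf:name "Joe Blogs" .<br />
  &lt;http://example.org/articles/7&gt; dc:title "I'm scared of exposing my data" ; dc:creator &lt;http://example.org/users/42&gt; ; dc:subject "linked data" .<br />
}</code></p>
<p>Each database column becomes a predicate, each row gets a URI as its subject, and the cell value is the object.</p>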
<p>If you're going to be consuming LOD data then you need to learn a bit more: RDF, SPARQL, OWL, ontologies and so on.</p>
<p>And if you want to get "in to" LOD in a big way, then go do it.</p>
<p><strong>SEO Specialists</strong><br />
You need to know what the designers know, and you'll be changing from SEO specialists to data exploration optimizers or suchlike, focus will be on how you can make the data machine readable and get it linked in by the right services.. should be fun!</p>
<p>Further, you'll need to watch for how to get traffic to the sites; as mentioned, search engine traffic will drop slowly over the next few months and years, with more focus going on "links" from related pages. As for the diggs &amp; reddits, who knows how it'll affect traffic from them.</p>
<h2><strong>Summary</strong></h2>
<p>IMHO it's in all of our best interests to just get on with this. It will happen, and the sooner YOU do it and convince your employers you have to make this move the better; companies can easily lose clients too if the competition is offering "web 3" and they aren't.</p>
<p>At no point have I seen a tech hit the web which could literally leave people behind if they don't jump on board; it's happened in other industries and now ours (remember VHS?).</p>
<h2><strong>The two questions most people / companies / clients will immediately raise..</strong></h2>
<p><strong>1] We don't want to expose all our data for reasons X,Y&amp;Z!</strong></p>
<p>LOD isn't about exposing all your data on the internet; it's about exposing the data you've already put on the internet in a more granular fashion, and about making that data machine readable.</p>
<p>Presently you may have an article on a page with a title and author credit in HTML. In the future you would still have the same author and title; however, they would be wrapped in markup that allows a machine to understand that "Joe Blogs" is a person who is the author of the article, and that the article's title is "I'm scared of exposing my data".</p>
<p>If you consider your public-facing web pages, everything on those pages is already exposed; all we're doing here is describing what each bit of data is in a way we can all use.</p>
<p><strong>2] Trust &amp; Junk</strong></p>
<p>One common misconception is that you have no control over the source of the data you pull from the "cloud", and that it could essentially be junk. However this couldn't be further from the truth, what we do is to find a source of data we trust that has their data exposed in a machine readable format, then query it for the exact information we want, and finally include or display it in our own system.</p>
<p>To illustrate, consider you wanted to reference the countries of the world with population in your system. Currently you would have to build a database table, populate it with country name and country population, then write some code to display that data. In this scenario you'd probably get the population data from a credible source such as Wikipedia (copy and paste it in to your own database).</p>
<p>By using linked open data, you could treat the machine readable version of Wikipedia (dbpedia) as your database table, query it instead and again write some code to display the data on your own site.</p>
<p>You're displaying the same data, from the same trusted source; and you've selected which source you trust; it's not a case of just querying some cloud of data; it's a case of choosing which source(s) you want / trust and querying them.</p>
<p>As an additional bonus you don't need to worry about your information going out of date: as you're getting the data straight from source, the population of each country is updated on your site whenever it's updated on Wikipedia.</p>
<p>Further, you don't need to worry about maintaining that list of countries, as in a single query you can pull out a list of all countries with each one's population; as the world grows and changes, so does your data.</p>
<p>Further still! Once you've made the move to using some linked open data, all the data you could want is at your fingertips. Let's say a decision is made to include 30 different bits of information about each country in your system. Consider that task for a minute - full system change, finding, collating and entering all that data; let alone maintaining it! Well, I'm sure you can guess the next bit: using LOD we can simply expand our original query to include the other bits of information we want, then display it - job done.</p>
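<p>To give a feel for it, here's a minimal sketch of such a query against the dbpedia SPARQL endpoint. The class and property names (dbo:Country, dbo:populationTotal) are assumptions about how dbpedia models this, so check them against the live data before relying on them:</p>
<p><code>PREFIX dbo: &lt;http://dbpedia.org/ontology/&gt;<br />
PREFIX rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt;<br />
SELECT ?country ?name ?population<br />
WHERE<br />
{<br />
  ?country a dbo:Country ; rdfs:label ?name ; dbo:populationTotal ?population .<br />
  FILTER( lang(?name) = "en" )<br />
}<br />
ORDER BY DESC( ?population )</code></p>
<p>Pulling in more information about each country is then a matter of adding further triple patterns (and variables) to the same query.</p>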
<p><strong>That's it.</strong></p>
<p>Good Luck!</p>
<p>nathan</p>
<p><img class="alignnone size-full wp-image-173" title="future" src="http://webr3.org/blog/wp-content/uploads/2009/10/future.jpg" alt="future" width="600" height="250" /></p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/featured/preparing-yourself-for-web-3-0-lod-and-2010/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The end of Search? Linked Data, Semantic Web &amp; thoughts.</title>
		<link>http://webr3.org/blog/semantic-web/the-end-of-search-linked-data-semantic-web-and-my-vision/</link>
		<comments>http://webr3.org/blog/semantic-web/the-end-of-search-linked-data-semantic-web-and-my-vision/#comments</comments>
		<pubDate>Mon, 26 Oct 2009 23:03:09 +0000</pubDate>
		<dc:creator>nathan</dc:creator>
				<category><![CDATA[linked data]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[DBpedia]]></category>
		<category><![CDATA[FOAF]]></category>
		<category><![CDATA[Georgi Kobilarov]]></category>
		<category><![CDATA[Globally Unique Identifier]]></category>
		<category><![CDATA[HTML]]></category>
		<category><![CDATA[rdf]]></category>
		<category><![CDATA[RDFa]]></category>
		<category><![CDATA[Resource]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[software side]]></category>
		<category><![CDATA[Technology/Internet]]></category>
		<category><![CDATA[Uniform Resource Identifier]]></category>
		<category><![CDATA[Virtuoso Universal Server]]></category>

		<guid isPermaLink="false">http://webr3.org/blog/?p=164</guid>
		<description><![CDATA[Earlier today I was reading an interesting post by Georgi Kobilarov entitled "What’s wrong with the Linked Data world, part 1 - Keyword Search"; this particularly sparked my interest because in all honesty "search" had never come into my vision of the semantic web / linked data world.
To me, the draw of linked data [...]]]></description>
			<content:encoded><![CDATA[<p>Earlier today I was reading an interesting post by <a href="http://www.georgikobilarov.com/">Georgi Kobilarov</a> entitled "<a href="http://blog.georgikobilarov.com/2009/10/whats-wrong-with-the-linked-data-world-part-1-keyword-search/">What’s wrong with the Linked Data world, part 1 - Keyword Search</a>"; this particularly sparked my interest because in all honesty "search" had never come into my vision of the semantic web / linked data world.</p>
<p>To me, the draw of linked data and the semantic web has always been exploration; the notion that even the most unskilled of publishers should be able to enrich their content via semi-automated software to the standards of a near perfect wikipedia article has always been the driving force. Additionally, content classification, relation, linkage, data centralization and the like are all major benefits which will make a vast difference to the usability of the web.</p>
<p>Search will always be a major part of the internet, at the moment we use search to find content on a specific subject, then search again to find more, and search again to find related or expanded info, help, facts, answers, whatever; however, in the future I hope to see search move to a less prominent role, one where we use search to find the most suitable "entry point" in to the web of linked data - and from there every other piece of related / expanded information is either on page, or a click (link) away.</p>
<p>Some major hurdles need to be jumped before we can get to that stage though, both through lack of organization and lack of appropriate software. Personally I have a mental blueprint / overview of what's needed (imho), and some very specific ideas on the software side, with any luck I'll get a chance to contribute + build some of this, we'll see.</p>
<p>Some thoughts of what's needed from my little brain.</p>
<p><strong>Linked Data Ping</strong><br />
A central service API which is pinged by all software as it publishes information with machine accessible content. (Needed way before (x)HTML+RDFa takes off). Provides a stream of all recent pings to be consumed (xmpp pub-sub?).</p>
<p><strong>Clustered Servers holding a centralized data GUID lookup and proxy.</strong><br />
In essence all resources on the net should be a linked pair of GUID to endpoint: each endpoint should contain a reference to the GUID, and each GUID should be a URI which redirects to the endpoint, so that endpoints can change while the GUID/URI stays the same. In an ideal scenario, when somebody creates a link to X resource or Y document, the publishing/controlling software should replace the endpoint with the GUID instead. This would also enable multiple other services such as centralized pingback, references, statistics etc.</p>
<p><strong>Machine Readable Data Cache.</strong><br />
Together with the aforementioned services a high availability database of cached information should exist; in principle this would work by reading the stream of "Linked Data Pings", getting the GUID for the content and then retrieving all machine readable data and caching it. Much like the RDF data exposed through dbpedia, however for everything. Even if only a predefined subset of the common RDF vocabularies was stored and exposed it'd be enough to start; from there all other domain-specific ontologies could be retrieved by reading the endpoint itself.</p>
<p><strong>Semantic CMS</strong><br />
Ideally we need a new breed of CMS, one that not only has simple FOAF and Dublin Core (~Drupal 7), but also support for full content enrichment using the aforementioned machine services; and provides a simple UI for manually exposing entities, events, facts etc. (Think highlight name in text, mark as Person with Name, system finds guid and builds relevant RDFa and we have another triple of linked data.)</p>
<p>The possibilities from this point are endless; if you're reading this document after all this has been made, then you'd see a whole host of in text links through to more information on each keyphrase, person, entity etc; you'd be aided by auto injection of sources, related reading, comments, further documents discussing the content here, in short you'd be exploring the net one click at a time, linked data all the way; not searching.</p>
<p>In summary (and very much imho), Linked Data is not for searching, it's for linking data - search was invented to address the issue that everything isn't linked, when it is then the link takes precedence again.</p>
<p>My only worry in all of this is the idea that all RDF triples are fact, and true; already the major search engines are exposing RDFa data in summaries (5* ratings on products and suchlike), and the room for abuse will only grow.</p>
<p>Thanks Georgi for placing the spark that clarified my current thoughts.</p>
<p>Finally, this isn't a biased opinion in any way, or an endorsement, but to me OpenLink Virtuoso, dbpedia, Zemanta and Open Calais are leading the way and enabling all of this, together with the hard working folks contributing to the various linked W3C projects and specs. If only dbpedia/Zemanta/Calais would unify their URIs/GUIDs/endpoints we'd be a lot further along.. (well I would ;).</p>
<p>Regards!</p>
<p><img class="alignnone size-full wp-image-167" title="linkeddata" src="http://webr3.org/blog/wp-content/uploads/2009/10/linkeddata.jpg" alt="linkeddata" width="600" height="250" /></p>
]]></content:encoded>
			<wfw:commentRss>http://webr3.org/blog/semantic-web/the-end-of-search-linked-data-semantic-web-and-my-vision/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

