About a month ago, Version 2.0 of the Schema.org vocabulary hit the streets.

This update includes loads of tweaks, additions and fixes that can be found in the release information.  The automotive folks have got new vocabulary for describing Cars, including useful properties such as numberOfAirbags, fuelEfficiency, and knownVehicleDamages. The new property mainEntityOfPage (and its inverse, mainEntity) provides the ability to tell the search engine crawlers which thing a web page is really about.  With the new type ScreeningEvent to support movie/video screenings, a gtin12 property for Product, and more besides, there is much useful stuff in there.
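To make that concrete, here is a minimal sketch of the kind of mark-up involved.  The terms (Car, numberOfAirbags, knownVehicleDamages, mainEntityOfPage) are real Schema.org 2.0 terms; every value and URL is invented for illustration, and I am expressing it as JSON-LD built from a Python dict – one of the syntaxes the search engines accept for embedding Schema.org data.

import json

# All values and URLs below are hypothetical - the point is the shape.
car_markup = {
    "@context": "http://schema.org",
    "@type": "Car",
    "name": "Example Hatchback 1.6",
    "numberOfAirbags": "6",
    "knownVehicleDamages": "Light scratch to the rear bumper",
    # Tells crawlers that this page is primarily about this one Car
    "mainEntityOfPage": "http://example.com/cars/example-hatchback",
}

# On a real page this would sit inside
# <script type="application/ld+json"> ... </script>
print(json.dumps(car_markup, indent=2))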

But does this warrant the version number clicking over from 1.xx to 2.0?

These new types and properties are only the tip of the 2.0 iceberg.  There is a heck of a lot of other stuff going on in this release apart from these additions.  Some of it is in the vocabulary itself; some of it is in the documentation, supporting software, and organisational processes around it.

Sticking with the vocabulary for the moment, there has been a bit of cleanup around property names. As the vocabulary has grown organically since its release in 2011, inconsistencies and conflicts between different proposals have crept in.  So part of the 2.0 effort has included some rationalisation.  For instance, the Code type is being superseded by SoftwareSourceCode – the term code has many different meanings, many of which have nothing to do with software; surface has been superseded by artworkSurface, and area is being superseded by serviceArea, for similar reasons. Check out the release information for full details.  If you are using any of the superseded terms there is no need to panic, as the original terms are still valid, but with updated descriptions to indicate that they have been superseded.  However, you are encouraged to move towards the updated terminology as convenient.

The question of what is in which version brings me to an enhancement to the supporting documentation.  Starting with Version 2.0, a snapshot view of the full vocabulary will be published for each release – the first is at http://schema.org/version/2.0.  So if you want to refer to a term as it was defined in a particular version, you now can.

How often is Schema being used? – is a question often asked. A new feature has been introduced to give you some indication.  Check out the description of one of the newly introduced properties, mainEntityOfPage, and you will see the following: ‘Usage: Fewer than 10 domains’.  Unsurprisingly for a newly introduced property, there is virtually no usage of it yet.  If you look at the description for the type this term is used with, CreativeWork, you will see ‘Usage: Between 250,000 and 500,000 domains’.  Not a direct answer to the question, but a good and useful indication of the popularity of a particular term across the web.

Extensions
In the release information you will find the following cryptic reference: ‘Fix to #429: Implementation of new extension system.’

This refers to the introduction of the functionality, on the Schema.org site, to host extensions to the core vocabulary.  The motivation for this new approach to extending is explained thus:

Schema.org provides a core, basic vocabulary for describing the kind of entities the most common web applications need. There is often a need for more specialized and/or deeper vocabularies, that build upon the core. The extension mechanisms facilitate the creation of such additional vocabularies.
With most extensions, we expect that some small frequently used set of terms will be in core schema.org, with a long tail of more specialized terms in the extension.

As yet there are no extensions published.  However, there are some on the way.

As Chair of the Schema Bib Extend W3C Community Group I have been closely involved with a proposal by the group for an initial bibliographic extension (bib.schema.org) to Schema.org.  The proposal includes new types for Chapter, Collection, Agent, Atlas, Newspaper & Thesis, CreativeWork properties to describe the relationship between translations, plus types & properties to describe comics.  I am also following the proposal’s progress through the system – a bit of a learning exercise for everyone.  Hopefully I can share the news in the not too distant future that bib will be one of the first released extensions.

W3C Community Group for Schema.org
A subtle change in the way the vocabulary, its proposals, extensions and direction can be followed and contributed to has also taken place.  The creation of the Schema.org Community Group has now provided an open forum for this.

So is 2.0 a bit of a milestone?  Yes, taking all things together, I believe it is. I get the feeling that Schema.org is maturing into the kind of vocabulary, supported by a professional community, that will add confidence to those using it and to those recommending that others should.


I am pleased to share with you a small but significant step on the Linked Data journey for WorldCat and the exposure of data from OCLC.

Content-negotiation has been implemented for the publication of Linked Data for WorldCat resources.

For those immersed in the publication and consumption of Linked Data, there is little more to say.  However I suspect there are a significant number of folks reading this who are wondering what the heck I am going on about.  It is a little bit techie but I will try to keep it as simple as possible.

Back last year, a linked data representation of each (of the 290+ million) WorldCat resources was embedded in its web page on the WorldCat site.  For full details check out that announcement, but in summary:

  • All resource pages include Linked Data
  • Human visible under a Linked Data tab at the bottom of the page
  • Embedded as RDFa within the page html
  • Described using the Schema.org vocabulary
  • Released under an ODC-BY open data license

That is all still valid – so what’s new?

That same data is now available in several machine-readable RDF serialisations. RDF is RDF, but depending on your use it may be easier to consume as RDFa, XML, JSON, Turtle, or triples.
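To illustrate that point, here is a small Python sketch using the rdflib library (one choice among several): a single statement, hand-written in Turtle with the title borrowed from the WorldCat example used below, loaded into a graph and re-serialised in two other syntaxes.  Same graph, different surface forms.

from rdflib import Graph

# One statement in Turtle, borrowing the WorldCat example used below
turtle_doc = """
@prefix schema: <http://schema.org/> .
<http://www.worldcat.org/oclc/41266045>
    schema:name "Harry Potter and the prisoner of Azkaban" .
"""

g = Graph()
g.parse(data=turtle_doc, format="turtle")

# The same graph, re-serialised - RDF is RDF, whatever the syntax
print(g.serialize(format="nt"))   # N-Triples
print(g.serialize(format="xml"))  # RDF/XML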

In many Linked Data presentations, including some of mine, you will hear the line “As I click on the link in a web browser we see an html representation.  However, if I was a machine I would be getting XML or another format back.”  Content-negotiation is the mechanism in the http protocol that makes that happen.

Let me take you through some simple steps to make this visible for those that are interested.

Starting with a resource in WorldCat: http://www.worldcat.org/oclc/41266045. Clicking that link will take you to the page for Harry Potter and the prisoner of Azkaban.  As we did not indicate otherwise, the content-negotiation defaulted to returning the html web page.

To specify that we want RDF/XML we would specify http://www.worldcat.org/oclc/41266045.rdf  (depending on your browser this may not display anything, but it will allow you to download the result to view in your favourite editor).

For JSON-LD specify http://www.worldcat.org/oclc/41266045.jsonld
For Turtle specify http://www.worldcat.org/oclc/41266045.ttl
For N-Triples specify http://www.worldcat.org/oclc/41266045.nt

This allows you to manually specify the serialisation format you require.  You can also do it from within a program by specifying, in the http Accept header, the format that you would accept back from the URI.  This means that you do not have to write code to add the relevant suffix to each URI that you access.  You can replicate the effect by using curl, a command line http client tool:

curl -L -H "Accept: application/rdf+xml" http://www.worldcat.org/oclc/41266045
curl -L -H "Accept: application/ld+json" http://www.worldcat.org/oclc/41266045
curl -L -H "Accept: text/turtle" http://www.worldcat.org/oclc/41266045
curl -L -H "Accept: text/plain" http://www.worldcat.org/oclc/41266045
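For those who would rather do this from code than the command line, here is a minimal Python sketch of the same requests, using the requests library (an assumption of mine – any http client will do).  The Accept header plays the role of curl’s -H flag; following redirects mirrors -L.

import requests

OCLC_BASE = "http://www.worldcat.org/oclc/"

def fetch_worldcat(oclc_number, mime_type="text/turtle"):
    """Fetch a WorldCat resource in a chosen RDF serialisation,
    using http content-negotiation rather than a URI suffix."""
    response = requests.get(
        OCLC_BASE + str(oclc_number),
        headers={"Accept": mime_type},  # e.g. application/rdf+xml,
                                        # application/ld+json, text/turtle
        allow_redirects=True,           # the equivalent of curl -L
    )
    response.raise_for_status()
    return response.text

print(fetch_worldcat(41266045, "text/turtle"))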

So, how can I use it?  However you like.

If you embed links to WorldCat resources in your linked data, the standard tools used to navigate around your data should now be able to automatically follow those links into and around WorldCat data. If you have the URI for a WorldCat resource, which you can create by prefixing an oclc number with ‘http://www.worldcat.org/oclc/’, you can use it in a program, browser plug-in, or smartphone/Facebook app to pull data back, in a format that you prefer, to work with or display.
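A hedged sketch of that idea in Python, again using rdflib: build the URI from an OCLC number, pull back Turtle, and read values out of the resulting graph.  The schema.org property names are real; which properties a given record actually carries will vary, so treat this as illustrative.

from rdflib import Graph, Namespace, URIRef

SCHEMA = Namespace("http://schema.org/")

oclc_number = "41266045"
uri = URIRef("http://www.worldcat.org/oclc/" + oclc_number)

g = Graph()
# rdflib fetches the document itself; the .ttl suffix asks for Turtle
g.parse(str(uri) + ".ttl", format="turtle")

# Read values, and follow links to related entities
for name in g.objects(uri, SCHEMA.name):
    print("Name:", name)
for author in g.objects(uri, SCHEMA.author):
    print("Author:", author)  # often a URI you can dereference in turn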

Go have a play – I would love to hear how people use this.


Day three of the Semantic Tech & Business Conference in San Francisco brought us a panel to discuss Schema.org, populated by an impressive array of names and organisations:

Ivan Herman, World Wide Web Consortium
Alexander Shubin, Yandex
Dan Brickley, Schema.org at Google
Evan Sandhaus, New York Times Company
Jeffrey W. Preston, Disney Interactive Media Group
Peter Mika, Yahoo!
R.V. Guha, Google
Steve Macbeth, Microsoft

This well-attended panel started with a bit of a crisis – the stage in the room was not large enough to seat all of the participants, causing a quick call-out for bar seats and much microphone passing.  This was somewhat reflective of the crisis of concern about the announcement of Schema.org immediately prior to last year’s event, which precipitated the hurried arrangement of a birds-of-a-feather session to settle fears and disquiet in the semantic community.

When I asked a fellow audience member what he thought of this session, he replied that there wasn’t much new said.  In my opinion that is a symptom of good things happening around the initiative.  He was right in saying that there was nothing substantive said, but there were some interesting pieces that came out of what the participants had to say.  Guha indicated that Google were seeing that 7-10% of pages crawled already contained Schema.org mark-up – surprising growth in such a short time.  Steve Macbeth confirmed that Microsoft were also seeing around 7%.

Another unexpected but interesting insight from Microsoft was that they are looking to use Schema.org mark-up as a way to pass data between applications in Windows 8.  All the search engine folks played it close to their chests when asked what they were actually doing with the structured data they were capturing from Schema.org mark-up – lots of talk about projects around better search algorithms and indexing.  Guha indicated that the Schema.org data was not siloed inside Google.  As with any other data, it was used across the organisation, including within the Google Knowledge Graph functionality.

Jeffrey Preston responded to a question about the tangible benefits of applying Schema.org mark-up by describing how kids searching for games on the Disney site were being directed more accurately to the game itself, as against pages that referenced it.  Evan Sandhaus described how it enabled a far easier integration with a vendor, who could access their article data without having to work with a specific API.  Guha spoke about a Veterans job-search site created with the Department of Defense, which could constrain its search to sites whose Schema.org mark-up identified jobs as appropriate for Veterans.

In questions from the floor, the panel explained the best way of introducing schema extensions, using IPTC’s rNews as an example – get industry consensus to provide a well-formed proposal, and then be prepared to be flexible.   All done via the W3C-hosted Public Vocabs List.

All good progress in only a year!

Richard Wallis is Technology Evangelist at OCLC and Founder of Data Liberate


Today’s Wall Street Journal gives us an insight into the makeover underway in the Google search department.

Over the next few months, Google’s search engine will begin spitting out more than a list of blue Web links. It will also present more facts and direct answers to queries at the top of the search-results page.

They are going about this by developing the search engine [that] will better match search queries with a database containing hundreds of millions of “entities”—people, places and things—which the company has quietly amassed in the past two years.

The ‘amassing’ got a kick-start in 2010 with the Metaweb acquisition that brought Freebase and its 12 million entities into the Google fold.  This is now continuing with the harvesting of html-embedded, schema.org-encoded structured data that is starting to spread across the web.

The encouragement for webmasters and SEO folks to go to the trouble of inserting this information into their html is the prospect of a better result display for their page – Rich Snippets.  A nice trade-off from Google – you embed the information we want/need for better search, and we will give you a better results display.

The premise of what Google are up to is that it will deliver better search.  Yes, this should be true, however I would suggest that the major benefit to us mortal Googlers will be better results.  The search engine should appear to have greater intuition as to what we are looking for, but what we also should get is more information about the things that it finds for us.  This is the step-change.  We will be getting, in addition to web page links, information about things – the location, altitude, average temperature or salt content of a lake – whereas today you would only get links to the lake’s visitor centre or a Wikipedia page.

Another example quoted in the article:

…people who search for a particular novelist like Ernest Hemingway could, under the new system, find a list of the author’s books they could browse through and information pages about other related authors or books, according to people familiar with the company’s plans. Presumably Google could suggest books to buy, too.

Many in the library community may note this with scepticism, seeing it as a too-simplistic approach to something that they have been striving towards for many years with only limited success.  I would say that they should be helping the search engine supplier(s) do this right and be part of the process.  There is great danger that, for better or worse, whatever Google does will make the library search interface irrelevant.

As an advocate for linked data, it is great to see the benefits of defining entities and describing the relationships between them being taken seriously.   I’m not sure I buy into the term ‘Semantic Search’ as a name for what will result.  I tend more towards ‘Semantic Discovery’, which is more descriptive of where the semantics kick in – in the relationships between a searched-for thing and its attributes and other entities.  However, I’ve been around far too long to get hung up about labels.

Whilst we are on the topic of labels, I am in danger of stepping into the almost religious debate about the relative merits of microdata and RDFa as the encoding method for embedding schema.org mark-up.  Google recognises both, both are ugly for humans to hand-code, and webmasters should not have to care.  Once the CMS suppliers get up to speed in supplying the modules to automatically embed this stuff, as per this Drupal module, they won’t have to care.

I welcome this.  Yet it is only a symptom of something much bigger and game-changing, as I postulated last month in A Data 7th Wave is Approaching.


The Web has been around for getting on for a couple of decades now, and massive industries have grown up around the magic of making it work for you and your organisation.  Some of it, it has to be said, can be considered snake-oil.  Much of it is the output of some of the best brains on the planet.  Where, on the hit parade of technological revolutions to influence mankind, the Web is placed is oft disputed, but it is definitely up there with fire, steam, electricity, computing, and of course the wheel.  Similar debates are raging, and will continue to rage, around the hit parade of web features that will in retrospect have been most influential – pick your favourites: http, XML, REST, Flash, RSS, SVG, the URL, the href, CSS, RDF – the list is a long one.

I have observed a pattern as each of the successful new enhancements to the web has been introduced, and then generally adopted.  Firstly there is a disconnect between the proponents of the new approach/technology/feature and the rest of us.  The former split their passions between focusing on the detailed application, rules, and syntax of its use, and broadcasting its worth to the world, not quite understanding why the web masses do not ‘get it’ and adopt it immediately.  This phase is then followed by one of post-hype disillusionment for the creators, especially when others start suggesting simplifications to their baby.  Also at this time, back-room adoption starts to occur by those who find it interesting but are not evangelistic about it.  The real kick for the web comes from those back-room folks who just use this next thing to deliver stuff and solve problems in a better way.  It is the results of their work that the wider world starts to emulate, so that they can keep up with the pack and remain competitive.  Soon this new feature is adopted by the majority, because all the big boys are using it, and it becomes just part of the tool kit.

A great example of this was RSS.  Not a technological leap but a pragmatic mix of current techniques and technologies, mixed in with some lateral thinking and a group of people agreeing to do it in ‘this way’, then sharing it with the world.  As you will see from the Wikipedia page on RSS, the syntax wars raged in the early days – I remember them well: 0.9, 0.91, 1.0, 1.1, 2.0, 2.01, etc.  I also remember trying, not always with success, to convince people around me to use it, because it was so simple.  Looking back it is difficult to say exactly when it became mainstream, but this line from Wikipedia gives me a clue: “In December 2005, the Microsoft Internet Explorer team and Microsoft Outlook team announced on their blogs that they were adopting the feed icon first used in the Mozilla Firefox browser. In February 2006, Opera Software followed suit.”  From then on, the majority of consumers of RSS were not aware of what they were using, and it became just one of the web technologies you use to get stuff done.

I am now seeing the pattern starting to repeat itself, with structured and linked data.  Many, including me, have been evangelising the benefits of web-friendly, structured, linked data for some time now – preaching to a crowd that has been slow in growing, but growing it is.   Serious benefit is now being gained by organisations adopting these techniques and technologies, as our selection of case studies demonstrates.  They are getting on with it, often with our help, using it to deliver stuff.  We haven’t hit the mainstream yet.  For instance, the SEO folks still need to get their heads around the difference between content and data.

Something is stirring around the edge of the Semantic Web/Linked Data community that has the potential to give structured, web-enabled data the kick towards mainstream that RSS got when Microsoft adopted the RSS logo and all that came with it.   That something is schema.org, an initiative backed by the heavyweights of the search engine world: Google, Yahoo, and Bing.  For the SEO and web developer folks, schema.org offers a simple, attractive proposition – embed some structured data in your html and, via things like Google’s Rich Snippets, we will give you a value-added display in our search results.  Result: happy web developers, with their sites getting improved listing displays.  Result: lots of structured data starting to be published by people that you would have had an impossible task convincing that publishing structured data on the web would be a good idea.

I was at SemTech in San Francisco in June, just after schema.org was launched and caused a bit of a stir.  ‘They’ve oversimplified the standards that we have been working on for years, dumbing down RDF, diluting the capability, with too small a set of attributes’, etc., etc.  When you get under the skin of schema.org, you see that with support for RDFa 1.1 Lite, they are not that far from the RDF/Linked Data community.

Schema.org should be welcomed as an enabler for getting loads more structured and linked data on the web.  Is their approach now perfect? No.  Will it influence the development of Linked Data? Yes.  Will the introduction be messy? Yes.  Is it about more than just rich snippets?  Oh yes.  Do the webmasters care at the moment? No.

If you want a friendly insight into what schema.org is about, I suggest a listen to this month’s Semantic Link podcast, with their guest from Google/schema.org, Ramanathan V. Guha.

Now where have I seen that name before? – Oh yes, back on the Wikipedia RSS page: “The basic idea of restructuring information about websites goes back to as early as 1995, when Ramanathan V. Guha and others in Apple Computer’s Advanced Technology Group developed the Meta Content Framework.”  So it probably isn’t just me who is getting a feeling of déjà vu.

This post was also published on the Talis Consulting Blog

Like many of my posts, this one comes from the threads of several disparate conversations coming together in my mind, in an almost astrological conjunction.

One thread stems from my recent Should SEO Focus in on Linked Data? post, in which I was concluding that the group, loosely described as the SEO community, could usefully focus in on the benefits of Linked Data in their quest to improve the business of the sites and organisations they support. Following the post I received an email looking for clarification of something I said.

I am interested in understanding better the allusion you make in this paragraph:

One of the major benefits of using RDFa is that it can encode the links to other sources, which is the heart of Linked Data principles, and thus describe the relationships between things. It is early days with these technologies & initiatives. The search engine providers are still exploring the best way to exploit structured information embedded in and/or linked to from a page. The question is do you just take RDFa as a new way of embedding information into a page for the search engines to pick up, or do you delve further into the technology and see it as public visibility of an even more beneficial infrastructure for your data.

If the immediate use-case for RDFa (microdata, etc.) is search engine optimization, what is the “even more beneficial infrastructure”? If the holy grail is search engine visibility, rank, relevance and rich-results, what is the “even more”?

In reply I offered:

What I was trying to imply is that if you build your web presence on top of a Linked Data described dataset / way of thinking / platform, you get several potential benefits:

  • Follow-your-nose navigation
  • Flexible easier to maintain page structure
  • Value added data from external sources….
  • … therefore improved [user] value with less onerous cataloguing processes
  • Agile/flexible systems – easy to add/mix in new data
  • Lower cost of enhancement (eg. BBC added dinosaurs to the established Wildlife Finder with minimal effort)
  • In-built APIs [with very little extra effort] to allow others to access / build apps upon / use your data in innovative ways
  • As per the BBC, a certain level of default SEO goodness
  • Easy to map, and therefore link, your categorisations to ones the engines do/may use (eg. Google are using MusicBrainz to help folks navigate around – if, say, as the BBC do, you link your music categories to those of MusicBrainz, you can share in that effect.)

So what I am saying is that you can ‘just’ take RDFa as a dialect to send your stuff to Google (in which case microdata/microformats could be equally good), but then you will miss out on the potential benefits I describe.

From my point of view there are two holy grails (if that isn’t breaking the analogy 😉):

  1. Get visibility and hence folks to hit your online resources.
  2. Provide the best experience/usefulness/value to them when they do.

Linked Data techniques and technologies have great value for the data owners in the second of those, with the almost spin-off benefit of helping with the first.

The next thread was not a particular item but a general vibe, from several bits and pieces I read – that RDFa is confusing and difficult. This theme, I detect, was coming from those only looking at it from a ‘how do I encode my metadata for Google to grab it for its snippets’ point of view (and there is nothing wrong in that), or those trying to justify a ‘schema.org is the only show in town’ position. Coming at it from the first of those two points of view, I have some sympathy – those new to RDFa must feel like I do (with my basic understanding of html) when I peruse the contents of many a css file looking for clues as to the designer’s intention.

However, I would make two comments. Firstly, a site surfacing lots of data, and hence wanting to encode RDFa amongst the human-readable stuff, will almost certainly be using tools to format the data as it is extracted from an underlying data source – it is those tools that should be evolved to produce the RDFa as a by-product. Secondly, it is the wider benefits of Linked Data, which I’m trying to promote in my posts, that justify people investing time to focus on it. The fact that you may use RDFa to surface that data embedded in html, so that search engines can pick it up, is implementation detail – important detail, but missing the point if that is all you focus upon.

Thread number three is the overhype of the Semantic Web. Someone who I won’t name, but I’m sure won’t mind me quoting, suggested the following as the introduction to a bit of marketing: The Semantic Web is here and creating new opportunities to revamp and build your business.

The Semantic Web is not here yet, and won’t be for some while. However what is here, and is creating opportunities, is Linked Data and the pragmatic application of techniques, technologies and standards that are enabling the evolution towards an eventual Semantic Web.

This hyped approach is a consequence of the stance of some in the Semantic Web community who with fervour have been promoting its coming, in its AI entirety, for several years, and fail to understand why all of us [enthusiasts, researchers, governments, commerce and industry] are not implementing all of its facets now. If you have the inclination, you can see some of the arguments playing out now in this thread on a SemWeb email list where Juan Sequeda asks for support for his SXSW panel topic suggestion.

A simple request, and one that I support, but the thread it created shows that ‘eating the whole elephant’ of the Semantic Web will be too much to introduce it successfully to the broad Web, SEO, and SERP community; the ‘one mouthful at a time’ approach may have a better chance of success. Also, any talk of a ‘killer app’ is futile – we are talking about infrastructure here. What is the killer app feature of the Web? You could say linked, globally distributed, consistently accessed documents; an infrastructure that facilitated the development of several killer businesses and business models. We will see the same when we look back on a web enriched by linked, globally distributed, consistently accessed data.

So what is my astrological conjunction telling me? There is definitely fertile ground to be explored between the Semantic Web and the Web in the area of the pragmatic application of Linked Data techniques and technologies. People in both camps need to open their minds to the motivations and vision of the other. There is potential to be realised, but we are definitely not in silver bullet territory.

As I said in my previous post, I would love to explore this further with folks from the world of SEO & SERP. If you want to talk through what I have described, I encourage you to drop me an email or comment on this post.

This post was also published on the Talis Consulting Blog

It is well known that the business of SEO is all about influencing SERPs – or is it?  Let me open up those acronyms:

Those engaged in the business of Search Engine Optimisation (SEO) focus much of their efforts on influencing Search Engine Result Pages (SERP), or more specifically the relevance and representation of their targeted items upon those pages.  As many a guide to SEO will tell you, some of this is simple – understanding the basics of how search engines operate, or even just purchasing the right advertising links on the SERP.  Quite simple in objective, but in reality an art form that attracts high rewards for those that are successful at it.

So if you want to promote links on search engine pages to your products, why would you be interested in Linked Data?  Well, there are a couple of impacts that Linked Data, and RDF its data format, can have that are well worth looking into.

Delivering the Links – the BBC Wildlife Finder site is an excellent example of this delivering-the-links effect.

The BBC started with the data describing their video and audio clips, relating them to the animals they portray.  What was innovative in their approach was that they then linked to other information resources on the web, as against creating a catalogue of all that information in a database of their own.  This they encoded using Linked Data techniques, using RDF and a basic Wildlife Ontology that Talis consultants helped them develop and publish.   The stunningly visual website was then built on top of that RDF data, providing an intuitive navigational experience for users, delivering the follow-your-nose capability [that characterises Linked Data backed websites] to naturally move your focus between animals, species, habitats, behaviours, and the animals that relate to them.  Each of these pages has its own permanent web address (URI).  In a second innovative step they provided links to those external resources (eg. Wikipedia – via DBpedia, Animal Diversity Web, ARKive) on their pages to enable you to explore further.  In yet another innovation, they make that RDF data openly and easily available for each of the main pages.  (Check out the source of the page you get when you add .rdf to the end of the URL for an animal page – not pretty, but machines love it.)

So a stunning Linked Data backed site, with intuitive follow-your-nose internal navigation and links to external sites – but how is this good for SEO?  Because it behaves like a good website should.  The logical internal interlinks between pages, with a good URI structure not hidden in the depths of an obscure hierarchy, coupled with links out to relevant, well respected [in SEO terms] pages, is just what search engines look for.  The results are self-evident – search for Lions, Badgers, Mallard Duck and many other animals on your favourite search engine and you will find BBC Nature appearing high in the results set.

Featured Entries – Getting your entry on the first SERP a user sees is obviously the prime objective of SEO; however, making it stand out from the other entries on that page is an obvious secondary one.  The fact that ebay charges more for listing enhancements indicates there is value in listing promotion.

RDF, in the form of RDFa, and Linked Data become important in the field of Search Engine Results Promotion (another use of SERP) courtesy of something called Rich Snippets, supported by Google, Microsoft, and Yahoo.  From Google:

Google tries to present users with the most useful and informative search results. The more information a search result snippet can provide, the easier it is for users to decide whether that page is relevant to their search. With rich snippets, webmasters with sites containing structured content—such as review sites or business listings—can label their content to make it clear that each labeled piece of text represents a certain type of data: for example, a restaurant name, an address, or a rating.

Encoding structured information about your product, review or business in [the html-embeddable version of RDF] RDFa gives the search engine more information to display – information that it otherwise would not be able to reliably infer by analysing the text on the page.   Take a look at these results for an item of furniture – see how the result with the reviews, from sears.com, stands out:

[Screenshot: search results in which the sears.com entry, with review stars, stands out]

Elements such as pricing and availability are also presented if you encode them into your page.  I would be leading you astray if I gave you the impression that RDFa was the only way of encoding such information within your html.  Microformats, and microdata (now being boosted by the schema.org initiative), are other ways of encoding structured information on to your pages that the engines will recognise.
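To make the shape of that structured information concrete, here is a hedged sketch of the kind of product data involved – every value is invented, and I am writing it as a Python dict whose keys mirror the schema.org Product, Offer and AggregateRating terms.  On a real page the same values would be carried in RDFa, microformats or microdata within the html.

import json

# All values invented - the point is the structure the engines look for.
product = {
    "@context": "http://schema.org",
    "@type": "Product",
    "name": "Three-seater leather sofa",
    "offers": {                      # pricing and availability
        "@type": "Offer",
        "price": "499.99",
        "priceCurrency": "USD",
        "availability": "http://schema.org/InStock",
    },
    "aggregateRating": {             # the review stars in the snippet
        "@type": "AggregateRating",
        "ratingValue": "4.2",
        "reviewCount": "87",
    },
}
print(json.dumps(product, indent=2))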

One of the major benefits of using RDFa is that it can encode the links to other sources, which is the heart of Linked Data principles, and thus describe the relationships between things.  It is early days with these technologies & initiatives.  The search engine providers are still exploring the best way to exploit structured information embedded in and/or linked to from a page.   The question is do you just take RDFa as a new way of embedding information into a page for the search engines to pick up, or do you delve further into the technology and see it as public visibility of an even more beneficial infrastructure for your data.

At Talis we know the power of Linked Data and its ability both to liberate value from your data and to draw value in to it.  We have experience with it [in SEO terms] delivering the links, and an understanding of its potential for link featuring.

I would love to explore this further with folks from the world of SEO & SERP.  I also work alongside a team eager to investigate the possibilities with innovative organisations wanting to learn from the experience of the BBC, Best Buy, Sears and other first movers, and take things further.  If you fit either of those profiles, or just want to talk through what I have described, I encourage you to drop me an email or comment on this post.  There is much more to this than is currently being exploited and, to answer the question in the title of this post: yes, those interested in SEO should be focusing in on Linked Data.

This post was also published on the Talis Consulting Blog