Linked Data a Recipe for Food?

I’ve just watched a short interview with Dr Chris Brewster, of Aston Business School. Chris is a Semantic Web and Linked Data specialist, but he was attending an event organised by the New Optimists Forum to look at Food & Cities – possible futures for Birmingham in 2050.

What relevance does Linked Data have for a City’s food supply you may ask.  As Chris put it:

“We live in a world where the agri-food supply chain, from producer all the way through to final consumer, is extremely inefficient in the flow of knowledge. It is very good at delivering food to your table but we don’t know where it comes from, what its history is. That has great implications in various scenarios, for example when there are food emergencies, E. coli, things of that sort.

“My vision is with the application of Semantic Web and Linked Data technologies along the food supply chain, it will make it easier for all actors along there to know more about where their food comes from and where their food goes. This will also create opportunities for new business models where local food will be more easily integrated into the overall consumption patterns of local communities.”

This is a vision that can be applied to many of our traditional supply chains. Industries have become very efficient at producing, building, and delivering often very complex things to customers and consumers, but only the thing travels along the chain; it is not accompanied by information about the thing, other than what you may find on a label or delivery note. These supply chains are highly tuned processes that, the logisticians will tell you, have had almost every drop of efficiency squeezed out of them already. Information about all aspects of, and steps within, a chain could allow parts of the chain to react, apply some local agility and feedback, and uncover previously hidden efficiencies.

Another example of a traditional chain with what is, on the surface, a poor information supply chain is sustainable wood supply, as covered by the BBC You & Yours radio programme today (about 43 minutes in), coincidentally within minutes of my watching Dr Brewster.

The Forest Stewardship Council has had a problem where one of their producers had part of their license revoked but apparently still applied the FSC label to the wood they were shipping. Some of this wood travelled through the supply chain and was unwittingly displayed on UK retailers’ shelves as certified sustainable wood. Listening to the FSC representative, it was clear that if an integrated information supply network had been available, the chances of this happening would have been reduced, or at least the problem would have been identified sooner.

All very well, but why Linked Data?

One of the characteristics of supply chains is that they tend to deal with many differing organisations engaged in many differing processes – cultivation, packing, assembly, manufacture, shipping, distribution, retailing, etc. Traditionally the computerisation of information flow between differing organisations and their differing systems and procedures has been a difficult nut to crack. Getting all the players to agree and conform is an almost impossible task. One of the many benefits of Linked Data is the ability to extract data from disparate sources describing different things and aggregate them together. Yes, you need some loose coordination between the parties around identification of concepts etc., but you do not need to enforce a regimented vanilla system everywhere.
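To make the aggregation point a little more concrete, here is a minimal sketch in plain Python. It assumes each party publishes its data as (subject, predicate, object) triples; the URIs, property names, and the two sources are all invented for illustration, and a real system would more likely use an RDF library and agreed vocabularies.

```python
# Hypothetical triples from two organisations in a food supply chain.
# The only coordination assumed is that both use the same URI for the batch.
BATCH = "http://example.org/batch/42"

grower_data = {
    (BATCH, "http://example.org/vocab/grownAt", "http://example.org/farm/hill-top"),
    (BATCH, "http://example.org/vocab/harvested", "2012-03-01"),
}

shipper_data = {
    (BATCH, "http://example.org/vocab/shippedBy", "http://example.org/org/fast-freight"),
    (BATCH, "http://example.org/vocab/deliveredTo", "http://example.org/store/birmingham-1"),
}


def merge(*sources):
    """Aggregate triples from any number of sources; merging is a set union."""
    merged = set()
    for source in sources:
        merged |= source
    return merged


def describe(graph, subject):
    """Return what is known about one subject as a predicate -> object dict."""
    return {p: o for s, p, o in graph if s == subject}


graph = merge(grower_data, shipper_data)
history = describe(graph, BATCH)
for predicate in sorted(history):
    print(predicate, "->", history[predicate])
```

Neither organisation had to adopt the other's system here. Sharing an identifier for the batch was enough for the two descriptions to combine into one history.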

The automotive industry has already latched on to this to address the problem of disseminating the mass of information around models of cars and their options. There was a great panel on the 2nd day of the Semantic Tech and Business Conference in Berlin last month.

My takeaway from the panel: here is an industry that pragmatically picked up a technology that can not only solve its problems but also enable it to take innovative steps, not only for individual company competitive advantage but also to move the industry forward in its dealings with its supply/value chain and customers. However, they are also looking more broadly and openly, for instance to make data publicly available which will enhance the used car market.

So back to food. The local food part of Dr Brewster’s new business model vision stems from the fact that it should be easier for a local producer to broadcast availability of their produce to the world. Similarly, it should be easier for a retailer to tune in to that information in an agile way and not only become aware of the produce but also be linked to information about the supplier.

Food and Linked Data is also something the team at Kasabi have been focussing in on recently. Because of the Linked Data infrastructure underpinning the Kasabi Data Marketplace, they have been able to produce an aggregate Food dataset initially from BBC and Foodista.

As the dataset is updated, the Data Team will broaden the sources of food data, and increase the data quality for those passionate about food. They’ll be adding resources and improving the links between them to include things like: chefs, diets, seasonality information, and more.

Food aims to answer questions such as:

  • I fancy cooking something with “X”, but I don’t like “Y”; what shall I cook?
  • I am pregnant and vegan, what should I prepare for dinner?

Ambitiously, it could also provide data to be used to aid the invention of new recipes based on the co-occurrence of ingredients.
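As a rough illustration of what co-occurrence of ingredients could mean in practice, the sketch below counts how often each pair of ingredients appears together across a handful of recipes. The recipes are made up for the example and this is not how the Kasabi Food dataset is actually processed; it only shows the kind of counting involved.

```python
from collections import Counter
from itertools import combinations

# Hypothetical recipes; a real dataset would supply these.
recipes = {
    "tomato soup": {"tomato", "onion", "basil"},
    "bruschetta": {"tomato", "basil", "bread"},
    "onion tart": {"onion", "pastry", "thyme"},
}


def pair_counts(recipes):
    """Count how many recipes each unordered pair of ingredients shares."""
    counts = Counter()
    for ingredients in recipes.values():
        # Sorting gives each pair one canonical order, e.g. ("basil", "tomato").
        for pair in combinations(sorted(ingredients), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(recipes)
print(counts.most_common(1))  # [(('basil', 'tomato'), 2)]
```

Pairs that turn up together often hint at combinations that work, while pairs that never co-occur but share many partners might be candidates for something new.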

Answering questions like “how can I create something new from what I have?” is one of those difficult-to-measure yet nevertheless very apparent benefits of using Linked Data techniques and technologies.

It is very easy to imagine the [Linked Data] enhanced food supply chain of Chris’ vision integrated/aggregated with an evolved Kasabi Food dataset answering questions such as “what can I make for dinner which is wheat-free, contains ingredients grown locally that are available from the local major-chain-supermarket which has disabled parking bays?”.

A bit utopian I know, but what differs today from the similar visions that accompanied Tim Berners-Lee’s original Semantic Web descriptions is that folks like those in the automotive industry and at Kasabi are demonstrating bits of it already.

Bee on plate image from Kasabi.
Declaration – I am a shareholder of Kasabi parent company Talis.

Is Linked Data DIY a Good Idea?

Most Semantic Web and Linked Data enthusiasts will tell you that Linked Data is not rocket science, and it is not. They will tell you that RDF is one of the simplest data forms for describing things, and they are right. They will tell you that adopting Linked Data makes merging disparate datasets much easier to do, and it does. They will say that publishing persistent globally addressable URIs (identifiers) for your things and concepts will make it easier for others to reference and share them, and it will. They will tell you that it will enable you to add value to your data by linking to and drawing in data from the Linked Open Data Cloud, and they are right on that too. Linked Data technology, they will say, is easy to get hold of either by downloading open source or from the cloud; yup, just go ahead and use it. They will make you aware of an ever increasing number of tools to extract your current data and transform it into RDF; no problem there then.
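For anyone who has not seen how simple RDF can look, here is a minimal sketch that writes two statements about one thing in the N-Triples style, one statement per line. The example.org identifier is invented for illustration, and treating any value that starts with http as a URI is a simplifying assumption of this sketch, not part of RDF itself.

```python
# Each statement is a (subject, predicate, object) triple.
# The subject URI is hypothetical; the two predicates are the standard
# rdfs:label and owl:sameAs properties.
triples = [
    ("http://example.org/id/city/birmingham",
     "http://www.w3.org/2000/01/rdf-schema#label",
     "Birmingham"),
    ("http://example.org/id/city/birmingham",
     "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/Birmingham"),
]


def is_uri(value):
    """Simplifying assumption: values starting with http(s) are URIs."""
    return value.startswith(("http://", "https://"))


def to_ntriples(triples):
    """Render triples one per line: <subject> <predicate> object ."""
    lines = []
    for subject, predicate, obj in triples:
        if is_uri(obj):
            rendered = f"<{obj}>"
        else:
            # Plain literals are quoted, with backslashes and quotes escaped.
            escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


print(to_ntriples(triples))
```

The second statement is the linking part: it says that this identifier refers to the same thing as one published elsewhere, which is how data from the wider Linked Open Data Cloud can be drawn in.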

So would I recommend a self-taught do-it-yourself approach to adopting Linked Data?  For an enthusiastic individual, maybe.  For a company or organisation wanting to get to know and then identify the potential benefits, no I would not.  Does this mean I recommend outsourcing all things Linked Data to a third party – definitely not.

Let me explain this apparent contradiction. I believe that anyone who has, or could benefit from consuming, significant amounts of data can realise benefits by adopting Linked Data techniques and technologies. These benefits could be in the form of efficiencies, data enrichment, new insights, SEO benefits, or even business models. Gaining the full effects of these benefits will only come from not only adopting the technologies but also adopting the different way of thinking, often called open-world thinking, that comes from understanding the Linked Data approach in your context. That change of thinking, and the agility it also brings, will only embed in your organisation if you do-it-yourself. However, I do counsel care in the way you approach gaining this understanding.

A young child wishing to keep up with her friends by migrating from tricycle to bicycle may have a go herself, but may well give up after the third grazed knee. The helpful, if out of breath, dad jogging along behind providing a stabilising hand, helpful guidance, encouragement, and warnings to stay on the side of the road, will result in a far less painful and more rewarding experience.

I am aware of computer/business professionals who are not aware of what Linked Data is, or the benefits it could provide. There are others who have looked at it, do not see how it could be better, but do see potential grazed knees if they go down that path. And there are yet others who have had a go, but without a steadying hand to guide them, and ended up still not getting it.

You want to understand how Linked Data could benefit your organisation? Get some help to relate the benefits to your issues, challenges and opportunities. Don’t go off to a third party and get them to implement something for you. Bring in a steadying hand, encouragement, and guidance to stay on track. Don’t go off and purchase expensive hardware and software to help you explore the benefits of Linked Data. There are plenty of open source stores, or even better just sign up to a cloud-based service such as Kasabi. Get your head around what you have, how you are going to publish and link it, and what the usage might be. Then you can size and specify the technology and/or service you need to support it.

So back to my original question – Is Linked Data DIY a good idea? Yes it is. It is the only way to reap the ‘different way of thinking’ benefits that accompany understanding the application of Linked Data in your organisation. However, I would not recommend a do-it-yourself introduction to this. Get yourself a steadying hand.

Is that last statement a thinly veiled pitch for my services? Of course it is, but that should not dilute my advice to get some help when you start, even if it is not from me.

Picture of girl learning to ride from zsoltika on Flickr.
Source of cartoon unknown.

A Data 7th Wave Approaching

Some in the surfing community will tell you that every seventh wave is a big one. I am getting the feeling, in the world of Web, that a number seven is up next and this one is all about data. The last seventh wave was the Web itself. Because of that, it is a little constraining to talk about this next one only affecting the world of the Web. This one has the potential to shift some significant rocks around on all our beaches and change the way we all interact with and think about the world around us.

Sticking with the seashore metaphor for a short while longer: waves from the technology ocean have the potential to wash into the bays and coves of interest on the coast of human endeavour and rearrange the pebbles on our beaches. Some do not reach every cove, and/or only have minor impact; however, some really big waves reach in everywhere to churn up the sand and rocks, significantly changing the way we do things and ultimately think about the world around us. The post-Web technology waves have brought smaller yet important influences such as ecommerce, social networking, and streaming.

I believe Data, or more precisely changes in how we create, consume, and interact with data, has the potential to deliver a seventh wave impact.  Enough of the grandiose metaphors and down to business.

Data has been around for centuries, from clay tablets to little cataloguing tags on the end of scrolls in ancient libraries, and on into computerised databases that we have been accumulating since the 1960s. Up until very recently these [digital] data have been closed – constrained by the systems that used them, only exposed to the wider world via user interfaces and possibly a task/product specific API. With the advent of many data associated advances, variously labelled Big Data, Social Networking, Open Data, Cloud Services, Linked Data, Microformats, Microdata, Semantic Web, Enterprise Data, they are now venturing beyond those closed systems into the wider world.

Well this is nothing new, you might say, these trends have been around for a while – why does this constitute the seventh wave of which you foretell?

It is precisely because these trends have been around for a while, and are starting to mature and influence each other, that they are building to form something really significant.  Take Open Data for instance where governments have been at the forefront – I have reported before about the almost daily announcements of open government data initiatives.  The announcement from the Dutch City of Enschede this week not only talks about their data but also about the open sourcing of the platform they use to manage and publish it, so that others can share in the way they do it.

In the world of libraries, the Ontology Engineering Group (OEG) at the Universidad Politécnica de Madrid are providing a contribution of linked bibliographic data to the gathering mass, alongside the British and Germans, with 2.4 million bibliographic records from the Spanish National Library. This adds weight to the arguments for a Linked Data future for libraries proposed by the Library of Congress and Stanford University.

I might find some of the activities in the Cloud Computing short-sighted and depressing, yet already the concept of housing your data somewhere other than in a local datacenter is becoming accepted in most industries.

Enterprise use of Linked Data by leading organisations such as the BBC, who are underpinning their online Olympics coverage with it, is showing that it is more than a research tool, or the province only of the open data enthusiasts.

Data Marketplaces are emerging to provide platforms to share and possibly monetise your data. An example that takes this one step further is from the leading Semantic Web technology company, Talis. Kasabi introduces the data mixing, merging, and standardised querying of Linked Data into the data publishing concept. This potentially provides a platform for refining and mixing raw data into new data alloys and products more valuable and useful than their component parts. An approach that should stimulate innovation both in the enterprise and in the data enthusiast community.

The Big Data community is demonstrating that there are solutions to handling the vast volumes of data we are producing that require us to move out of the silos of relational databases towards a mixed economy. Programs need to move, not the data. NoSQL databases, Hadoop, map/reduce: these are all things that are starting to move out of the labs and the hacker communities into the mainstream.

The Social Networking industry which produces tons of data is a rich field for things like sentiment analysis, trend spotting, targeted advertising, and even short term predictions – innovation in this field has been rapid but I would suggest a little hampered by delivering closed individual solutions that as yet do not interact with the wider world which could place them in context.

Schema.org I wrote about a while back: an initiative from the search engine big three to encourage the SEO industry to embed simple structured data in their HTML. The carrot they are offering for this effort is enhanced display in results listings – Google calls these Rich Snippets. When first announced, the Schema.org folks concentrated on Microdata as the embedding format – something that wouldn’t frighten the SEO community horses too much. However they did [over a background of loud complaining from the Semantic Web / Linked Data enthusiasts that RDFa was the only way] also indicate that RDFa would eventually be supported. By engaging with SEO folks on terms that they understand, this move from Schema.org has the potential to get far more structured data published on the Web than any TED Talk from Sir Tim Berners-Lee, preaching from people like me, or guidelines from governments could ever do.

The above short list of pebble stirring waves is both impressive in its breadth and encouraging in its potential, yet none of them are the stuff of a seventh wave.

So what caused me to open up my Macbook and start writing this? It was a post from Manu Sporny, indicating that Google were not waiting for RDFa 1.1 Lite (the RDFa version that Schema.org will support) to be ratified. They are already harvesting, and using, structured information from web pages that has been encoded using RDFa. The use of this structured data has resulted in enhanced display on the Google pages with items such as event date & location information, and recipe preparation timings.

Manu references sites that seem to be running Drupal, the open source CMS software, and specifically a Drupal plug-in for rendering data encoded as RDFa. This approach answers some of the critics of embedding data into a site’s HTML, especially as RDF, who say it is ugly and difficult to understand. It is not there for humans to parse or understand and, with modules such as the Drupal one, humans will not need to get their hands dirty down at code level. Schema.org currently supports a small but important number of ‘things’ in its recognised vocabularies. These, currently supplemented by GoodRelations and Recipes, will hopefully be joined by others to broaden the scope of descriptive opportunities.
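To show roughly what a crawler sees when a page carries this kind of markup, here is a small sketch using only Python's standard library. The HTML fragment is a made-up example in the RDFa Lite style using the schema.org Recipe type; the collector below is a deliberately naive illustration, not a conforming RDFa processor and certainly not how Google does it.

```python
from html.parser import HTMLParser

# Hypothetical page fragment with RDFa Lite attributes (vocab, typeof, property).
PAGE = """
<div vocab="http://schema.org/" typeof="Recipe">
  <h1 property="name">Tomato soup</h1>
  <span property="cookTime" content="PT30M">30 minutes</span>
</div>
"""


class PropertyCollector(HTMLParser):
    """Naively collect the item type and its property values from a page."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._open_property = None  # property whose text we are collecting

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "typeof" in attrs:
            self.item_type = attrs["typeof"]
        if "property" in attrs:
            if "content" in attrs:
                # A machine-readable value overrides the visible text.
                self.properties[attrs["property"]] = attrs["content"]
            else:
                self._open_property = attrs["property"]
                self.properties[self._open_property] = ""

    def handle_data(self, data):
        if self._open_property is not None:
            self.properties[self._open_property] += data

    def handle_endtag(self, tag):
        self._open_property = None


collector = PropertyCollector()
collector.feed(PAGE)
print(collector.item_type, collector.properties)
```

The human reader still sees “Tomato soup” and “30 minutes”; the extra attributes are only there for software, which is why a CMS module can add them without authors ever touching the markup.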

So roll the clock forward, not too far, to a landscape where a large number of sites (incentivised by the prospect of listings as enriched as their competitors’ results) are embedding structured data in their pages as normal practice. By then most if not all web site delivery tools should be able to embed the RDF data automatically. Google and the other web crawling organisations will rapidly build up a global graph of the things on the web, their types, relationships and the pages that describe them. A nifty example of providing a very specific easily understood benefit in return for a change in the way web sites are delivered, that results in a global shift in the amount of structured data accessible for the benefit of all. Google Fellow and SVP Amit Singhal recently gave insight into this Knowledge Graph idea.

The Semantic Web / Linked Data proponents have been trying to convince everyone else of the great good that will follow once we have a web interlinked at the data level with meaning attached to those links.  So far this evangelism has had little success.  However, this shift may give them what they want via an unexpected route.

Once such a web emerges, and most importantly is understood by the commercial world, innovations that will influence the way we interact will naturally follow. A Google TV, with access to such a rich resource, should have no problem delivering an enhanced viewing experience by following structured links embedded in a programme page to information about the cast, the book of the film, the statistics that underpin the topic, or other programmes from the same production company. Our iPhone version next-but-one could be a personal node in a global data network, providing access to relevant information about our location, activities, social network, and tasks.

These slightly futuristic predictions will only become possible on top of a structured network of data, which I believe is what could very well emerge if you follow through on the signs that Manu is pointing out. Reinforced by, and combining with, the other developments I referenced earlier in this post, I believe we may well have a seventh wave approaching. Perhaps I should look at the beach again in five years’ time to see if I was right.

Wave photo from Nathan Gibbs on Flickr.
Declarations – I am a Kasabi Partner and shareholder in Kasabi parent company Talis.

A Kasabi Day at Semtech Berlin

I spent yesterday at the first day of the excellent Semantic Tech and Business Conference 2012 in Berlin. It was a good day covering a wide range of topics, a great range of speakers and talks, and most encouragingly some really good conversations in the breaks. I had the pleasure of presenting the opening session, The Simple Power of the Link, which seemed to provide a good grounding introduction to what to some is a fairly complex topic. My slides are available on Slideshare, and I provided a background article on, if you want to check them out.

In my role as guest blogger for I created an overview of Day 1 sessions I attended and enjoyed.

Something that struck me throughout the day was the number of references to the Kasabi Data Marketplace. Well yes, you might say, you are a Kasabi Partner and Kasabi Staff members Knud Möller and Benjamin Nowack gave presentations. Of course you would be right. However, I also noticed references to it in other presentations and in general conversations.

For example, keynote speaker and ‘Semantic Fireman’ Bart van Leeuwen shared the fact that there is an open publicly available version of the Amsterdam Fire Service Data hosted in Kasabi. The reasoning he gave for doing this was that once he had decided to make his data open, he needed somewhere easy to put it that did not require him to worry about things like infrastructure, servers, and scaling. Kasabi provides that, plus the SPARQL and API access that enables people to play with his data, which he encouraged people to do.

Other reasons for referencing Kasabi seemed to be twofold. Firstly, as with Bart, it is an easy cloud-based place to put your data and let it handle access, APIs and loadings that you initially have no idea about. Secondly, and far less clearly understood, is the idea that the team at Kasabi may have an insight into a possible business model for delivering generic services with Linked Data at the core.

This is not intended to be a sales pitch for Kasabi; the team there can do that very well themselves. I just found it interesting to note that it seems to be hitting a spot in the Semantic Web / Linked Data consciousness that nothing else quite is at the moment.

Declarations – I am a Kasabi Partner and shareholder in Kasabi parent company Talis.