Help spotlight library innovation and send a library linked data practitioner to the SemTechBiz conference in San Francisco, June 2-5


Update from organisers:
We are pleased to announce that Kevin Ford, from the Network Development and MARC Standards Office at the Library of Congress, was selected for the Semantic Spotlight on Innovation for his work with the Bibliographic Framework Initiative (BIBFRAME) and his continuing work on the Library of Congress’s Linked Data Service. In addition to being an active contributor, Kevin is responsible for the BIBFRAME website; has devised tools to view MARC records and the resulting BIBFRAME resources side-by-side; authored the first transformation code for MARC data to BIBFRAME resources; and is project manager for the Library of Congress’s Linked Data Service. Kevin also writes and presents frequently to promote BIBFRAME and ID.LOC.GOV, and to educate fellow librarians on the possibilities of linked data.

Without exception, each nominee represented great work and demonstrated the power of Linked Data in library systems, making it a difficult task for the committee, and sparking some interesting discussions about future such spotlight programs.

Congratulations, Kevin, and thanks to all the other great library linked data projects nominated!


OCLC and LITA are working to promote library participation at the upcoming Semantic Technology & Business Conference (SemTechBiz). Libraries are doing important work with Linked Data, and the aim is to spotlight innovation in libraries and send one library presenter to the SemTechBiz conference, expenses paid.

SemTechBiz brings together today’s industry thought leaders and practitioners to explore the challenges and opportunities jointly impacting both business leaders and technologists. Conference sessions include technical talks and case studies that highlight semantic technology applications in action. The program includes tutorials and over 130 sessions and demonstrations as well as a hackathon, start-up competition, exhibit floor, and networking opportunities.  Amongst the great selection of speakers you will find yours truly!

If you know of someone who has done great work demonstrating the benefit of linked data for libraries, nominate them for this June 2-5 conference in San Francisco. This “library spotlight” opportunity will provide one sponsored presenter with a spot on the conference program, paid travel & lodging costs to get to the conference, plus a full conference pass.

Nominations for the Spotlight are being accepted through May 10th.  Any significant practical work should have been accomplished prior to March 31st 2013, although the project can be ongoing.   Self-nominations will be accepted.

Even if you do not nominate anyone, the Semantic Technology and Business Conference is well worth experiencing.  As supporters of the Library Spotlight, OCLC and LITA members will get a 50% discount on a conference pass – use discount code “OCLC” or “LITA” when registering.  (Non-members can still get a 20% discount for this great conference by quoting code “FCLC”.)

For more details check out the OCLC Innovation Series page.

Thank you for all the nominations we received for the first Semantic Spotlight on Innovation in Libraries.



Day three of the Semantic Tech & Business Conference in San Francisco brought us a panel discussion, populated by an impressive array of names and organisations:

Ivan Herman, World Wide Web Consortium
Alexander Shubin, Yandex
Dan Brickley, Google
Evan Sandhaus, New York Times Company
Jeffrey W. Preston, Disney Interactive Media Group
Peter Mika, Yahoo!
R.V. Guha, Google
Steve Macbeth, Microsoft

This well-attended panel started with a bit of a crisis – the stage in the room was not large enough to seat all of the participants, causing a quick call out for bar seats and much microphone passing.  Somewhat reflective of the crisis of concern about the announcement made immediately prior to last year’s event, which precipitated the hurried arrangement of a birds of a feather session to settle fears and disquiet in the semantic community.

When I asked a fellow audience member what they thought of this session, they replied that there wasn’t much new said.  In my opinion that is a symptom of good things happening around the initiative.  They were right in saying that there was nothing substantive said, but there were some interesting pieces that came out of what the participants had to say.  Guha indicated that Google were already seeing that 7-10% of pages crawled contained mark-up, surprising growth in such a short time.  Steve Macbeth confirmed that Microsoft were also seeing around 7%.

Another unexpected but interesting insight from Microsoft was that they are looking to use mark-up as a way to pass data between applications in Windows 8.  All the search engine folks were playing it close to their chests when asked what they were actually using the structured data they were capturing from mark-up for – lots of talk about projects around better search algorithms and indexing.  Guha indicated that the data was not siloed inside Google.  As with any other data it was used across the organisation, including within the Google Knowledge Graph functionality.

Jeffrey Preston responded to a question about the tangible benefits of applying mark-up by describing how kids searching for games on the Disney site were being directed more accurately to the game itself, as against pages that referenced it.  Evan Sandhaus described how it enabled a far easier integration with a vendor who could access their article data without having to work with a specific API.  Guha spoke about how a Veterans job search site was created with the Department of Defence, as they could constrain their search to sites which included mark-up and identified jobs as appropriate for Veterans.

In questions from the floor, the panel explained the best way of introducing schema extensions, using the IPTC rNews as an example – get industry consensus to provide a well formed proposal and then be prepared to be flexible.   All done via the W3C hosted Public Vocabs List.

All good progress in only a year!

Richard Wallis is Technology Evangelist at OCLC and Founder of Data Liberate


So where have I been?   I announce that I am now working as a Technology Evangelist for the library behemoth OCLC, and then promptly disappear.  The only excuse I have for deserting my followers is that I have been kind of busy getting my feet under the OCLC table, getting to know my new colleagues, the initiatives and projects they are engaged with, the longer term ambitions of the organisation, and of course the more mundane issues of getting my head around the IT, video conferencing, and expense claim procedures.

It was therefore great to find myself in San Francisco once again for the Semantic Tech & Business Conference (#SemTechBiz) for what promises to be a great program this year.  Apart from meeting old and new friends amongst those interested in the potential and benefits of the Semantic Web and Linked Data, I am hoping for a further step forward in the general understanding of how this potential can be realised to address real world challenges and opportunities.

As Paul Miller reported, 75% of the audience at the opening session were first time visitors.  Just like the cityscape vista presented to those attending the speakers’ reception yesterday on the 45th floor of the conference hotel, I hope these new visitors get a stunningly clear view of the landscape around them.

Of course I am doing my bit to help on this front by trying to cut through some of the more technical geek-speak. Tuesday 8:00am will find me in Imperial Room B presenting The Simple Power of the Link – a 30 minute introduction to Linked Data, its benefits and potential, without the need to get your head around the more esoteric concepts of Linked Data such as triple stores, inference, ontology management etc.  I would not only recommend this session as an introduction for those new to the topic, but also for those well versed in the technology as a reminder that we sometimes miss the simple benefits when trying to promote our baby.

For those interested in the importance of these techniques and technologies to the world of Libraries, Archives and Museums I would also recommend a panel that I am moderating on Wednesday at 3:30pm in Imperial B – Linked Data for Libraries Archives and Museums.  I will be joined by LOD-LAM community driver Jon Voss, Stanford Linked Data Workshop Report co-author Jerry Persons, and Sung Hyuk Kim from the National Library of Korea.  As moderator I will not only let the four of us make small presentations about what is happening in our worlds, I will also insist that at least half the time is there for questions from the floor, so bring them along!

I am not only surfacing at Semtech, I am beginning to see, at last, the technologies being discussed surfacing as mainstream.  We in the Semantic Web/Linked Data world are very good at frightening off those new to it.  However, driven by pragmatism in search of a business model and by related initiatives, it is starting to become mainstream by default.  One very small example being Yahoo!’s Peter Mika telling us, in the Semantic Search workshop, that RDFa is the predominant format for embedding structured data within web pages.

Looking forward to a great week, and soon more time to get back to blogging!


One of the more eagerly awaited presentations at the Semantic Tech & Business Conference in Berlin today was a late addition to the program from Denny Vrandecic.  With the prominence of DBpedia in the Linked Open Data Cloud, anything new from Wikipedia with data in it was bound to attract attention, and we were not disappointed.

Denny started by telling us that from March he would be moving to Berlin to work for the Wikimedia Foundation on WikiData.

He then went on to explain that the rich Wikipedia resource may have much of the world’s information but does not have all the answers.  There are vast differences in coverage between language versions, for instance.  Also it is not good at answering questions such as what are the 10 largest cities with a female mayor. You get some cities back but most if not all of them do not have a female mayor.   One way to address this issue that has proliferated in Wikipedia is Lists.  The problem with lists is that there are so many of them, in several languages, often with duplicates, and then there is the array of lists of lists.

We must accept Wikipedia doesn’t have all the answers – humans can read articles but computers cannot understand the meaning.  Articles created in WikiData on a topic will point to the relevant Wikipedia articles in all languages.

DBpedia has been a great success at extracting information from Wikipedia info-boxes and publishing it as data, but it is not editable.  WikiData will turn that model on its head, by providing an editable environment for data that will then be used to automatically populate the info-boxes.  WikiData will also reference secondary databases, for example indicating that the CIA World Factbook provides a value for something.

WikiData will not define the truth, it will collect the references to the data.

Denny listed the objectives of the WikiData project to be:

  • Provide a database of the world’s knowledge that anyone can edit
  • Collect references and quotes for millions of data items
  • Engage a sustainable community that collects data from everywhere in a machine-readable way
  • Increase the quality and lower the maintenance costs of Wikipedia and related projects
  • Deliver software and community best practices enabling others to engage in projects of data collection and provisioning

WikiData phase 1, which should be complete in the summer, includes creating one WikiData page for each Wikipedia entity, which then lists representations in each language.  The individual language versions will then pull the language links from WikiData.

The second phase will include the centralisation of data values for info-boxes and then have the Wikipedias populate their info-boxes from WikiData.

The final phase will be to enable inline queries against WikiData to be made from Wikipedias with the results surfaced in several formats.

Denny did not provide a schedule for the second and third phases.

This is all in addition to the ability to provide free, re-usable, machine-readable access to the world’s data.

The beginnings of an interesting project from Wikimedia that could radically influence the data landscape – well worth watching as it progresses.


I spent yesterday at the first day of the excellent Semantic Tech and Business Conference 2012 in Berlin.  It was a good day covering a wide range of topics, with a great range of speakers and talks, and most encouragingly some really good conversations in the breaks.  I had the pleasure of presenting the opening session, The Simple Power of the Link, which seemed to provide a good grounding introduction to what to some is a fairly complex topic.  My slides are available on Slideshare, and I also provided a background article, if you want to check them out.

In my role as guest blogger, I created an overview of the Day 1 sessions I attended and enjoyed.

Something that struck me throughout the day was the number of references to the Kasabi Data Marketplace.  Well yes, you might say, you are a Kasabi Partner and Kasabi Staff members Knud Möller and Benjamin Nowack gave presentations.  Of course you would be right.  However, I also noticed references to it in other presentations and in general conversations.

For example, keynote speaker and ‘Semantic Fireman’ Bart van Leuwen shared the fact that there is an open, publicly available version of the Amsterdam Fire Service Data hosted in Kasabi.  The reasoning he gave for doing this was that once he had decided to make his data open, he needed somewhere easy to put it that did not require him to worry about things like infrastructure, servers, and scaling.  Kasabi provides that, plus the SPARQL and API access that enables people to play with his data, which he encouraged people to do.

Other reasons for referencing Kasabi seemed to be twofold.  Firstly, as with Bart, it is an easy cloud-based place to put your data and let it handle access, APIs and loadings that you initially have no idea about.  Secondly, and far less clearly understood, is the idea that the team at Kasabi may have an insight into a possible business model for delivering generic services with Linked Data at the core.

This is not intended to be a sales pitch for Kasabi, the team there can do that very well themselves.  I just found it interesting to note that it seems to be hitting a spot in the Semantic Web / Linked Data consciousness that nothing else quite is at the moment.

Declarations – I am a Kasabi Partner and shareholder in Kasabi parent company Talis.


This post was initially just going to be about the presentation The Simple Power of the Link that I gave in the opening session of The Semantic Tech & Business Conference in London earlier this week.  However I realise now that its title, chosen to draw attention to the core utility and power of the basic links in Linked Data, has resonance and relevance for the conference as a whole.

This was the first conference in the long running Semtech series to venture into Europe as well as include the word business in its name.  This obviously recognises the move, from San Francisco based geekdom to global pragmatic usefulness, that the Semantic Web in general and Linked Data in particular are in the process of undertaking.  A maturing of the market that we in Talis Consulting can attest to, having assisted many organisations with their understanding and adoption of Linked Data.   In addition to those attendees that I would characterise as the usual suspects, the de-geeking of the topic attracted many who would not previously have visited such a conference.   An unscientific show of hands, prior to the session in which I presented, indicated that about half of the audience were new to this Semantic stuff.

Traffic on our stand in the exhibition area also supported this view, with most discussions being about how the techniques and technologies could be applied, and how Talis could help, as against esoteric details of the technologies themselves.  Linking people with real world issues and opportunities with the people that have the experience to help them was an obvious benefit of the event, in addition to having some great keynotes, as Rob described.

So, back to my initial purpose.

The presentation “The Simple Power of the Link” was an attempt to simplify the core principles of Linked Data so as to highlight their implicit benefits.  The aforementioned geekdom that has surrounded the Semantic Web has, unfairly in my mind, gained the topic a reputation for being complex and difficult and hence not really applicable in the mainstream.  A short Google on the topic will rapidly turn up a set of esoteric research papers and discussions around things such as inferencing, content-negotiation and SPARQL – a great way to put off those looking for an easy way in.

There is a great similarity to the way something like vehicle engineering is promoted with references to turbos, self levelling suspensions, flappy-paddle gearshifts, iPod docks and the like – missing the point for the [just landed on the planet] new to the topic – that the major utility of any vehicle is that it goes, stops, steers, and gets you from A to B.

The core utility of Linked Data is The Link.  A simple way to indicate the relationship between one thing and another thing.

As things in Linked Data are represented by http URIs, which when looked up should return you some data containing links to other things, an implicit web of relationships emerges that you can follow to obtain more related information. This basic, and powerful, utility could be simply realised with data encoded as triples in RDF, served from a simple file structure by a web server.
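To make that concrete, here is a minimal sketch in Python of what "following the links" amounts to. It is only an illustration under some loud assumptions: every URI is a made-up example.org identifier, the `TRIPLES` list stands in for data that would really be published as RDF, and the `describe` function stands in for an HTTP lookup of a URI. The names `describe` and `follow_links` are hypothetical and not taken from any library.

```python
# Hypothetical example data: each triple is (subject, predicate, object).
# All URIs are invented example.org identifiers, not real Linked Data.
TRIPLES = [
    ("http://example.org/id/book/1", "http://example.org/terms/author", "http://example.org/id/person/7"),
    ("http://example.org/id/book/1", "http://example.org/terms/title", "An Example Book"),
    ("http://example.org/id/person/7", "http://example.org/terms/name", "A. N. Author"),
    ("http://example.org/id/person/7", "http://example.org/terms/bornIn", "http://example.org/id/place/3"),
    ("http://example.org/id/place/3", "http://example.org/terms/name", "Exampleville"),
]


def describe(uri, triples=TRIPLES):
    """Stand-in for looking up a URI over HTTP: return the triples about it."""
    return [t for t in triples if t[0] == uri]


def follow_links(start, triples=TRIPLES):
    """Walk outward from `start`, following every object that is itself an http URI."""
    seen = []
    queue = [start]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        for _subject, _predicate, obj in describe(uri, triples):
            # A literal value (a title, a name) is a dead end; a URI is a link to follow.
            if obj.startswith("http://") and obj not in seen:
                queue.append(obj)
    return seen


if __name__ == "__main__":
    for thing in follow_links("http://example.org/id/book/1"):
        print(thing, describe(thing))
```

Starting from the book, the walk reaches the person and then the place purely by following links in the data, with no triple store, inference or query language involved.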

So, although things like triple stores, OWL, relational-to-RDF mapping tools, named graphs, SPARQL 1.1, and [the dreaded] httpRange-14 are important issues for those embedded in Linked Data, the overwhelming benefits that accrue from applying Linked Data come from those basic triples – the links.  As a community I believe that we can be rightly accused of not making that clear enough.  Something that my colleagues in Talis Consulting and I attempt to address whenever possible.  Especially at our open Linked Data events.

This post was also published on the Talis Consulting Blog