What Brian Cant Never Taught You About Metadata a presentation by drew mclellan

Everything You Know About Metadata is Wrong

How I Learned to Stop Worrying and Love The Data

geek in the park: metadata, html, robots, 1970s/80s children’s television programming, tofu, truth, honesty, and some made-up rules stated as absolutes.

enough about you, let’s talk about me for a minute. allinthehead.com edgeofmyseat.com webstandards.org microformats.org drew mclellan

enough about me, let’s talk about Cant for a while. brian cant

Brian Cant taught us lots of things.

[Chart: everything I know, divided between Brian Cant and other sources; values shown: 20% and 80%]

[Chart: of that 80%, divided between metadata and other important stuff; values shown: 1% and 99%]

THERE’S MORE THAN BRIAN WAS LETTING ON.

Brian taught us to share. sharing is important

The web is all about sharing just ask Humpty and Jemima.

We use the web to share like Brian taught us. knowledge information data the web is primarily a tool for sharing

What are we sharing? all types of data common obscure

Common data names addresses dates & times things for sale reviews

Obscure data your auntie’s hat collection every place Paul McCartney has sneezed since 1962 how many days a web server has been up

All data is potentially useful to someone else. This is why we publish it.

Brian taught us to tell the truth

Data is only useful if it’s correctly described but we’ll come onto that in a bit.

So, metadata then. Metadata is data about other data. It enables you to unlock data.

Bus timetable

Audio

Photographs

Metadata is everywhere. Often hidden away, but doesn’t deserve to be.

The more exposed metadata is, the more useful it is and the more useful the original data becomes.

Rule #1 Beware dark data. Hidden data gets forgotten and goes out of date.

it’s not complicated metadata is simpler than it sounds

sunny

yesterday’s weather: sunny

sl6 8aj

postcode: sl6 8aj

1980-02-21

date of birth: 1980-02-21

Information is data put into context. Data is grand, but on its own, without context, it cannot inform.

Metadata puts data into context and turns it into information. Information is even better than data.
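As a small illustration of the three examples above, here is a sketch of how a label lets a program do something sensible with a value. The `interpret` function and its two rules are assumptions made up for this example; only the labels and values come from the slides.

```python
from datetime import date

# The bare values from the slides: on their own they cannot inform.
bare_values = ["sunny", "sl6 8aj", "1980-02-21"]

# The same values with their labels (the metadata) attached.
labelled = {
    "yesterday's weather": "sunny",
    "postcode": "sl6 8aj",
    "date of birth": "1980-02-21",
}

def interpret(label, value):
    """Use the label (metadata) to decide how the value (data) should be read."""
    if label == "date of birth":
        return date.fromisoformat(value)  # a real date, not just a string
    if label == "postcode":
        return value.upper()              # illustrative normalisation only
    return value

information = {label: interpret(label, value) for label, value in labelled.items()}
print(information["date of birth"].year)
```

Without the label, `1980-02-21` is just a string; with it, a consumer knows it is safe to treat the value as a date.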

information is 3 times better than data [Chart: betterness of data vs. information]

Metadata isn’t new to the web no more than stalking is new to Facebook

XML is a good example
<building>
  <colour>orange</colour>
  <type>house</type>
  <doors>1</doors>
  <windows>0</windows>
</building>

XML is a good example: define your own schema, describe the data you have
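Because every value in the building example is wrapped in a descriptive element, a program can read it without guessing. A minimal sketch using Python's standard `xml.etree.ElementTree`; the `read_building` helper name is an assumption for this example.

```python
import xml.etree.ElementTree as ET

# The building document from the slide.
BUILDING_XML = """
<building>
  <colour>orange</colour>
  <type>house</type>
  <doors>1</doors>
  <windows>0</windows>
</building>
"""

def read_building(xml_text):
    """Map each child element's name (the metadata) to its text (the data)."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: child.text for child in root}

building = read_building(BUILDING_XML)
print(building)
```

The element names are the self-defined schema; the values remain plain strings until a consumer decides how to interpret them.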

Semantics and metadata aren’t identical concepts. They’re different ideas, but on the web there’s a lot of overlap.

HTML has a basic set of tags. Some enable us to communicate meaning. Some put data into context. Often both these things.

HTML has a basic set of tags Some enable us to communicate meaning

<p> <h1> <h2> <h3> <h4> <h5> <h6> (useful but not great metadata)

HTML has a basic set of tags Some put data into context Often both these things

<title> <address>

HTML enables us to add metadata: the HTML class attribute. <span class="name">Drew</span> This is a very useful technique.

HTML enables us to add metadata this makes HTML extremely flexible a good thing indeed
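To show why the class attribute is useful to a consumer, here is a rough sketch that pulls out the text of any element carrying a given class, using Python's standard `html.parser`. The `ClassTextCollector` name is an assumption, and the sketch ignores void elements such as `<br>` inside a match.

```python
from html.parser import HTMLParser

class ClassTextCollector(HTMLParser):
    """Collect the text of elements that carry a given class name."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.depth = 0    # greater than 0 while inside a matching element
        self.found = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.wanted in classes:
            self.depth += 1
            if self.depth == 1:
                self.found.append("")  # start a new match

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.found[-1] += data

collector = ClassTextCollector("name")
collector.feed('<p>Hello, <span class="name">Drew</span>!</p>')
collector.close()
names = collector.found
print(names)
```

The markup still reads normally to a human; the class simply tells a machine which part of the sentence is the name.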

There’s a really obvious example of metadata use in HTML surely you’ve already thought of it

HTML META a.k.a. meta tags been around since HTML 2

HTML META: keywords, description, author, copyright, date, Dublin Core. The spec lists no legal values.

HTML META

<meta name="keywords" content="vacation, Greece, sunshine" />
<meta name="description" content="My holiday in Greece" />
<meta name="author" content="Drew McLellan" />
<meta name="copyright" content="Drew McLellan 2008" />
<meta name="date" content="2008-06-12T12:03:56+0100" />
<meta name="DC.identifier" content="http://www.ietf.org/rfc/rfc1866.txt" />
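Any consumer, not just a search engine, can read these elements. A minimal sketch with Python's standard `html.parser` that collects name/content pairs; the `MetaCollector` class and the sample page string are assumptions built from the tags on the slide.

```python
from html.parser import HTMLParser

class MetaCollector(HTMLParser):
    """Collect <meta name="..." content="..."> pairs into a dictionary."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <meta ... /> tags also arrive here by default.
        if tag == "meta":
            attributes = dict(attrs)
            if "name" in attributes and "content" in attributes:
                self.meta[attributes["name"]] = attributes["content"]

PAGE = """
<head>
  <meta name="description" content="My holiday in Greece" />
  <meta name="author" content="Drew McLellan" />
  <meta name="DC.identifier" content="http://www.ietf.org/rfc/rfc1866.txt" />
</head>
"""

collector = MetaCollector()
collector.feed(PAGE)
collector.close()
meta = collector.meta
print(meta["author"])
```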

The use of META elements hasn’t been plain sailing: many web designers don’t know how to use them properly, leading to inconsistent use and dark data.

Many misunderstand the purpose. I’m looking at you, web marketeers, and you, so-called SEO experts.

Many misunderstand the purpose. META tags aren’t for search engines. META tags are used by search engines. META tags are for describing the data.

Many misunderstand the purpose: to provide a means to discover that the data set exists and how it might be obtained or accessed; and to document the content, quality, and features of a data set, indicating its fitness for use.

Rule #2 The more you lie, the less you can be trusted and the less valuable your info becomes.

Rule #2 This is something Brian Cant taught us.

Rule #3 The fewer distinct consumers, the less valuable the metadata becomes over time.

Rule #3 Only search engines really used META keywords and descriptions. Authors began writing them targeted at search engines: “how do I get well ranked?” vs. “how do I describe this data?”

Rule #3 Search engines can no longer trust keywords and descriptions. Abuse has spoiled it for everyone. Brian Cant never said anything about that.

What have we learned so far? Sharing is good: the web is for sharing. Metadata isn’t new, IRL or on the web. HTML gives us ways to express metadata. It only works if we tell the truth.

Part 2: We need thems robots on our side

robots are either with us or against us

we don’t want them against us so we’d better co-operate

robots can save us time and effort

yay.

tofu robot says: data is everywhere

There are lots of idioms for data: opening times, event details, addresses.

Idioms are good. They’re not always formal; you don’t need to be formal to be understood.

Informal is good, but consistency is important. Let’s look at why...

Humans are quick to adapt: we can easily re-evaluate and adjust; we can climb stairs without a trip to the workshop.

Robots prefer patterns: they rely on known patterns. Patterns can be formal or informal, but must be consistent and repeatable.
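To make the point about consistency concrete, here is a sketch of a robot that understands one informal but consistent way of writing opening times. The `Mon-Fri 09:00-17:30` idiom, the regular expression and the `parse_opening_times` helper are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical informal-but-consistent idiom: "Mon-Fri 09:00-17:30".
OPENING_TIMES = re.compile(r"^(\w{3})-(\w{3}) (\d{2}:\d{2})-(\d{2}:\d{2})$")

def parse_opening_times(text):
    """Return the parts of an opening-times string, or None if it breaks the pattern."""
    match = OPENING_TIMES.match(text)
    if match is None:
        return None
    first_day, last_day, opens, closes = match.groups()
    return {"days": (first_day, last_day), "opens": opens, "closes": closes}

# A human copes with both of these; the robot only copes with the first.
consistent = parse_opening_times("Mon-Fri 09:00-17:30")
inconsistent = parse_opening_times("weekdays, nine till half five")
print(consistent, inconsistent)
```

The pattern does not need to be a formal standard; it only needs to be written the same way every time.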

Humans like patterns too: we like routine, we like repeating patterns. Robots like patterns because they are repeatable; we like patterns because we don’t want to think. Thinking is hard, uncomfortable and inconvenient.

[Chart: thinking, with segments Hard, Uncomfortable, Inconvenient, and Prone to error; values shown: 45%, 29%, 4%, 21%]

So as it turns out what’s good for thems robots is good for us too

it’s not complicated: metadata is good, so we want to use it. our metadata challenge: need to embrace reusable patterns; avoid dark data; avoid data specific to any one consumer; make it easy to be truthful; embrace existing idioms; reuse existing technology

remember this? <span class="name">Drew</span> avoid dark data; avoid data specific to any one consumer; make it easy to be truthful; embrace existing idioms; reuse existing technology; embrace reusable patterns

microformats just a bunch of patterns

names and addresses: hCard, based on vCard. Properties include given-name, family-name, email, url, tel, title, org, street-address, locality.

names and addresses: hCard, based on vCard

<p class="vcard">
  The announcement followed calls by
  <span class="org">Apple</span>
  <span class="role">Chief Executive</span>
  <span class="fn">Steve Jobs</span>
  earlier this year...
</p>
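Here is a rough sketch of how a robot might read the hCard above, using only Python's standard `html.parser`. It is not a full microformats parser: the `HCardReader` class is an assumption, it knows only the three properties used in this example, and it expects a single card with no nested properties.

```python
from html.parser import HTMLParser

# Only the properties used in the slide's example; hCard defines many more.
HCARD_PROPERTIES = {"fn", "org", "role"}

class HCardReader(HTMLParser):
    """Read a single, flat hCard into a dictionary of property -> text."""

    def __init__(self):
        super().__init__()
        self.in_card = False
        self.current = None  # property currently being read, if any
        self.card = {}

    def handle_starttag(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        if "vcard" in classes:
            self.in_card = True
        elif self.in_card:
            match = classes & HCARD_PROPERTIES
            if match:
                self.current = sorted(match)[0]
                self.card[self.current] = ""

    def handle_endtag(self, tag):
        self.current = None  # properties in this sketch never nest

    def handle_data(self, data):
        if self.current:
            self.card[self.current] += data

HTML = (
    '<p class="vcard">The announcement followed calls by '
    '<span class="org">Apple</span> '
    '<span class="role">Chief Executive</span> '
    '<span class="fn">Steve Jobs</span> earlier this year...</p>'
)

reader = HCardReader()
reader.feed(HTML)
reader.close()
card = reader.card
print(card)
```

The sentence is unchanged for human readers; the class names are what let the robot pick out the organisation, role and name. For real use, the parsers listed on the microformats wiki are the better starting point.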

events and dates: hCalendar, based on iCalendar. Properties include dtstart, dtend, location, url, description, summary.
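As a sketch of the publishing side, the following builds hCalendar markup for one event from plain values. The `hcalendar` function name and the event details are hypothetical placeholders; the class names (`vevent`, `summary`, `dtstart`, `location`) are the hCalendar ones, and the machine-readable date goes in the `title` of an `abbr` element.

```python
from html import escape

def hcalendar(summary, dtstart, location):
    """Build a minimal hCalendar event; values are escaped for safe use in HTML."""
    return (
        '<div class="vevent">'
        f'<span class="summary">{escape(summary)}</span> on '
        f'<abbr class="dtstart" title="{escape(dtstart)}">{escape(dtstart)}</abbr> at '
        f'<span class="location">{escape(location)}</span>'
        '</div>'
    )

# Placeholder event details, not taken from the presentation.
markup = hcalendar("Example event", "2008-01-01", "Example venue & park")
print(markup)
```

Generating the markup from the same values that appear on the page makes it easy to be truthful: there is no second, hidden copy of the data to drift out of date.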

reviews: hReview. Properties include item, reviewer, rating, description, summary, photo.

relationships: XFN. Values include contact, acquaintance, met, co-worker, friend, colleague, neighbor, child, parent, sweetheart, crush, me.
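XFN values are written in the `rel` attribute of ordinary links, so reading them takes very little code. A minimal sketch, again with Python's standard `html.parser`; the `XFNReader` class and the example.com and example.org links are hypothetical.

```python
from html.parser import HTMLParser

class XFNReader(HTMLParser):
    """Map each link's href to the list of rel values found on it."""

    def __init__(self):
        super().__init__()
        self.relationships = {}

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            if attributes.get("href") and attributes.get("rel"):
                self.relationships[attributes["href"]] = attributes["rel"].split()

# Hypothetical blogroll; the addresses are placeholders.
BLOGROLL = """
<ul>
  <li><a href="http://example.com/alice" rel="friend met">Alice</a></li>
  <li><a href="http://example.org/me" rel="me">My other site</a></li>
  <li><a href="http://example.com/plain">No relationship stated</a></li>
</ul>
"""

reader = XFNReader()
reader.feed(BLOGROLL)
reader.close()
relationships = reader.relationships
print(relationships)
```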

many more licenses tags date-based feeds directories products payments geolocation more

remember this? <span class="name">Drew</span> avoid dark data; avoid data specific to any one consumer; make it easy to be truthful; embrace existing idioms; reuse existing technology; embrace reusable patterns

Brian Cant never knew this but I bet he’d be thrilled.

microformats are good: a humane method for using metadata on the web, easy for us to implement, readable by our robotic friends.

hCard

hCalendar

For robot masters http://microformats.org/wiki/parsers http://tools.microformatic.com/

For humans http://microformats.org/

For humans http://oreilly.com/

For humans http://microformatique.com/book/

So that’s What Brian Cant Never Taught You About Metadata. Thank you. http://allinthehead.com/presentations

Creative Commons photos used: http://flickr.com/photos/gperez/4393118/ http://flickr.com/photos/warmnfuzzy/466382466/ http://flickr.com/photos/stevegarfield/194648339/ http://flickr.com/photos/donsolo/2385041554/