A presentation at Geek in the Park in Royal Leamington Spa, UK by Drew McLellan
What Brian Cant Never Taught You About Metadata a presentation by drew mclellan
or: Everything You Know About Metadata is Wrong
or: How I Learned to Stop Worrying and Love The Data
geek in the park
metadata, html, robots, 1970s/80s children’s television programming, tofu, truth, honesty, and some made up rules stated as absolutes.
enough about you, let’s talk about me for a minute. allinthehead.com edgeofmyseat.com webstandards.org microformats.org drew mclellan
enough about me, let’s talk about Cant for a while. brian cant
Brian Cant taught us lots of things.
[Chart: everything I know, split between Brian Cant and other sources; values shown: 20%, 80%]
[Chart: of that 80%, split between other important stuff and metadata; values shown: 1%, 99%]
THERE’S MORE THAN BRIAN WAS LETTING ON.
Brian taught us to share. sharing is important
The web is all about sharing just ask Humpty and Jemima.
We use the web to share like Brian taught us. knowledge information data the web is primarily a tool for sharing
What are we sharing? all types of data common obscure
Common data names addresses dates & times things for sale reviews
Obscure data your auntie’s hat collection every place Paul McCartney has sneezed since 1962 how many days a web server has been up
this is why we publish it All data is potentially useful to someone else.
Brian taught us to tell the truth
Data is only useful if it’s correctly described but we’ll come onto that in a bit.
So, metadata then Metadata is data about other data enables you to unlock data
Bus timetable
Audio
Photographs
Metadata is everywhere Often hidden away, but doesn’t deserve to be.
The more exposed metadata is, the more useful it is and the more useful the original data becomes.
Rule #1 Beware dark data. Hidden data gets forgotten and goes out of date.
it’s not complicated metadata is simpler than it sounds
yesterday’s weather: sunny
postcode: sl6 8aj
date of birth: 1980-02-21
Information is data put into context data is grand on its own without context it cannot inform
Metadata puts data into context turns it into information information is even better than data
information is 3 times better than data [Chart axis: Betterness]
Metadata isn’t new to the web no more than stalking is new to Facebook
XML is a good example
<building>
  <colour>orange</colour>
  <type>house</type>
  <doors>1</doors>
  <windows>0</windows>
</building>
XML is a good example define your own schema describe the data you have
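As a rough sketch of why those element names matter, the snippet here reads the building example with Python's standard `xml.etree.ElementTree` module and pairs each value with the tag that describes it. Only the XML comes from the slide; the function and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The XML from the slide; the element names are the metadata.
BUILDING_XML = """
<building>
  <colour>orange</colour>
  <type>house</type>
  <doors>1</doors>
  <windows>0</windows>
</building>
"""


def describe(xml_text):
    """Return {tag: text} for each child of the root element (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: child.text for child in root}


building = describe(BUILDING_XML)
for name, value in building.items():
    # Without the tag name, "1" and "0" would be bare data with no context.
    print(f"{name}: {value}")
```

Note that every value comes back as a string: the tags say what each value is about, not what type it is.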
Semantics and metadata aren’t identical concepts different ideas on the web there’s a lot of overlap
HTML has a basic set of tags Some enable us to communicate meaning Some put data into context Often both these things
HTML has a basic set of tags Some enable us to communicate meaning
<p> <h1> <h2> <h3> <h4> <h5> <h6> (useful but not great metadata)
HTML has a basic set of tags Some put data into context Often both these things
<title> <address>
HTML enables us to add metadata the HTML class attribute <span class="name">Drew</span> this is a very useful technique
HTML enables us to add metadata this makes HTML extremely flexible a good thing indeed
There’s a really obvious example of metadata use in HTML surely you’ve already thought of it
HTML META a.k.a. meta tags been around since HTML 2
HTML META keywords description author copyright date Dublin Core the spec lists no legal values
HTML META
<meta name="keywords" content="vacation, Greece, sunshine" />
<meta name="description" content="My holiday in Greece" />
<meta name="author" content="Drew McLellan" />
<meta name="copyright" content="Drew McLellan 2008" />
<meta name="date" content="2008-06-12T12:03:56+0100" />
<meta name="DC.identifier" content="http://www.ietf.org/rfc/rfc1866.txt" />
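To make the idea of a consumer concrete, here is a minimal sketch of how a robot might read META elements, using Python's standard `html.parser` module. The `MetaCollector` class, the `read_meta` helper and the sample page are assumptions for illustration, not how any particular search engine works.

```python
from html.parser import HTMLParser

# A cut-down page using three of the META elements from the slide.
PAGE = """
<head>
<meta name="keywords" content="vacation, Greece, sunshine" />
<meta name="description" content="My holiday in Greece" />
<meta name="author" content="Drew McLellan" />
</head>
"""


class MetaCollector(HTMLParser):
    """Collects name/content pairs from <meta> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also arrive here by default.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is not None:
            self.meta[name] = attributes.get("content") or ""


def read_meta(html):
    """Parse an HTML string and return its META name/content pairs."""
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    return parser.meta


for name, content in read_meta(PAGE).items():
    print(f"{name}: {content}")
```

The parser simply reports what the author wrote. It has no way to tell whether the description is truthful, which is why the metadata is only as valuable as the honesty behind it.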
The use of META elements hasn’t been plain sailing many web designers don’t know how to use them properly leading to inconsistent use dark data
Many misunderstand the purpose I’m looking at you, web marketeers and you, so-called SEO experts
Many misunderstand the purpose META tags aren’t for search engines META tags are used by search engines META tags are for describing the data
Many misunderstand the purpose to provide a means to discover that the data set exists and how it might be obtained or accessed; and to document the content, quality, and features of a data set, indicating its fitness for use
Rule #2 The more you lie, the less you can be trusted and the less valuable your info becomes.
Rule #2 This is something Brian Cant taught us.
Rule #3 The fewer distinct consumers, the less valuable the metadata becomes over time.
Rule #3 Only search engines really used META keywords, descriptions Authors began writing targeted for search engines “how do I get well ranked?” vs. “how do I describe this data?”
Rule #3 Search engines can no longer trust keywords, descriptions Abuse has spoiled it for everyone Brian Cant never said anything about that.
What have we learned so far? Sharing is good - the web is for sharing Metadata isn’t new IRL or on the web HTML gives us ways to express metadata It only works if we tell the truth
Part 2: We need thems robots on our side
robots are either with us or against us
we don’t want them against us so we’d better co-operate
robots can save us time and effort
yay.
tofu robot says: data is everywhere
There are lots of idioms for data Opening times Event details Addresses
Idioms are good they’re not always formal you don’t need to be formal to be understood
Informal is good but consistency is important let’s look at why...
Humans are quick to adapt we can easily re-evaluate and adjust we can climb stairs without a trip to the workshop
Robots prefer patterns they rely on known patterns patterns can be formal or informal must be consistent and repeatable
Humans like patterns too we like routine we like repeating patterns robots like patterns because they are repeatable we like patterns because we don’t want to think thinking is hard, uncomfortable and inconvenient.
[Chart: thinking, split into hard, uncomfortable, inconvenient and prone to error; values shown: 45%, 29%, 4%, 21%]
So as it turns out what’s good for thems robots is good for us too
it’s not complicated metadata is good - so we want to use it our metadata challenge need to embrace reusable patterns avoid dark data avoid specific data for any consumer make it easy to be truthful embrace existing idioms reuse existing technology
remember this? <span class="name">Drew</span> avoid dark data avoid specific data for any consumer make it easy to be truthful embrace existing idioms reuse existing technology need to embrace reusable patterns
microformats just a bunch of patterns
names and addresses hCard - based on VCARD given-name family-name email url tel title org street-address locality
names and addresses hCard - based on VCARD
<p class="vcard">The announcement followed calls by <span class="org">Apple</span> <span class="role">Chief Executive</span> <span class="fn">Steve Jobs</span> earlier this year...
</p>
events and dates hCalendar - based on iCAL dtstart dtend location url description summary
reviews hReview item reviewer rating description summary photo
relationships XFN contact acquaintance met co-worker friend colleague neighbor child parent sweetheart crush me
many more licenses tags date-based feeds directories products payments geolocation more
remember this? <span class="name">Drew</span> avoid dark data avoid specific data for any consumer make it easy to be truthful embrace existing idioms reuse existing technology need to embrace reusable patterns
Brian Cant never knew this but I bet he’d be thrilled.
microformats are good a humane method for using metadata on the web easy for us to implement readable by our robotic friends
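As an illustration of "readable by our robotic friends", here is a simplified sketch that pulls the hCard properties out of the Steve Jobs paragraph from the earlier slide, using Python's standard `html.parser`. The `HCardReader` class and `read_hcard` helper are hypothetical, and the property set is limited to the three class names used on that slide; real microformats parsers follow many more rules than this.

```python
from html.parser import HTMLParser

# Only the hCard properties that appear in the slide's example.
HCARD_PROPERTIES = {"fn", "org", "role"}

SNIPPET = (
    '<p class="vcard">The announcement followed calls by '
    '<span class="org">Apple</span> '
    '<span class="role">Chief Executive</span> '
    '<span class="fn">Steve Jobs</span> earlier this year...</p>'
)


class HCardReader(HTMLParser):
    """Reads a few hCard properties from one vcard element (simplified sketch)."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._depth = 0          # nesting depth inside the vcard; 0 means outside
        self._current = None     # property whose text is being collected
        self._current_depth = 0  # depth at which that property's element opened

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1
            if self._current is None:
                for name in classes:
                    if name in HCARD_PROPERTIES:
                        self._current = name
                        self._current_depth = self._depth
                        self.card[name] = ""
                        break
        elif "vcard" in classes:
            self._depth = 1

    def handle_endtag(self, tag):
        if not self._depth:
            return
        if self._current is not None and self._depth == self._current_depth:
            self._current = None
        self._depth -= 1

    def handle_data(self, data):
        if self._current is not None:
            self.card[self._current] += data


def read_hcard(html):
    """Return the hCard properties found in an HTML string."""
    reader = HCardReader()
    reader.feed(html)
    reader.close()
    return reader.card


print(read_hcard(SNIPPET))
```

The reader depends only on the agreed class names, so the same sentence stays readable to humans while becoming a contact record to a robot. This sketch assumes well-nested markup with explicit closing tags; void elements such as `<br>` inside the card would throw off its depth counting.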
hCard
hCalendar
For robot masters http://microformats.org/wiki/parsers http://tools.microformatic.com/
For humans http://microformats.org/
For humans http://oreilly.com/
For humans http://microformatique.com/book/
So that’s What Brian Cant Never Taught You About Metadata. Thank you. http://allinthehead.com/presentations
Creative Commons photos used: http://flickr.com/photos/gperez/4393118/ http://flickr.com/photos/warmnfuzzy/466382466/ http://flickr.com/photos/stevegarfield/194648339/ http://flickr.com/photos/donsolo/2385041554/
Or "Everything You Know About Metadata is Wrong".
Or "How I Learned to Stop Worrying and Love The Data".
It's a presentation about metadata, HTML, robots, 1970s/80s children’s television programming, tofu, truth, honesty, and some made up rules stated as absolutes.
The following resources were mentioned during the presentation or are useful additional information.
An audio recording and transcript of the presentation.
Here’s what was said about this presentation on social media.
humgover from gitp. fab night so huge thanks to @trovster for all his hard work and @hicksdesign and @drewm for a pair of fab talks
— Cole Henley (@cole007) August 10, 2008