Can Your Website Be Your API?

A presentation at Web Standards Group London Meetup in October 2006 in London, UK by Drew McLellan

Slide 1

Can your website be your API? Drew McLellan Web Standards Group Microformats, UOW, DECSE Hello!

  • allinthehead.com, microformatic.com
  • strategy lead for the Web Standards Project
  • also work at Yahoo! I’m not speaking for Yahoo! today.

Slide 2

Can your website be your API?

  • attention grabbing headline that needs some unpacking

Less sensationally, it could be asked like this...

Slide 3

Could my website be an API? Or perhaps more fully ...

Slide 4

Can I add enough semantic information to the pages I already publish so that they could replace the function of a dedicated API? Whichever way you phrase it, the answer is the same...

Slide 5

No

Slide 6

No No but yeah, but no, but yeah.

Slide 7

Brain > Code We've been hearing a lot about microformats and how they enable us to mark up data in a meaningful way. We do this so that the information becomes useful not just to complex parsers - like the human brain - but to relatively simple parsers, such as software. Software, of course, is derived from a very small subset of the human brain, and therefore can't currently encompass the extreme complexity of the rules and processes the human brain goes through when trying to make sense of something such as an arbitrary list of random words. Spotting a likely name in a list of random words, for example, is quite easy for us, but it is far more complex to identify all the rules involved and codify them.

Slide 8

Fork handles? An interesting example of such a situation is with telephone technologies, and in particular speech recognition. There have been all sorts of efforts in the past to get speech recognition working well across phone systems. I imagine most of us have at some point played around with speech recognition software of some kind. Maybe that's PC dictation software, or maybe just an annoying IVR system. They don't tend to be so great at recognising anything but the most straightforward patterns of speech. The human brain, as a complex parser, takes very little time to adjust to different voices - even the thickest of accents are typically understandable within a few moments of listening to a person's voice. Speech recognition software - until very very recently - has found it hard to get any kind of acceptable accuracy levels even with RP. Mix in the poor audio quality of a phone signal, and this becomes an even tougher task.

Slide 9

File > Open A few years back I was at college with a guy who'd been helping his housemate install and configure some speech recognition software on his housemate's PC. My friend spent quite a bit of time fine tuning the program whilst his housemate was out at lectures and managed to get it working with a pretty good level of accuracy. So you can imagine my friend's surprise when his housemate reported that he was getting absolutely nowhere with the software, despite persistent efforts. So my friend tried it himself - "File, Open" - and it worked just fine. The housemate tried again - "File, Open" - and nothing happened. Thinking for a brief moment, the housemate tried again in my friend's Geordie accent, and lo and behold, it worked.

Slide 10

( shameless ) So whilst it's usually possible to get close to the meaning of things by having a machine attempt to process the material the human brain works with directly, it's often far more effective, efficient and reliable to provide dedicated information that a machine can understand.

Slide 11

For telephone systems, this resulted in the fax machine. A fax machine works by turning data signals into audio signals which can be sent over a regular audio phone line and turned back into data by the fax machine at the other end. Rather than trying to have machines understand a human voice, the machines only need to understand other machines, over the same type of line.

Slide 12

beep Fax machines are the APIs of the telephone world. They're a provision for machine to machine communication in and out of your company or organisation. This has big benefits, as larger amounts of data can be transferred much quicker and with greater accuracy than if there was a manual process involved.

Slide 13

Tel: +44 (0) 1234 432 432 Fax: +44 (0) 1234 432 433 They're not without their downsides though. Fax machines are hateful, evil machines that will eat your soul and then spit it out as an ad for a cheap car purchase plan. Once a company decides to have a fax machine and enable customers to communicate with them via fax, the company has to start publishing a different phone number for faxes. This means that customers then have to go through the extra cognitive step of figuring out which damn number to call. Doesn't sound like a particularly heavy cognitive process, does it? Or so you'd think unless you've ever had a desk in an office where you're sat near the incoming fax machine.

Slide 14

So in the same way, your web app API is a way for other machines to talk to the machines that run your application, but as a distinct second route in to that application, a route dedicated for machines rather than humans. And like the fax machine, this is not without its costs. Typically, developing an API is in addition to the basic work needed to have your application functional. Whilst if you're clever, parts of your application such as JavaScript code running on the client can make use of that API (and so make good use of the work required to build it - Flickr is a good example), it's likely that most of this work will be above and beyond what's required for your version 1.0.

Slide 15

Mmm APIs! But this is not an argument against APIs - far from it. An API is increasingly an essential component for any modern web application. This is an argument to provide that same functionality in a more efficient way. An argument to reuse existing technology and resources in place of building new and spending more. This is an argument to make your public facing website your API, and to do so through intelligent use of HTML-embedded data mechanisms like microformats.

Slide 16

Bow-wow Imagine for a moment that fax machines could operate at the sort of high pitches that dog whistles create. Whilst the screeches and beeps would be inaudible to the human ear, another fax machine could be tuned to pick them up. The result being that the audio band to which the human ear is sensitive would remain clear, enabling a normal conversation to take place at the same time on the same line. You could make a reservation AND confirm it in writing during the space of the same phone call. Of course that's pretty daft, not least because a phone line doesn't have enough fidelity to achieve that - it can only just cope with the human voice. A web page is similar. It has higher fidelity than a phone line, as we can include both the human-readable information and basic semantics for machines. That fidelity can be vastly increased, however, by adding microformats to the markup. All of a sudden the information we're communicating for humans can be encoded to be understandable - in detail - by machines.

Slide 17

yay. This means that the same page that serves data to your human visitors can serve data to your robotic friends too. What's more, it's the exact same data representation, and not a secondary view into it to develop and maintain. Of course I'm talking about how this can be achieved with microformats, but how does this work in practice? Let's take an example from your favourite social events aggregator and mine, Upcoming.org.

Slide 18

Let's take a simple example of the detail page for an event - like today's WSG meeting. If you look at the URI of the page, it tells us a lot of information.

Slide 19

http://upcoming.org/event/105545 We have the domain name - upcoming.org - followed by the type of object we're asking for - an event - followed by the unique identifier for that event. The corresponding API call for the Upcoming.org API is event.getInfo.

Slide 20

event.getInfo The full HTTP GET for a request against the event.getInfo method looks like this:

Slide 21

http://upcoming.org/services/rest/?api_key=<API Key>&method=event.getInfo&event_id=105545

Noticing any parallels? What we have here is essentially the same data request formatted in two different ways, and resulting in two different representations of the same data. The people of the internets call this a waste. So what if the same resource could be accessed and understood by both regular visitors and machines... One URL, accessing One resource, readable by BOTH types of visitor. Of course, our friends at Upcoming.org are friendly web citizens as well as smart, and have already added the hCalendar microformat to this page.
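Before looking at that markup, the parallel between the two requests can be made concrete. This is only a rough sketch: the function names are mine, and the API key is a placeholder. It shows that both URLs are built from nothing more than the object type and the event identifier.

```python
from urllib.parse import urlencode

EVENT_ID = "105545"


def page_url(event_id: str) -> str:
    """URL of the human-readable event page."""
    return f"http://upcoming.org/event/{event_id}"


def api_url(event_id: str, api_key: str) -> str:
    """URL of the equivalent event.getInfo REST request."""
    query = urlencode({
        "api_key": api_key,          # placeholder, not a real key
        "method": "event.getInfo",
        "event_id": event_id,
    })
    return f"http://upcoming.org/services/rest/?{query}"


print(page_url(EVENT_ID))
print(api_url(EVENT_ID, "YOUR_API_KEY"))
```

Same identifier in, two different addresses out, for what is the same underlying resource.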

Slide 22

<div id="event" class="vevent">
  <h1 class="name summary">WSG Meetup: Microformats</h1>
  <div id="eventMain">
    <div id="eventMetadata">
      <div class="small">When</div>
      <div class="date">
        <abbr class="dtstart" title="2006-10-19T18:30-07:00">Thursday, October 19, 2006</abbr>
      </div> <!-- /.date -->
      <div class="time">6:30 PM - 11:00 PM</div>
      <div class="venue location vcard">
        <br /><div class="small">Where</div>
        <span class="fn org"><a href="/venue/33942">New Cavendish Street campus of Westminster University</a></span><br />
        <div class="address adr">
          <span class="street-address">115 New Cavendish Street</span><br />
          <span class="locality">London</span>,
          <span

If we look at the source, we can see various familiar class names and code fragments that give us all the clues we need to identify an hCalendar event.

Slide 23

‘Tails’ for Firefox If we run this page through a microformats parser like Tails for Firefox, we can see it finds the data. Similarly, X2V will happily convert this page into an iCal file. If I'd written the hCalendar module for hKit yet, I'd be able to show you that too. It's in the lower-level parsers (the sort that talk between a page and the code you write rather than the page and a user) that the really useful power lies.
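To give a feel for what one of those lower-level parsers does, here is a minimal sketch using only Python's standard library. It is not how Tails, X2V or hKit work internally. The class name, the handful of properties it looks for, and the trimmed sample markup are all my own assumptions, and a real hCalendar parser handles far more of the spec.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}
TEXT_PROPS = {"summary", "fn", "street-address", "locality"}


class HCalendarSketch(HTMLParser):
    """Collects a few hCalendar/hCard properties by class name."""

    def __init__(self):
        super().__init__()
        self.event = {}
        self._stack = []  # one entry per open element: the properties it carries

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements never get a closing tag
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "abbr" and "dtstart" in classes:
            # the machine-readable date lives in the title attribute
            self.event["dtstart"] = (attrs.get("title") or "").strip()
        props = [c for c in classes if c in TEXT_PROPS]
        for prop in props:
            self.event.setdefault(prop, "")
        self._stack.append(props)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # text belongs to every property-bearing element currently open
        for props in self._stack:
            for prop in props:
                self.event[prop] += data


def parse_event(html):
    parser = HCalendarSketch()
    parser.feed(html)
    parser.close()
    # collapse runs of whitespace left over from the page layout
    return {key: " ".join(value.split()) for key, value in parser.event.items()}


SAMPLE = """
<div class="vevent">
  <h1 class="name summary">WSG Meetup: Microformats</h1>
  <abbr class="dtstart" title="2006-10-19T18:30-07:00">Thursday, October 19, 2006</abbr>
  <div class="venue location vcard">
    <span class="fn org"><a href="/venue/33942">New Cavendish Street campus of Westminster University</a></span><br />
    <div class="address adr">
      <span class="street-address">115 New Cavendish Street</span><br />
      <span class="locality">London</span>
    </div>
  </div>
</div>
"""

print(parse_event(SAMPLE))
```

The point is that the class names alone are enough for code to find the event name, start time and venue in the page a human reads.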

Slide 24

<?xml version="1.0" encoding="UTF-8"?>
<rsp stat="ok" version="1.0">
  <event id="105545"
    name="WSG Meetup: Microformats"
    tags="microformats, web standards group, standards"
    description="This month is the WSG meetup is going to be all about Microformats and we have three speakers... <snip />"
    start_date="2006-10-19" end_date=""
    start_time="18:30:00" end_time="23:00:00"
    personal="0" selfpromotion="0" metro_id="49"
    venue_id="33942" user_id="73013" category_id="5"
    url="http://muffinresearch.co.uk/wsg/"
    date_posted="2006-09-07" latitude="" longitude=""
    geocoding_precision="" geocoding_ambiguous="" />
</rsp>

Comparing that to the output of the event.getInfo API call, we can see that we've successfully extracted pretty much the same data. This renders event.getInfo redundant.
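That comparison can be done in code as well as by eye. The following is a sketch under a few assumptions: the XML is a trimmed copy of the response above, the page data is what a microformats parser might plausibly hand back for the same event, and the function names are mine.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the event.getInfo response (declaration and most attributes omitted)
API_RESPONSE = """<rsp stat="ok" version="1.0">
  <event id="105545" name="WSG Meetup: Microformats"
         start_date="2006-10-19" start_time="18:30:00"
         end_time="23:00:00" venue_id="33942" />
</rsp>"""

# Assumed output of a microformats parser run over the event page
PAGE_DATA = {
    "summary": "WSG Meetup: Microformats",
    "dtstart": "2006-10-19T18:30-07:00",
}


def api_event(xml_text):
    """Return the attributes of the <event> element as a dict."""
    root = ET.fromstring(xml_text)
    return dict(root.find("event").attrib)


def same_event(api, page):
    """True when name and start time agree between the two sources."""
    api_start = f'{api["start_date"]}T{api["start_time"][:5]}'
    return api["name"] == page["summary"] and page["dtstart"].startswith(api_start)


print(same_event(api_event(API_RESPONSE), PAGE_DATA))
```

The API spreads the start over start_date and start_time, where hCalendar carries it in a single dtstart value, so the comparison has to stitch the two together first.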

Slide 25

event.getInfo event.getInfo is no longer needed, as the same data can be extracted from the page directly. So that's just a very simple example. Unless you've been asleep during every Web 2.0 presentation for the last year, you'll be aware that Flickr has a very powerful and complete API. Surely nothing we can do with microformats could touch the power and might of the Flickr API?

Slide 26

Let's look at the URI for my profile page: http://flickr.com/people/drewm/

Slide 27

http://flickr.com/people/drewm Just like our example from Upcoming.org, we have the name of the service - flickr.com - the type of object we'd like to retrieve - a person - and the unique identifier for the person in question - the username.

Slide 28

flickr.people.getInfo The equivalent request from the Flickr API is flickr.people.getInfo.

Slide 29

http://api.flickr.com/services/rest/?api_key=<API Key>&method=flickr.people.getInfo&user_id=87703047@N00

Except you'll notice that the user_id I've provided here isn't my username. That's because behind every friendly user identifier in the application, Flickr has a nasty numerical identifier which is guaranteed to be unique.

Slide 30

flickr.people.findByUsername Therefore, before I can make a call to flickr.people.getInfo, I need to find the user_id by way of a call to flickr.people.findByUsername.

Slide 31

http://api.flickr.com/services/rest/?api_key=<API Key>&method=flickr.people.findByUsername&username=drewm

If we run our parsers over this, just as with the Upcoming.org events page...

Slide 32

Tails we can see Tails finds the data

Slide 33

BEGIN:VCARD
PRODID:-//suda.co.uk//X2V 0.8 (BETA)//EN
SOURCE:http://flickr.com/people/drewm
NAME:Flickr: drewm
VERSION:3.0
N;CHARSET=UTF-8:McLellan;Drew;;;
FN;CHARSET=UTF-8:Drew McLellan
TITLE;CHARSET=UTF-8:Web Application Developer
NICKNAME;CHARSET=UTF-8:drewm
ADR;CHARSET=UTF-8:;;;Maidenhead;;;United Kingdom
LOGO;VALUE=uri:http://static.flickr.com/14/buddyicons/87703047@N00.jpg?1147807052
URL:http://www.allinthehead.com/
END:VCARD

X2V X2V finds the data...

Slide 34

[0] => Array
(
    [fn] => Drew McLellan
    [n] => Array
    (
        [given-name] => Drew
        [family-name] => McLellan
    )
    [adr] => Array
    (
        [country-name] => United Kingdom
        [locality] => Maidenhead
    )
    [nickname] => drewm
    [logo] => http://static.flickr.com/14/buddyicons/87703047@N00.jpg?1147807052
    [url] => http://www.allinthehead.com/
    [title] => Web Application Developer
)

hKit and hKit finds it too.

Slide 35

<?xml version="1.0" encoding="utf-8" ?>
<rsp stat="ok">
  <person id="87703047@N00" nsid="87703047@N00" isadmin="0" ispro="1" iconserver="14">
    <username>drewm</username>
    <realname>Drew McLellan</realname>
    <mbox_sha1sum>2201f242d415d2daca2faa7bfb6da27bd476ea6b</mbox_sha1sum>
    <location>Maidenhead, United Kingdom</location>
    <photosurl>http://www.flickr.com/photos/drewm/</photosurl>
    <profileurl>http://www.flickr.com/people/drewm/</profileurl>
    <mobileurl>http://www.flickr.com/mob/photostream.gne?id=199423</mobileurl>
    <photos>
      <firstdatetaken>2001-03-21 14:08:15</firstdatetaken>
      <firstdate>1118087290</firstdate>
      <count>772</count>
    </photos>
  </person>
</rsp>

Compare that output with the output from Flickr's own flickr.people.getInfo API call - the same stuff's there. (In fact, not quite as much info...)

Slide 36

flickr.people.getInfo Every user has a profile page, so this renders flickr.people.getInfo redundant.
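It's worth spelling out the difference in round trips. The sketch below only builds URLs and makes no network calls: the user_id lookup is passed in as a stub, and the helper names and the API key are placeholders of mine rather than anything from Flickr's client libraries.

```python
from urllib.parse import urlencode

REST_ENDPOINT = "http://api.flickr.com/services/rest/"


def rest_url(method, api_key, **params):
    """Build a REST request URL for the given API method."""
    query = urlencode({"api_key": api_key, "method": method, **params})
    return f"{REST_ENDPOINT}?{query}"


def api_requests(username, lookup_user_id, api_key):
    """Return the two requests the API route needs, in order."""
    first = rest_url("flickr.people.findByUsername", api_key, username=username)
    user_id = lookup_user_id(first)  # stand-in for fetching and parsing the response
    second = rest_url("flickr.people.getInfo", api_key, user_id=user_id)
    return [first, second]


def profile_url(username):
    """The single page a person (or a microformats parser) would request."""
    return f"http://flickr.com/people/{username}/"


# Stubbed lookup: pretend the first response contained this user_id
for url in api_requests("drewm", lambda url: "87703047@N00", "YOUR_API_KEY"):
    print(url)
print(profile_url("drewm"))
```

Two requests and an API key on one side, one public URL on the other.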

Slide 37

So since we're working our way through Yahoo's family of Web 2.0 poster children, let's do the same with del.icio.us. The URI of my page listing bookmarks for the tag 'microformats' takes a slightly different, simpler form this time.

Slide 38

http://del.icio.us/drewm/microformats We have the service name - del.icio.us. As the service deals with bookmarks way more than anything else, that's the default item for retrieval, so we go right in with my username - drewm - and then the identifier for the data we want, which in this case is the tag 'microformats'.

Slide 39

o_O Looking for the output in Tails is disappointing, as del.icio.us doesn't have support for microformats yet. :-(

Slide 40

The good news is that Ma.gnolia does! So here's the same query from there.

Slide 41

http://ma.gnolia.com/people/drewm/tags/microformats

Slide 42

bookmarks_find The equivalent method from the Ma.gnolia API is bookmarks_find

Slide 43

http://ma.gnolia.com/api/rest/1/bookmarks_find?api_key=<API Key>&person=drewm&tags=microformats

There’s the request.

Slide 44

<?xml version="1.0" encoding="utf-8" ?>
<response status="ok" version="1">
  <bookmarks>
    <bookmark private="false" rating="0" updated="2006-10-16T13:14:47-07:00" id="volavufo" created="2006-10-16T13:14:45-07:00" owner="drewm">
      <title>microformats.org</title>
      <url>http://microformats.org/</url>
      <description></description>
      <screenshot>http://scst.srv.girafa.com/srv/i?i=sc010159&amp;r=microformats.org&amp;s=2347d22ba7d0ed72</screenshot>
      <tags>
        <tag name="microformats"/>
        <tag name="semantic web"/>
      </tags>
    </bookmark>
  </bookmarks>
</response>

And this is the output.

Slide 45

Tails Here it is in Tails, using the xFolk microformat. hKit has preliminary support for this too.
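As an illustration of how little is needed to read xFolk, here is a small standard-library sketch. The markup in it is invented for the example (I'm not reproducing Ma.gnolia's actual page source), and it only looks at the xfolkentry container, the taggedlink anchor and rel="tag" links.

```python
from html.parser import HTMLParser


class XFolkSketch(HTMLParser):
    """Collects bookmarks marked up with xFolk class names."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []
        self._capture = None  # "title" or "tag" while inside a relevant <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        rels = (attrs.get("rel") or "").split()
        if "xfolkentry" in classes:
            self.bookmarks.append({"url": None, "title": "", "tags": []})
        elif tag == "a" and self.bookmarks:
            entry = self.bookmarks[-1]
            if "taggedlink" in classes:
                entry["url"] = attrs.get("href")
                self._capture = "title"
            elif "tag" in rels:
                entry["tags"].append("")
                self._capture = "tag"

    def handle_endtag(self, tag):
        if tag == "a":
            self._capture = None

    def handle_data(self, data):
        if self._capture == "title":
            self.bookmarks[-1]["title"] += data
        elif self._capture == "tag":
            self.bookmarks[-1]["tags"][-1] += data


# Hypothetical markup, written for this example only
SAMPLE = """
<ul>
  <li class="xfolkentry">
    <a class="taggedlink" href="http://microformats.org/">microformats.org</a>
    <a rel="tag" href="/tags/microformats">microformats</a>
    <a rel="tag" href="/tags/semantic%20web">semantic web</a>
  </li>
</ul>
"""

parser = XFolkSketch()
parser.feed(SAMPLE)
print(parser.bookmarks)
```

The title, URL and tags it recovers line up with the title, url and tag elements in the bookmarks_find response.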

Slide 46

Similarly, say I wanted to get reviews for a particular wine from corkd.com.

Slide 47

http://corkd.com/wine/view/1122 Again we have the service name - corkd.com - the object we're looking for - wine - this time a sort of 'mode' indicator - view - and then the unique identifier for the wine. A human visitor gets the wine details and the reviews on that page. The equivalent API method for the corkd API is ...

Slide 48

o_O ... well, corkd has no separate API.

Slide 49

Fortunately, corkd is rich in microformats, including hReview. Running a parser over the wine detail page gives us all the data we need. Even without a formal API. Hah!

Slide 50

Tails That’s a whole list of reviews and ratings directly accessible from the page with no separate API. I asked Dan Cederholm - the front-end guy behind Cork’d - about this choice, and he said...

Slide 51

“I could tell you about Brian Suda emailing about some crazy XSLT/SPARQL stuff he did by scraping his drinking buddies, then running a search query and cross-referencing the hReviews with his XFN list. He was attempting to show how he could get a search result of “trusted” reviews all based on the microformats we've implemented. I didn't have a clue as to what he was talking about. :-)

Slide 52

But that's the beauty of it! Something I'm calling “oblivious development”. I've always looked at microformats as “planting seeds” that later grow into things you never even thought of. Microformats are so easy to sprinkle in, that as a designer I can plant the stuff that later someone like Brian Suda can do insane things with. I love that. I don't understand the stuff that Brian was doing - but I don't have to.” Dan Cederholm, Cork’d

Slide 53

This is a key difference between publishing an API and letting your data just speak for itself. When designing an API, you have to consider what data users might need, and to an extent how they’re going to use it. Those decisions have already been made for your website as part of your information architecture, and that data is already being published. The only effort involved is in more careful choice of markup when building the page.

Slide 54

read/write Of course, there’s more to APIs than just retrieving data. Whilst reads are the VAST majority of the traffic to most common APIs, a lot of the concept, at least, is about writing data back too. Fortunately, that’s the easy part of the problem. Data sent to a service is already structured, in a format specified by the provider. An HTTP POST already works in this way by its nature. If we were to extend the idea, we could specify a common format for POSTing events or contact details, and indeed there’s already work happening on that within the microformats community.
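To show why the write side is the easier half, here is a sketch of what POSTing an event could look like. Everything specific in it is an assumption: the endpoint is a made-up example.com address, and borrowing hCalendar property names as form field names is just one guess at what a common format might be, not an agreed convention. The request is built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_event_post(endpoint, event):
    """Build (but do not send) a form-encoded POST carrying event data."""
    body = urlencode(event).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_event_post(
    "http://example.com/events",  # hypothetical endpoint
    {
        # field names borrowed from hCalendar purely for illustration
        "summary": "WSG Meetup: Microformats",
        "dtstart": "2006-10-19T18:30",
        "location": "115 New Cavendish Street, London",
    },
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The body is already a flat list of named values, which is exactly the kind of structure the provider gets to dictate.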

Slide 55

Cook’d? So is what I’m suggesting ready for the prime time? Can you forget about building an API and rely on good semantics to get you through? For reads, yes, absolutely. Cork’d are doing it today. If you need to have a fully read/write environment, possibly not quite yet. It’s no but yeah but no.

Slide 56

drewmclellan.getInfo The real power is, however, perhaps in the places where we wouldn’t normally consider building an API. What if I wanted to make a request to drewmclellan.getInfo? Or drewmclellan.getEvents? There’s no such API, and I’d probably have to be very very sad to build one. (Hands up if you’ve done this!)

Slide 57

http://allinthehead.com/about In fact, this already exists. Because I already publish things about myself on my own site, by using microformats, there’s an API for me.

Slide 58

<a href="http://flickr.com/photos/drewm" rel="me">My photos</a>

drewmclellan.getPhotos

Slide 59

<a href="http://upcoming.org/user/38988" rel="me">My events</a>

drewmclellan.getEvents

Slide 60

<a href="http://corkd.com/people/drewm" rel="me">My wine journal</a>

drewmclellan.getWines
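Discovering those "methods" from a page is a small job for code. This is a minimal sketch, assuming an about page that contains the three rel="me" links shown on the last few slides. The class name and the sample markup are mine, and it ignores everything else a real XFN consumer would care about.

```python
from html.parser import HTMLParser


class RelMeSketch(HTMLParser):
    """Collects (href, link text) pairs for links carrying rel="me"."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._open = False  # True while inside a rel="me" anchor

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "me" in (attrs.get("rel") or "").split():
            self.links.append([attrs.get("href"), ""])
            self._open = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._open = False

    def handle_data(self, data):
        if self._open:
            self.links[-1][1] += data


# Assumed fragment of an about page, using the links from the slides
SAMPLE = """
<ul>
  <li><a href="http://flickr.com/photos/drewm" rel="me">My photos</a></li>
  <li><a href="http://upcoming.org/user/38988" rel="me">My events</a></li>
  <li><a href="http://corkd.com/people/drewm" rel="me">My wine journal</a></li>
  <li><a href="http://microformats.org/">Not me</a></li>
</ul>
"""

parser = RelMeSketch()
parser.feed(SAMPLE)
for href, text in parser.links:
    print(text, "->", href)
```

Each link it finds points at another page that can itself be parsed for photos, events or wines, which is what makes the personal "API" hang together.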

Slide 61

Can I add enough semantic information to the pages I already publish so that they could replace the function of a dedicated API?

Slide 62

Can I add enough semantic information to the pages I already publish so that I get an API thrown in for free?

Slide 63

Hell yeah. If you're a data publisher, all you need to do is put into practice some of the things Norm and Jeremy have been demonstrating tonight. Marking up your content with microformats is not only a good place to start, but the BEST place to start.

Slide 64

Can your website be your API?

Slide 65

Fo shizzle.

Slide 66

Thanks! http://allinthehead.com/presentations/2006/mf-api

Slide 67

Credits

The following Creative Commons licensed images were used in this presentation:

  • http://flickr.com/photos/adactio/169052553/
  • http://flickr.com/photos/tgraham/253500273/
  • http://flickr.com/photos/gabrielhl/76450732/
  • http://flickr.com/photos/mpdehaan/21006425/
  • http://flickr.com/photos/splorp/64027565/
  • http://flickr.com/photos/vampire_bear/15910260/
  • http://flickr.com/photos/agos/240924445/
  • http://flickr.com/photos/brook/65076098/
  • http://flickr.com/photos/shveckle/204895620/
  • http://flickr.com/photos/poagao/23805079/
  • http://flickr.com/photos/z1784/69981580/
  • http://flickr.com/photos/johnnyhuh/812894/
  • http://flickr.com/photos/gperez/4393118/
  • http://flickr.com/photos/isphoto/54113178/
  • http://flickr.com/photos/flashmaggie/6271604/
  • http://flickr.com/photos/rachelandrew/169006965/
  • http://flickr.com/photos/scatti_frullati/156505041/
  • http://flickr.com/photos/thedepartment/137413905/
  • http://flickr.com/photos/camera_rwanda/265802151/
  • http://flickr.com/photos/esther17/171786999/
  • http://flickr.com/photos/ianlloyd/264755178/