W3C Annotations meeting

Randall Leeds:

Annotations are additions - links, block quotes, inline - microformats and microdata; out of band webmention + pingback

we want annotations to be decentralized, so they need to interoperate with a data model and component models

We are going to need syndication and discovery for annotations

we want a more read/write web - we deserve the freedom to reference, not overload URI syntax forever

how do we go from out of band to inline? How do you show a remote annotation in the text you're reading

existing event targets are DOM elements, but we want to target the data - the text of the page. Idea: event.dataTarget

how about a selection pseudo element - can we describe a selection, style it and not modify the DOM - like a:visited

we need feed discovery - Activity Streams. Also we need to publish locally and use WebMention http://indiewebcamp.com/webmention
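A minimal sketch of the WebMention notification mentioned above, assuming the receiver's endpoint has already been discovered. A WebMention is a form-encoded POST carrying a `source` (the page that holds the annotation) and a `target` (the page being annotated). The endpoint and page URLs here are hypothetical, and the request is only built, not sent.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request


def build_webmention(endpoint: str, source: str, target: str) -> Request:
    """Build (but do not send) a WebMention notification request."""
    body = urlencode({"source": source, "target": target}).encode("ascii")
    return Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# All three URLs are hypothetical placeholders.
req = build_webmention(
    "https://example.org/webmention",
    "https://notes.example.com/annotation/1",
    "https://example.org/article",
)
print(req.get_method(), req.full_url)
print(parse_qs(req.data.decode("ascii")))
```

Sending it would be a call to `urllib.request.urlopen(req)`; the receiver is then expected to fetch the source and check that it really links to the target.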

Chris Gallello:

I'm PM on Office Online, working on accessibility - we have annotations in Office and visual studio already

red underlines in Word and breakpoints in Visual Studio are annotations from our point of view [machine generated]

we've been thinking about how to connect screen readers to annotations - to know there is one, to jump in and out

we can use an aria.annotationtype of "comment" and point to the element with the comment with aria.annotatedby

Anna Gerber:

Annotation is used throughout the research process, and for teaching not just as part of publications

researchers need to annotate maps, 3D spaces, protein models, sensor data streams, textual variants

scholarly annotation requirements: citable; precise; of segments/regions; many data types; within dynamic web apps

we need to migrate annotations across multiple copies of identical resources, or modified ones

stand-off annotation is needed to maintain integrity of original resources

we use the Annotea, OA model so we have a shared generic backend, but need protocols and APIs too

the OA model is extremely flexible, but difficult to implement as the queries are hard to write

Sean Boisen:

Logos is a digital library for biblical studies, with very rich annotation and cross referencing tools

our app is not out on the web, but a desktop app you download.

Bible has book/chapter/verse and we add word, but they are different between some bibles

we care about cross-lingual word alignment for annotations, which requires extra standardizations

Nick Stenning:

data models are simple; protocols harder, but user interactions are hard enough that it'd take us years to agree

a bookmarklet is basically a cross-site scripting attack. A standard known as content security policy breaks them

Browser extensions are not standardised - every user agent has its own extension model

Does anyone know of proposals to allow user-trusted code to run in the DOM in a standard way? Nope

we either need to spec ALL the things, or build for pluralism so there are many ways to do it

James Williamson:

As a publisher, having notes you take on Kindle not translating to iBooks is really annoying

we get complaints from readers all the time about the "1000 people marked this page" stuff in Kindle

we do pop-ups of footnotes and publication history in our online journals

we have third party annotations from reference management and social media crawling

the only way to annotate our works is with the "contact us" button that emails the publisher - can take 2 years

we have multiple different annotations at different levels of the publishing process paper/Word/PDF

we have extensive annotations in our new journals online product, and for assignments and quizzes

what happens to annotations when content has been deleted? Or out of print? [odd concept for web]

Frederick Hirsch:

people annotate books but also movies and sound too where you have to point at timestamps
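One way to point at timestamps is a temporal fragment in the style of W3C Media Fragments (`#t=start,end`, in seconds). The sketch below assumes that convention; the helper names and the media URL are hypothetical.

```python
from urllib.parse import urldefrag, parse_qs


def time_fragment(url: str, start: float, end: float) -> str:
    """Return url with a #t=start,end fragment (seconds)."""
    if not 0 <= start < end:
        raise ValueError("need 0 <= start < end")
    return f"{url}#t={start:g},{end:g}"


def parse_time_fragment(uri: str):
    """Split a URI with a #t=start,end fragment into (base, start, end)."""
    base, fragment = urldefrag(uri)
    start, end = parse_qs(fragment)["t"][0].split(",")
    return base, float(start), float(end)


# Hypothetical media URL; the annotation targets seconds 83.5 to 90.
target = time_fragment("https://example.org/lecture.mp4", 83.5, 90)
print(target)
print(parse_time_fragment(target))
```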

teachers comment on student assignments and students can respond inline

provenance of annotations is important - this brings in identity issues. Iterating them is also hard

I don't think we should rework the Open Annotation data model, but we need RESTful search and JSON-LD output

Anna Gerber:

we have covered the data model but not the protocol.

Timothy Cole:

for scholarly text we need individual words and phrases as anchors, even when adjacent content is updated

correction of OCR and manual transcriptions, and of automated part-of-speech tagging

we need proposed corrections to be able to be reviewed and annotated themselves

for example we annotate the scan with the original OCR and the OCR with the corrections

Eric Aubourg:

STM is a small publisher with complex texts including Arabic, Hebrew and hieroglyphics

our policy is ePub first, no DRM, one purchase for all formats

we have ePub with interactive maps of the Karnak temple that link to images of the wall paintings

referencing other works is required for scholarly publications, but page number doesn't work across editions

for epub we need something user-readable and reader-processable - people like "page 23" not ids

we need it to be independent of paper, pdf, epub that can survive reflowing, and human readable

in epub you can mark the page boundaries of the original document in the HTML but it doesn't go the other way

numbered paragraphs within chapters can work for finished works - easy to quote

we need readable shortcuts, not 64-character hashes - like link shorteners

we want the target reference to be human processable

Kevin Marks:

q: isn't a quotation from the work the most robust reference across paper + edocs? 10 words are unique?

Eric Aubourg:

yes, quotation is good and robust, and human readable, but it can be a bit long
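The trade-off in this exchange (a quotation is robust but can be long) can be explored with a small sketch that finds the shortest run of words, starting at a given position, that occurs exactly once in the work. The function name and the 10-word cap are assumptions for illustration, not anything proposed at the meeting.

```python
def shortest_unique_quote(words, start, max_words=10):
    """Shortest phrase beginning at words[start] that occurs exactly once.

    Returns None if no phrase of up to max_words words is unique.
    """
    for n in range(1, max_words + 1):
        phrase = words[start:start + n]
        if len(phrase) < n:
            break  # ran off the end of the text
        count = sum(
            1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase
        )
        if count == 1:
            return " ".join(phrase)
    return None


words = "the cat sat on the mat and the cat ran".split()
print(shortest_unique_quote(words, 0))
print(shortest_unique_quote(words, 4))
```

On real texts the quote usually stays well under the cap, which is what makes it usable as a human-readable reference.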

Fred Chasen:

the interactions in the various different document viewers are cumbersome and inconsistent

creating the notes content first and then anchoring to the document makes more sense

robust note authoring shouldn't block what you are annotating. It should let you re-anchor after composition

we need to account for longer notes, that may be longer than the entire text

also define print styles for CSS so that users can print out the whole thing

Kristof Csillag:

at Hypothes.is we're working on an annotation system for the web and we have proposed solutions

annotating web documents is good, but we added PDF, Epub and would like to add Scribd and Google Docs too

we define a target with generic selectors xpath, but also text position and text content selectors

by having multiple selectors we can use fuzzy matching to find the parts we want
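A rough sketch of how multiple selectors can back each other up, assuming a target that carries both a text position (`start`, `end`) and the quoted text (`exact`). The field names, window and threshold are assumptions, and the fuzzy step uses Python's `difflib` as a stand-in; it is not Hypothes.is's actual matching code.

```python
from difflib import SequenceMatcher


def anchor(text, selector, window=200, threshold=0.7):
    """Return (start, end) of the target in text, or None if it is orphaned."""
    quote = selector["exact"]
    start, end = selector["start"], selector["end"]

    # 1. Position selector: works only while the document is unchanged.
    if text[start:end] == quote:
        return start, end

    # 2. Quote selector: exact search, preferring the hit nearest the old position.
    hits = []
    i = text.find(quote)
    while i != -1:
        hits.append(i)
        i = text.find(quote, i + 1)
    if hits:
        best = min(hits, key=lambda h: abs(h - start))
        return best, best + len(quote)

    # 3. Fuzzy fallback: most similar same-length span near the old position,
    #    with ties broken in favour of the span closest to the old start.
    lo = max(0, start - window)
    hi = min(len(text), end + window)
    best_key, best_pos = None, None
    for pos in range(lo, max(lo, hi - len(quote)) + 1):
        ratio = SequenceMatcher(None, quote, text[pos:pos + len(quote)]).ratio()
        key = (ratio, -abs(pos - start))
        if best_key is None or key > best_key:
            best_key, best_pos = key, pos
    if best_key is not None and best_key[0] >= threshold:
        return best_pos, best_pos + len(quote)
    return None  # orphaned annotation


original = "Annotations are a kind of advanced linking on the web."
quote = "advanced linking"
selector = {"exact": quote, "start": original.index(quote)}
selector["end"] = selector["start"] + len(quote)

print(anchor(original, selector))
print(anchor("Some say that " + original, selector))
print(anchor(original.replace("linking", "linkage"), selector))
```

When every step fails the function returns `None`, which corresponds to the orphaned case raised in the discussion.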

there is a problem with dynamic sites, we need dynamic anchoring - comments can be orphaned

knowing what is actually the target across documents can be very hard

Kevin Marks:

if you remove annotations when the document referenced is edited can't unfavourable ones be removed?

Kristof Csillag:

yes, but we can keep orphaned annotations and possibly keep old versions

Anna Gerber:

how do you cope with copyright issues of keeping quotations?

Kristof Csillag:

right now we don't care about copyright - we are focused on making the annotations robust

Nick Stenning:

there are difficult problems around how you display disappeared content to the user, and how to reanchor

Tantek Çelik:

the anchoring problem is important - cool URLs don't change

question the assumption of annotation providers, how about self-hosted annotation, anchor and context too

Kevin Marks:

we should standardize multiple anchor formats - link, cite, text, image, audio snippet rather than how to be fuzzy

Anna Gerber:

when thinking about anchoring, focus on the content, not just the document format

Robert Casties:

At the Max Planck Institute for the History of Science, we have historical sources in many forms

we want to weave a web of knowledge as Jürgen Renn says

when you can collect all of the annotations on a source you get a semantic network of the source

we want to annotate images in a resolution independent way, we also want to show relationships and provenance

we want more complex co-ordinates and representations [why not use http://dev.w3.org/html5/spec-preview/image-maps.html ?]

we could use GeoJSON to point to images - how would we add this as selectors?
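A minimal sketch of one resolution-independent approach, assuming the region is stored as fractions of the image width and height rather than as pixels. This is a plain rectangle, not the GeoJSON idea raised above, and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelativeRegion:
    """Rectangle stored as fractions (0..1) of the image's width and height."""
    x: float
    y: float
    w: float
    h: float

    def to_pixels(self, width: int, height: int):
        """Project the region onto a rendition of the given pixel size."""
        return (
            round(self.x * width),
            round(self.y * height),
            round(self.w * width),
            round(self.h * height),
        )


region = RelativeRegion(x=0.25, y=0.5, w=0.5, h=0.25)
print(region.to_pixels(800, 600))    # small rendition
print(region.to_pixels(1600, 1200))  # same region on a larger scan
```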

Raquel Alegre:

at University of Reading, I work on CHARMe which annotates climate datasets

Scientists get huge numbers of options for data from data providers, but the annotations aren't linked

A climate dataset may be a table, a time series, a map, a 3d model or an animation

climate data users research timing, specific areas of the world, and comparing datasets

climate data comes in 2d, 3d and 4d formats - layers of images at different res or sensors

we need ways to point to space and time - we have geolocation and time units for most of this

Gregg Kellogg:

how can I use an API without coding for it? Define operations on classes and properties

annotations are the results of operations acting on entities

Jason Haag:

I'm from the IEEE Learning Technology Standards Committee and we are working on storage APIs

our Tin Can API is based on activitystrea.ms - Actor, Verb, Object

the xAPI records learning experiences using activities
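For illustration, an Actor, Verb, Object record in the general shape of an xAPI statement might look like the sketch below. The verb IRI, email address and object URL are hypothetical placeholders, and the helper function is not part of any xAPI library.

```python
import json


def make_statement(actor_email, actor_name, verb_id, verb_label, object_id):
    """Build an Actor/Verb/Object record shaped like an xAPI statement."""
    return {
        "actor": {"mbox": f"mailto:{actor_email}", "name": actor_name},
        "verb": {"id": verb_id, "display": {"en-US": verb_label}},
        "object": {"id": object_id},
    }


# All identifiers below are hypothetical.
statement = make_statement(
    "learner@example.org",
    "Example Learner",
    "http://example.org/verbs/annotated",
    "annotated",
    "http://example.org/books/manual.epub#chapter-3",
)
print(json.dumps(statement, indent=2))
```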

IEEE wants to use EPUB3 as sustainable format for technical content, with action tracking by xAPI

Jake Hartnell:

I'm a science fiction writer - here's a shameless plug for my book, a 23rd century romance

in 2018, web annotation will be implemented in the browser - we can refer to anything

the annotation document needs to be stored somewhere - think of it as channels

the browser queries all the channels the user subscribes to, kinda like rss feeds and they load in a sidebar

Annotation is a kind of advanced linking

the browser should provide a space for these attached documents to live and be viewed

Gerardo Capiel:

annotation is a powerful tool for accessibility of non-textual content when authors forget to put it in

images tend to lack proper descriptions and mathematics is often done as images, not MathML

Video description has even less support

today, Blind and vision impaired students get support by others annotating video and images

we need unified standards for annotation so that the efforts of people who do accessible annotation can spread

Puneet Kishor:

Copyright is a rat's nest. I'm not a lawyer, I want to avoid unleashing the rats. I work at Creative Commons

our job at Creative Commons is to keep this unfettered by the law

annotations may not start out with enough original material to be copyrightable, but could grow into cliffs notes

every annotation should carry the information with it to determine legal status [presumably cc license]

people can assert what they want in the way of attribution and commercial use with CC license

our latest version of Creative Commons, CC4, will cover database licences too. Should be stable to use now

Creative Commons don't restrict people, they enable people. That's always the goal.

you can only licence what you create, not someone else's stuff. The snippets [anchors] should be fair use

copyright attaches to original authorship fixed in a tangible medium - CC lets you disclaim

Tantek Çelik:

both APA and MLA citation styles for tweets include the entire text of the tweet - they don't mention licensing

webmentions