Upcoming U.S. and European events related to Akoma Ntoso

In my last blog post I covered the public review of the new proposed Akoma Ntoso (LegalDocML) standard for legal documents. Please keep the comments coming. In order to comment, please send email to legaldocml-comment@lists.oasis-open.org. If you wish to subscribe to this mailing list, please follow the instructions at https://www.oasis-open.org/committees/comments/index.php?wg_abbrev=legaldocml

In addition, there are three upcoming events related to Akoma Ntoso in which you may wish to participate (this list comes from Monica Palmirani, the chair of the OASIS LegalDocML technical committee):

1. Akoma Ntoso Summer School, 27-31 July, 2015, George Mason University, Fairfax, Virginia (USA): http://aknschool.cirsfid.unibo.it
Registration fee: http://aknschool.cirsfid.unibo.it/logistics/registrations-and-fees/
Application Form: http://aknschool.cirsfid.unibo.it/wp-content/uploads/2015/05/ApplicationForm.pdf
Brochure: http://aknschool.cirsfid.unibo.it/wp-content/uploads/2015/05/brochure_2015_US_DEF.pdf
Deadline: end of June, 2015.

2. IANC2015 (First International Akoma Ntoso Conference): August 1st, 2015, George Mason University, Fairfax, Virginia (USA)
Brochure: http://aknschool.cirsfid.unibo.it/wp-content/uploads/2015/05/AKN-CONFERENCE1.pdf
Call for contributions: http://www.akomantoso.org/akoma-ntoso-conference/call-for-contributions/
Deadline: June 19th, 2015.

3. Summer School LEX2015, 7-15 Sept. 2015, Ravenna, Italy: http://summerschoollex.cirsfid.unibo.it
Registration fee: http://summerschoollex.cirsfid.unibo.it/?page_id=66
Application Form: http://summerschoollex.cirsfid.unibo.it/wp-content/uploads/2010/04/ApplicationForm2.pdf
Brochure: http://summerschoollex.cirsfid.unibo.it/wp-content/uploads/2015/05/brochure_2015_LEX1.pdf
Deadline: July 15th, 2015.

I have been participating in the European LEX Summer School every year since 2010 and find it to be both inspirational and very valuable. If you’re interested in understanding where the legal informatics field is headed, I encourage you to find a way to attend any of these events. I will be speaking/teaching at all three events.

Akoma Ntoso (LegalDocML) is now available for public review

It’s been many years in the making, but the standardised version of Akoma Ntoso is now finally in public review. You can find the official announcement here. The public review started May 7th and will end on June 5th — which is quite a short time for something so complex.

I would like to encourage everyone to take part in this review process, as short as it is. It’s important that we get good coverage from around the world to ensure that any use cases we missed get due consideration. Instructions for how to comment can be found here.

Akoma Ntoso is a complex standard and it has many parts. If you’re new to Akoma Ntoso, you will probably find it quite overwhelming. To cut through that complexity, I’m going to give a bit of an overview of what the documentation covers and what to look for.

There are four primary documents:

  1. Akoma Ntoso Version 1.0 Part 1: XML Vocabulary — This document is the best place to start. It’s an overview of Akoma Ntoso and describes what all the pieces are and how they fit together.
  2. Akoma Ntoso Version 1.0 Part 2: Specifications — This is the reference material. When you want to know something specific about an Akoma Ntoso XML element or attribute, this is the document to go to. It contains very detailed information derived from the schema itself. Also included are the XML schema (or DTD, if you’re still inclined to use DTDs) and a good set of examples from around the world.
  3. Akoma Ntoso Naming Convention Version 1.0 — This document describes two very interrelated and important aspects of the proposed standard — how identifiers are assigned to elements and how IRI-based (or URI-based) references are formed. There is a lot of complexity in this topic and it was the subject of numerous meetings and an interesting debate at the Coco Loco restaurant in Ravenna, Italy, one evening while being eaten by mosquitoes.
  4. Akoma Ntoso Media Type Version 1.0 — This fourth document describes a proposed new media type that will be used when transmitting Akoma Ntoso documents.

This is a lot of information to read and digest in a very short amount of time. In my opinion, the best way to try and evaluate Akoma Ntoso’s applicability to your jurisdiction is as follows:

  • First, look at the basic set of tags used to define the document hierarchy. Is this set of tags adequate? Keep in mind that the terminology might not always perfectly align with your own. We had to find a neutral terminology that would allow us to define a super-set of the concepts found throughout the world. (A small illustrative fragment follows this list.)
  • If you do find that specific elements you need are missing, consider whether or not that concept is perhaps specific to your jurisdiction. If that is the case, take a look at the basic Akoma Ntoso building blocks that are provided. While we tried to provide a comprehensive set of elements and attributes, there are many situations which are simply too esoteric to justify the additional tag bloat in the basic standard. Can the building blocks be used to model those concepts?
  • Take a look at the identifiers and the referencing specification. These parts are intended to work together to allow you to identify and access any provision in an Akoma Ntoso document. Are all your possible needs met with this? Implicit in this design is a resolver architecture — a component that parses IRI references (think of them as URLs) and maps them to specific provisions. Is this approach workable?
  • Take a look at the basic metadata requirements. Akoma Ntoso has a sophisticated metadata methodology behind it and this involves quite a bit of indirection at times. Understand what the basic metadata needs are and how you would model your jurisdiction’s metadata using this.
  • Finally, if you have time, take a look at the more advanced aspects of Akoma Ntoso. Consider how information related to the document’s lifecycle and workflow might be modeled within the metadata. Consider your change management needs and whether or not the change management capabilities of Akoma Ntoso could be adapted to fit. If you work with complex composite documents, take a look at the mechanisms Akoma Ntoso provides to assemble composite documents.
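
To make the first bullet concrete, here is a minimal, illustrative fragment of an Akoma Ntoso document hierarchy with identifiers. The element names come from the proposed vocabulary, but the id values and the IRI reference below are my own inventions — consult Part 2 and the Naming Convention for the exact forms the proposed standard requires.

  <body>
    <section id="sec_1">
      <num>Section 1.</num>
      <heading>Definitions</heading>
      <subsection id="sec_1__subsec_a">
        <num>(a)</num>
        <content>
          <p>In this Act, “vehicle” means…</p>
        </content>
      </subsection>
    </section>
  </body>

An IRI-based reference to that subsection might then look something like /us/act/2015/123/main#sec_1__subsec_a — a resolver parses the reference and returns the provision it names.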

Yes, there is a lot to digest in just a few weeks. Please provide whatever feedback you can.

We’re also now in the planning stages for a US LEX Summer School. If you’ve followed my blog over the years, you’ll know that I am a huge fan of the LEX Summer School in Ravenna, Italy — I’ve been every year for the past five years. This year, Kirsten Gullikson and I convinced Monica and Fabio to bring the Summer School to Washington D.C. as well. The summer school will be held the last week of July 2015 at George Mason University. The class size will be limited to just 30, so be sure to register early once registration opens. If you want to hear me rattle on at length about this subject, this is the place to go — I’ll be one of the teachers. The Summer School will conclude with a one-day Akoma Ntoso Conference on Saturday. We’ll be looking for papers. I’ll send out a blog with additional information as soon as it’s finalized.

You may have noticed that I’ve been blogging a lot less lately. Well, that’s because I’ve been heads down for quite some time. We’ll soon be in a position to announce our first full Akoma Ntoso product. It’s an all-new web-based XML editor that builds on our experiences with the HTML5-based AKN/Editor (LegisPro Web) that we built before.

This editor is composed of four main parts.

  1. First, there is a full XML editing component that works with pure XML — allowing it to be quite scalable and very XML precise. It implements complex track changes capabilities along with full undo/redo. I’m quite thrilled with how it has turned out. I’ve battled for years with XMetaL’s limitations and this was my opportunity to properly engineer a modern XML editor.
  2. Second, there is a sophisticated resolver technology which acts as the middleware, implementing the URI scheme I mentioned earlier — and interfacing with local and remote document resources. All local document resources are managed within an eXist-db repository.
  3. Third, there is the Akoma Ntoso model. The XML editing component is quite schema/model independent. This allows it to be used with a wide variety of structured documents. The Akoma Ntoso model adapts the editor for use with Akoma Ntoso documents.
  4. And finally, there is a very componentised application which ties all the pieces together. This application is written as an AngularJS-based single page application (SPA). In an upcoming blog I’ll detail the trials and tribulations of learning AngularJS. While learning AngularJS has left me thinking I’m quite stupid at times, the goal has been to build an application that can easily be extended to fit a wide variety of structured editing needs. It’s important that all the pieces be defined as modules that can either be swapped out for bespoke implementations or complemented with additional capabilities.

Our current aim is to have the beta version of this new editor available in time for the Summer School and Akoma Ntoso conference — so I’ll be very heads down through most of the summer.

Achieving Five Star Open Data

A couple weeks ago, I was in Ravenna, Italy at the LEX Summer School and follow-on Developer’s Workshop. There, the topic of the semantic web came up a lot. Despite having cooled in the popular press in recent years, I’m still a big believer in the idea. The problem with the semantic web is that few people actually get it. At this point, it’s such an abstract idea that people invariably jump to the closest analog available today and mistake it for that.

Tim Berners-Lee (@timberners_lee), the inventor of the web and a big proponent of linked data, has suggested a five star deployment scheme for achieving open data — and what ultimately will be a semantic web. His chart can be thought of as a roadmap for how to get there.

Take a look at today’s Data.gov website. Everybody knows the problem with it — it’s a pretty wrapper around a dumping ground of open data. There are thousands and thousands of data sets available on a wide range of interesting topics. But, there is no unifying data model behind all these data dumps. Sometimes you’re directed to another pretty website that, while well-intentioned, hides the real information behind the decorations. Sometimes you can get a simple text file. If you’re lucky, you might even find the information in some more structured format such as a spreadsheet or XML file. Without any unifying model and with much of the data intended as downloads rather than as an information service, this is really still Tim’s first star of open data — even though some of the data is provided as spreadsheets or open data formats. It’s a good start, but there’s an awful long way to go.

So let’s imagine that a better solution is desired, providing information services, but keeping it all modest by using off-the-shelf technology that everyone is familiar with. Imagine that someone with the authority to do so, takes the initiative to mandate that henceforth, all government data will be produced as Excel spreadsheets. Every memo, report, regulation, piece of legislation, form that citizens fill out, and even the U.S. Code will be kept in Excel spreadsheets. Yes, you need to suspend disbelief to imagine this — the complications that would result would be incredibly tough to solve. But, imagine that all those hurdles were magically overcome.

What would it mean if all government information was stored as spreadsheets? What would be possible if all that information was available throughout the government in predictable and permanent locations? Let’s call the system that would result the Government Information Storehouse – a giant repository of government information regularized as Excel spreadsheets. (BTW, this would be the future of government publishing once paper and PDFs have become relics of the past.)

How would this information look? Think about a piece of legislation, for instance. Each section of the bill might be modeled as a single row in the spreadsheet. Every provision in that section would be its own spreadsheet cell (ignoring hierarchical considerations, etc.). Citations would turn into cell references or cell range references. Amending formulas, such as “Section 1234 of Title 10 is amended by…” could be expressed as a literal formula — a spreadsheet formula. It would refer to the specific cell in the appropriate U.S. Code Title and contain programmatic instructions for how to perform the amendment. In short, lots of once complex operations could be automated very efficiently and very precisely. Having the power to turn all government information into a giant spreadsheet has a certain appeal — even if it requires quite a stretch of the imagination.

Now imagine what it would mean if selected parts of this information were available to the public as these spreadsheets – in a regularized and permanent way — say Data.gov 2.0 or perhaps, more accurately, as Info.gov. Think of all the spreadsheet applications that would be built to tease out knowledge from the information that the government is providing through their information portal. Having the ability to programmatically monitor the government without having to resort to complex measures to extract the information would truly enable transparency.

At this point, the linkages and information services give us some of the attributes of Tim’s four and five star open data solutions, but our focus on spreadsheet technology has left us with a less than desirable two star system. Besides, we all know that having the government publish everything as Excel spreadsheets is absurd. Not everything fits conveniently into a spreadsheet table, to say nothing of the scalability problems that would result. I wouldn’t even want to try putting Title 42 of the U.S. Code into an Excel spreadsheet. So how do we really go about achieving this sort of open data and the efficiencies it enables — both inside and outside of government?

In order to realize true four and five star solutions, we need to quickly move on to fulfilling all the parts of Tim’s five star chart. In his chart, a three star solution replaces Excel spreadsheets with an open data format such as a comma separated file. I don’t actually care for this ordering because it sacrifices much to achieve the goal of having neutral file formats — so let’s move on to full four and five star solutions. To get there, we need to become proficient in the open standards that exist and we must strive to create ones where they’re missing. That’s why we work so hard on the OASIS efforts to develop Akoma Ntoso and citations into standards for legal documents. And when we start producing real information services, we must ensure that the linkages in the information (those links and formulas I wrote about earlier) exist to the best extent possible. It shouldn’t be up to the consumer to figure out how a provision in a bill relates to a line item in some budget somewhere else — that linkage should be established from the get-go.

We’re working on a number of core pieces of technology to enable this vision and get to full five star open data. We’re integrating XML repositories and SQL databases into our architectures to give us the information storehouse I mentioned earlier. We’re building resolver technology that allows us to create and manage permanent linkages. These linkages can be as simple as citation references or as complex as instructions to extract from or make modifications to other information sources. Think of our resolver technology as akin to the engine in Excel that handles cell or range references, arithmetic formulas, and database lookups. And finally, we’re building editors that will resemble word processors in usage, but will allow complex sets of information to be authored and later modified. These editors will have many of the sophisticated capabilities such as track changes that you might see in a modern word processor, but underneath you will find a complex structured model rather than the ad hoc data structures of a word processor.

Building truly open data is going to be a challenging but exciting journey. The solutions that are in place today are a very primitive first step. Many new standards and technologies still need to be developed. But, we’re well on our way.

2014 LEX Summer School & Developer’s Workshop

This week I attended the 2014 LEX Summer School and the follow-on Developer’s Workshop put on by the University of Bologna in Ravenna, Italy. This is the fifth year that I have participated and the third year that we have had the developer’s extension.

It’s always interesting to me to see how the summer school has evolved from the last year and who attends. As always, the primary participation comes from Europe – as one would expect. But this year’s participants also came from as far away as the U.S., Chile, Taiwan, and Kenya. The United States had a participant from the U.S. House of Representatives this year, aside from me. In past years, we have also had U.S. participation from the Library of Congress, LexisNexis, and, of course, Xcential. But, I’m always disappointed that there isn’t greater U.S. participation. Why is this? It seems that this is a field where the U.S. chooses to lag behind. Perhaps most jurisdictions in the U.S. are still hoping that Open Office or Microsoft Office will be a good solution. In Europe, the legal informatics field is looking beyond office productivity tools towards all the other capabilities enabled by drafting in XML — and looking forward to a standardized model as a basis for a more cost effective and innovative industry.

As I already mentioned, this was our third developer’s workshop. It immediately followed the summer school. This year the developer’s workshop was quite excellent. The closest thing I can think of in the U.S. is NALIT, which I find to be more of a marketing-oriented show and tell. This, by comparison, is a far more cozy venue. We sit around, in a classroom setting, and have a very open and frank share and discuss meeting. Perhaps it’s because we’ve come to know one another through the years, but the discussion this year was very good and helpful.

We had presentations from the University of Bologna, the Italian Senate, the European Parliament, the European Commission, the UK National Archives, the US House of Representatives, and myself representing the work we are doing both in general and for the US House of Representatives. We closed out the session with a remote presentation from Jim Mangiafico on the work he is doing translating to Akoma Ntoso for the UK National Archives. (Jim, if you don’t already know, was the winner of the Library of Congress’ Akoma Ntoso challenge earlier this year.)

What struck me this year is how our shared experiences are influencing all our projects. There has been a marked convergence in our various projects over the last year. We all now talk about URI referencing schemes, resolvers to handle them, and web-based editors to draft legislation. And, much to my delight, this was the first year that I’m not the only one looking into change tracking. Everybody is learning that differencing isn’t always the best way to compute amendments – often you need to better craft how the changes are recorded.

I can’t wait to see the progress we make by this time next year. By then, I’m hoping that Akoma Ntoso will be well established as a standard and the first generation of tools will have started to mature. Hopefully our discussion will have evolved from how to build tools towards how to achieve higher levels of compliance with the standard.

I also hope that we will have greater participation from the U.S.

Look how far legal informatics has come – in just a few years

Back in 2001 when I started in the legal informatics field, it seemed we were all alone. Certainly, we weren’t – there were many similar efforts underway around the country and around the world. But, we felt alone. All the efforts were working in isolation – making similar decisions and learning similar lessons. This was the state of the field, for the most part, for the next 6 to 8 years. Lots of isolated progress, but few opportunities to share what we had learned and build on what others knew.

In 2010, I visited the LEX Summer School, put on by the University of Bologna in Ravenna, Italy. What became apparent to me was just how isolated the various pockets of innovation had become around the world. There was lots of progress, especially in Europe, but legal informatics, as an industry, was still in a fledgling state – it was more of an academic field than a commercial industry. In fact, outside of academic circles, the term legal informatics was all but meaningless. When I wrote my first blog in 2011, I looked forward to the day when there might be a true Legal Informatics industry.

Now, just a few years later, it’s stunning how far we have come. Certainly, we still have far to travel, but now we’re all working together towards common goals rather than working alone towards the same, but isolated, goals. I thought I would spend this week’s blog to review just how far we have come.

  1. Working together
    We have come together in a number of important dimensions:

    • First of all, consider geography. This is a small field, but around the world we’re now all very much connected together. We routinely meet, share ideas, share lessons, and share expertise – no matter which continent we work and reside on.
    • Secondly, consider our viewpoints. There was once a real tension between the transparency camp, government, external industry, and academia. If you participated at the 2014 Legislative Data and Transparency conference a few weeks ago in Washington D.C., one of the striking things was how little tension remains between these various viewpoints. We’re all now working towards a common set of goals.
  2. Technology
    • I remember when we used to question whether XML was the right technology. The alternatives were to use Open Office or Microsoft Office, basing the legislative framework around office productivity tools. Others even proposed using relational database technology along with a forms-based interface. Those ideas have now generally faded away – XML is the clear choice. And faking XML by relying on the fact that the Open Document Format (ODF) or Office Open XML formats are based on XML, just isn’t credible anymore. XML means more than just relying on an internal file format that your tools happen to use – it means designing information models specifically to solve the challenges of legal informatics.
    • I remember when we used to debate how references should be managed. Should we use file paths? Should we use URNs? Should we use URLs? Today the answer is clear – we’re all settling around logical URLs with resolvers, sometimes federated, to stitch together a web of interconnected references. Along with this decision has been the basic assumption that web-based solutions are the future – desktop applications no longer have a place in a modern solution.
    • Consider database technology. We used to have three choices – use the file system, try and adapt mature but ill-fitting relational databases, or take a risk with emerging XML databases. Clearly XML databases were the future – but was it too early? Not anymore! XML database technology, along with XQuery, has come a long way in the past few years.
  3. Standards
    Standards are what will create an industry. Without them, there is little opportunity to re-use – a necessary part of allowing cost-effective products to be built. After a few false starts over the years, we’re now on the cusp of having industry standards to work with. The OASIS LegalDocML (Akoma Ntoso) and LegalCiteM technical committees are hard at work on developing those standards. Certainly, it will be a number of years before we will see all the benefits of these standards, but as they come to fruition, a real industry can emerge.
  4. Driving Forces
    Ten years ago, the motivation for using XML was to replace outdated drafting systems, often cobbled together on obsolete mainframes, that sorely needed replacement. The needs were all internal. Now, that has all changed. The end result is no longer a paper document which can be ordered from the “Bill Room” found in the basement of the Capitol building. It’s often not even a PDF rendition of that document. The new end result is information which needs to be shared in a timely and open way in order to achieve the modern transparency objectives, like the DATA Act, that have been mandated. This change in expectations is going to revolutionize how the public works with their representatives to ensure fair and open government.
  5. In the past dozen years, things sure have changed. Credit must be given to Monica Palmirani (@MonicaPalmirani) and Fabio Vitali at the University of Bologna – an awful lot of the progress pivots around their initiatives. However, we’ve all played a part in creating an open, creative, and cooperative environment for Legal Informatics to thrive as more than just an academic curiosity – as a true industry with many participants working collaboratively and competitively to innovate and solve the challenges ahead.

Imagining Government Data in the 21st Century

After the 2014 Legislative Data and Transparency conference, I came away both encouraged and a little worried. I’m encouraged by the vast amount of progress we have seen in the past year, but at the same time a little concerned by how disjointed some of the initiatives seem to be. I would rather see new mandates forcing existing systems to be rethought rather than causing additional systems to be created – which can get very costly over time. But, it’s all still the Wild Wild West of computing.

What I want to do with my blog this week is try and define what I believe transparency is all about:

  1. The data must be available. First and foremost, the most important thing is that the data be provided at the very least – somehow, anyhow.
  2. The data must be provided in such a way that it is accessible and understandable by the widest possible audience. This means providing data formats that can be read by ubiquitous tools and ensuring the coding necessary to support all types of readers, including those with disabilities.
  3. The data must be provided in such a way that it should be easy for a computer to digest and analyze. This means using data formats that are easily parsed by a computer (not PDF, please!!!) and using data models that are comprehensible to the widest possible audience of data analysts. Data formats that are difficult to parse or complex to understand should be discouraged. A transparent data format should not limit the visibility of the data to only those with very specialized tools or expertise.
  4. The data provided must be useful. This means that the most important characteristics of the data must be described in ways that allow it to be interpreted by a computer without too much work. For instance, important entities described by the data should be marked in ways that are easily found and characterized – preferably using broadly accepted open standards.
  5. The data must be easy to find. This means that the location at which data resides should be predictable, understandable, permanent, and reliable. It should reflect the nature of the data rather than the implementation of the website serving the data. URLs should be designed rather than simply fall out of the implementation.
  6. The data should be as raw as possible – but still comprehensible. This means that the data should have undergone as little processing as possible. The more that data is transformed, interpreted, or rearranged, the less like the original data it becomes. Processing data invariably damages its integrity – whether intentional or unintentional. There will always be some degree of healthy mistrust in data that has been over-processed.
  7. The data should be interactive. This means that it should be possible to search the data at its source – through both simple text search and more sophisticated data queries. It also means that whenever data is published, there should be an opportunity for the consumer to respond back – be it simple feedback, a formal request for change, or some other type of two way interaction.

How can this all be achieved for legislative data? This is the problem we are working to solve. We’re taking a holistic approach by designing data models that are both easy to understand and can be applied throughout the data life cycle. We’re striving to limit data transformations by designing our data models to present data in ways that are both understandable to humans and computers alike. We are defining URL schemes that are well thought out and could last for as long as URLs are how we find data in the digital era. We’re defining database solutions that allow data to not only be downloaded, but also searched and queried in place. We’re building tools that will allow the data to not only be created but also interacted with later. And finally, we’re working with standards bodies such as the LegalDocML and LegalCiteM technical committees at OASIS to ensure well thought out worldwide standards such as Akoma Ntoso.

Take a look at Title 1 of the U.S. Code. If you’re using a reasonably modern web browser, you will notice that this data is very readable and understandable – it’s meant to be read by a human. Right click with the mouse and view the source. This is the USLM format that was released a year ago. If you’re familiar with the structure of the U.S. Code and you’re reasonably XML savvy, you should feel at ease with the data format. It’s meant to be understandable to both humans and to computer programs trying to analyze it. The objective here is to provide a single simple data model that is used from initial drafting all the way through publishing and beyond. Rather than transforming the XML into PDF and HTML forms, the XML format can be rendered into a readable form using Cascading Style Sheets (CSS). Modern XML repositories such as eXist allow documents such as this to be queried as easily as you would query a table in a relational database – using a query language called XQuery.
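
To give a feel for that last point, here is the sort of XQuery one might run against a collection of USLM documents stored in an eXist repository. This is a minimal sketch: the collection path is an assumption made for the example, and the namespace URI should be verified against the published USLM schema.

  xquery version "3.0";
  (: find every section heading in Title 1 of the U.S. Code;
     the collection path is illustrative :)
  declare namespace uslm = "http://xml.house.gov/schemas/uslm/1.0";
  for $sec in collection("/db/uscode/title1")//uslm:section
  return $sec/uslm:heading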

This is what we are doing – within the umbrella of legislative data. It’s a start, but ultimately there is a need for a broader solution. My hope is that government agencies will be able to come together under a common vision for how our information should be created, published, and disseminated – in order to fulfill their evolving transparency mandates efficiently. As government agencies replace old systems with new systems, they should design around a common open framework for transparent data rather than building new systems in the exact same footprint as the old systems that they demolish. The digital era and the transparency mandates that have come with it demand new thinking far different than the thinking of the paper era which is now drawing to a close. If this can be achieved, then true data transparency can be achieved.

Building a browser-based XML Editor

Don’t forget the 2014 U.S. House Legislative Data and Transparency Conference this week.

I’m now hard at work on our second generation web-based XML editor. In my blog last week, I talked about the need for and complexities of change tracking in a legislative editor. In this blog, I want to describe more of the overall motivation.

A couple years ago, we built an HTML5-based legislative editor for Akoma Ntoso. We learned a lot from the effort and had some success with a couple customers whose needs matched the capabilities of the editor. The editor was built to use and exploit, to the fullest extent, many of the new APIs added to modern browsers to support HTML5. We found that, by focusing on HTML5, a lot of the complexities of dealing with browser quirks and incompatibilities were a thing of the past – allowing us to focus on building the editing functions.

The editor worked by transforming the XML document into a close representation of the XML, expressed as HTML5 tags. Using HTML5 features such as the @contenteditable attribute along with modern CSS, the browser DOM, selection ranges, drag and drop, and a WebDAV repository API, we were able to implement a fairly sophisticated web-based legislative editor.
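
As a rough illustration of that mapping (the class and data-* names here are hypothetical, not our actual markup), an Akoma Ntoso section might have been represented in HTML5 something like this:

  <!-- the XML source -->
  <section id="sec_1">
    <num>Section 1.</num>
    <heading>Short title</heading>
  </section>

  <!-- a hypothetical HTML5 representation -->
  <div class="section" data-xml-name="section" data-xml-id="sec_1" contenteditable="true">
    <span class="num" data-xml-name="num">Section 1.</span>
    <span class="heading" data-xml-name="heading">Short title</span>
  </div>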

But, not everything went smoothly. The first problem involved the complexity of mapping all the intricacies of XML into an HTML5 representation, and then maintaining that representation in the browser. Part of the difficulty stems from the fact that HTML5 is not specifically an XML dialect – and browsers tend to do HTML5 things that aren’t always XML friendly. The HTML5 DOM is deliberately rather loose and forgiving (it’s a big part of why HTML was successful in the first place) while XML demands a very precise and rigid DOM.

The second problem we faced was scalability. While the HTML5 representation wasn’t all that heavyweight, the bigger problem was the transformation cost going back and forth between HTML5 and XML. We sometimes deal with very large legislation and laws. In our bigger cases, the cost of transformation was simply unreasonable.

So what is the solution? Well, early last year we started experimenting with using a browser to render XML documents directly with CSS – without any transform into HTML. Most modern browsers now do this very well. For the most part, we were able to achieve an acceptable rendition in the browser without any transformation.
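
For example, a handful of CSS rules goes a long way towards making raw Akoma Ntoso readable in a browser. This is a minimal sketch, not our production stylesheet; the XML document links it in with an <?xml-stylesheet type="text/css" href="akn.css"?> processing instruction:

  section      { display: block; margin: 1em 0; }
  num          { font-weight: bold; }
  heading      { display: inline; font-weight: bold; margin-left: 0.5em; }
  p            { display: block; text-indent: 2em; }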

There were a few drawbacks to this approach. For one, links were dead – they didn’t inherently do anything. Likewise, implementing something like the HTML @style attribute didn’t just naturally work. Before we could entertain the notion of a pure XML-based editor built within the XML infrastructure in the browser, we had to find a solution that would allow us to enrich the XML sufficiently to allow it to behave like an HTML page.

Another problem arose in that our prior web-based editor relied upon the @contenteditable feature of HTML. That is an HTML feature rather than a browser feature. Using XML as our base environment, we no longer had access to this facility. This wasn’t a total loss, as our need for a rich change tracking environment required us to find a better approach than @contenteditable offered anyway.

With solutions to the major problems behind us, we started to take a look at the other goals for the editor:

  • Track Changes – This was the subject of my blog last week. For us, track changes is crucial in any editor targeted at legislation – and it must work at both the structural and textual level equally well. We use the feature for two things – redlining changes as is common in the U.S. and the automatic generation of amendment documents (amendments in context). Differencing can get you part way there – but it excludes the ability to adequately craft the changes in a way that deals with political sensitivities. Track Changes is a very complex feature which must be built into the very core of the editor – tacking it on later will be very difficult, if not impossible.
  • Scalability – Scalability is very important to our applications. We need to support very large documents. Even when we deal with document fragments, we need to allow those fragments to be very large. Our approach is to create editing islands within a large document loaded into the browser. This amounts to only building the editing superstructure around the parts of the document being edited rather than the whole document. It’s like building the scaffolding around only the floors being worked on in a skyscraper rather than trying to envelope the entire building in scaffolding.
  • Modularity – We’re building a number of very different applications currently – all of which require XML editing. To allow this variability, our new XML editor is written as a web-based component rather than a full-fledged application. Despite its complexity, on the surface it’s deceptively simple. It has no user interface at all aside from the editing canvas. It’s completely driven by a well thought out JavaScript API. Adding the editor to a document is very simple – a single link, added to the bottom of the XML document, brings the editor in (see the sketch following this list). With this component, we’re able to include the editor within all of the applications we are building.
  • Configurability – We need to support a number of different models – not just Akoma Ntoso. To achieve this, an XML-based configuration file is used to define the behaviors for any XML model. Elements can be defined as read-only, templates can be defined (or derived), and even the track changes behavior can be configured for individual elements. The sophistication being defined within the configuration files is to allow us to model all the variants of legislative models we have encountered without the need for extensive programming-level customization.
  • Browser Support – We’re pushing the envelope when it comes to browser support. Our current focus is on Google’s Chrome browser. Support for all the browsers aside from Internet Explorer should be relatively easy. Our experience has shown that the browsers are now quite similar. Internet Explorer is the one exception – in this particular area. Years ago, IE was the best browser when it came to XML support. While IE had many other compatibility issues, particularly with CSS, it led the way in supporting XML. However, while Microsoft has made tremendous strides moving forward to match the other browsers and modern standards, they’ve neglected XML. Their circa 1999 legacy capabilities for XML do not match modern standards and are quite deficient. Hopefully, this is something that will soon be rectified.
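
As mentioned in the modularity bullet above, a single link is all it takes to bring the editor into a document. The sketch below conveys the general idea only — the file names and the use of an XHTML-namespaced script element are placeholders for illustration, not our actual API:

  <?xml-stylesheet type="text/css" href="akn.css"?>
  <akomaNtoso xmlns="http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD05">
     ...
     <!-- one script element, in the XHTML namespace, bootstraps the editor -->
     <html:script xmlns:html="http://www.w3.org/1999/xhtml"
        type="text/javascript" src="editor.js"/>
  </akomaNtoso>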

It’s not all smooth sailing. I have been finding a number of surprising issues with Google Chrome. For instance, whitespace management is a bit fudged at times. Chrome thinks nothing of adding the occasional non-breaking space to maintain whitespace when editing the DOM. What’s worse – it will inexplicably convert this into a text node that reads “&nbsp;” after a while. This is a character entity that is not defined in XML. I have to work hard to constantly reverse this odd behavior.

All in all, I’m excited by this new approach to building a web-based XML editor. It’s a substantial increase in sophistication over our prior web-based XML editor. This editor will be far more robust, scalable, and configurable in comparison to our prior editor and other editors we have worked on. While we still have a way to go in our development, we’ve found solutions to all the risky issues. It’s a future-looking approach – support can only get better. It doesn’t rely on compatibility modes or any other remnants of prior eras in web technology. This approach is really working out quite nicely for us.

Tracking Changes with Legislative Drafting

We’re in the process of rebuilding our legislative editor – from the ground up. There are many reasons why we are doing this, which I will leave to my next blog. Today, I want to focus on the most important reason of all – change tracking.

Figure 1: The example above shows non-literal redlining and two different change contexts. An entire bill section is being added – the “action line” followed by quoted text. Rather than showing the entire text in an inserted notation, only the action line is shown. The quoted text reflects a different change context – showing changes relative to the law. In subsequent versions of this bill, the quoted text will no longer show the law as its change context but rather the prior version. It’s complicated!

For us, change tracking is an essential feature of any legislative editor. It’s not something that can be tacked on later or implemented via a customization – it’s a core feature which must be built in to the base editor from the very outset. Change tracking dictates much of the core architecture of the editor. That means taking the time to build change tracking into the basic DOM structures we’re creating – and getting them right up front. It’s an amazingly complex problem when dealing with an XML hierarchy.

I’ve been asked a number of questions by people that have seen my work. I’ll try to address them here:

Why is change tracking so important? We use change tracking to implement a couple of very important features. First of all, we use it to implement redlining (highlighting the changes) in a bill as it evolves. In some jurisdictions, particularly in the United States, redlining is an essential part of any bill drafting system. It is used to show both how the legislation has evolved and how it affects existing law.

Secondly, we use it to automatically generate “instruction” amendments (floor or committee amendments). First, page and line markers are back-annotated into the existing bill. That bill is then edited to reflect the proposed changes – carefully crafting the edits using track changes to avoid political sensitivities – such as arranging a change so as not to strike out a legislator’s name. When complete, our amendment generator is used to analyze the redlining along with the page and line markers to produce the amendment document for consideration. The cool thing is that to execute the amendments, all we need to do is accept or reject the changes. This is something we call “Amendments in Context” and our customer calls “Automatic Generation of Instruction Amendments” (AGIA).

How is legislative redlining different from change tracking in Word? They’re very similar. In fact, the first time we implemented legislative redlining, we made the mistake of assuming that they were the same thing. What we learned was that legislative redlining is quite a bit more complex. First of all, the last version of the document isn’t the only change context. The laws being amended are another context which must be dealt with. This means that, within the same document, there are multiple original sources of information which must be compared against.

Secondly, legislative redlining has numerous conventions, developed over decades, to indicate certain changes that are difficult or cumbersome to show with literal redlining. These amount to non-literal redlining patterns which denote certain changes. Examples include showing that a paragraph is being merged or split, a provision is being renumbered, a whole bill is being gutted and replaced with all new text, and even that a section, amending law (creating a different change context), is being added to a new version of the bill.

The rules of redlining can be so complex and intricate that they require any built-in change tracking mechanism in an off-the-shelf editor to be substantially modified to meet the need. Our first legislative editor was implemented using XMetaL for the State of California. At first, we tried to use XMetaL’s change tracking mechanisms. These seemed to be quite well thought out, being based on Microsoft Word’s track changes. However, it quickly became apparent that this was insufficient as we learned the art of redlining. We then discovered, much to our alarm, that XMetaL’s change tracking mechanism was hidden from the developer and could not be programmatically altered. Our solution involved contracting the XMetaL team to provide us with a custom API that would allow us to control the change tracking dimension. The result works, but is very complex to deal with as a developer. That’s why they had hidden it in the first place.

Why can’t differencing be used to generate an amendments document? We wondered this as well. In fact, we implemented a feature, called “As Amends the Law” in our LegisWeb bill tracking software using this approach. But, it’s not that straight-forward. First of all, off-the-shelf differencers lack an understanding of the political sensitivities of amendments. What they produce is logically correct, but can be quite politically insensitive. The language of amendments is often very carefully crafted to not upset one side or another. It’s pretty much impossible to relay this to a program that views its task as simply comparing two documents. Put another way, a differencer will show what has changed rather than how it was changed.

Secondly, off-the-shelf differencers don’t understand all the conventions that exist to denote many of the types of amendments that can be made – especially all the non-literal redlining rules. Asking a legislative body to modify their decades-old customs to accommodate the limitations of the software is an uphill battle.

What approaches to change tracking have you seen? XMetaL’s approach to change tracking is the most useful approach we’ve encountered in XML editors. As I already mentioned, its goal is to mimic the change tracking capabilities of Microsoft Word. It uses XML processing instructions very cleverly to capture the changes – one processing instruction per deletion and a pair for insertions. The beauty of this approach is that it isolates the challenge of change tracking from the document schema – ensuring wide support for change tracking without any need to adapt an existing schema. It also allows the editor to be customized without regard for the change tracking mechanisms. The change tracking mechanisms exist and operate in their own dimension – very nicely isolated from the main aspects of editing. However, when you need to program software in this dimension, the limited programmability and immense complexity becomes a drawback.
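
To illustrate the flavor of the PI-based approach — the processing instruction names and attributes below are purely illustrative, not XMetaL’s actual syntax — tracked changes end up interleaved with the text like this:

  <p>The Secretary
    <?track-delete author="gdv" date="2014-06-01" content="may"?>
    <?track-insert-start author="gdv" date="2014-06-01"?>shall<?track-insert-end?>
    issue the regulations...</p>

Because the schema never sees the processing instructions, the document remains valid whether or not changes are pending.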

Xopus, a web-based editor, tries to mimic XMetaL’s approach – actually using the same processing instructions as XMetaL. However, it’s an apparent effort to tack on change tracking to an existing editor and the result is limited to only tracking changes within text strings. They’ve seemingly never been able to implement a full featured change tracking mechanism. This limits its usefulness substantially.

Another approach is to use additional elements in a special namespace. This is the approach taken by ArborText. The added elements (nine in all), provide a great deal of power in expressing changes. Unfortunately, the added complexity to the customizer is quite overwhelming. This is why XMetaL’s separate change dimension works so well – for most applications.

Our approach is to follow the model established by XMetaL, but to ensure the programmability we need to implement legislative redlining and amendment generation. In the months to come, I will describe all this in much more detail.

Improving Legal References

In my blog last week, I talked a little about our efforts to improve how citations are handled. This week, I want to talk about this in some more detail. I’ve been participating in a few projects to improve how legal citations, and the references that carry them, are handled.

Let’s start by looking at the need. Have you noticed how difficult it is to lookup many citations found in legislation published on the web? Quite often, there is no link associated with the citation. You’re left to do your own legwork if you want to lookup that citation – which probably means you’ll take the author’s word for it and not bother to follow the citation. Sometimes, if you’re lucky, you will find a link (or reference) associated with the citation. It will point to a location, chosen by the author, that contains a copy of the legal text being referenced.

What’s the problem with these references?

  • If you take a look at the reference, chances are it’s a crufty URL containing all sorts of gibberish that’s either difficult or impossible to interpret. The URL reflects the current implementation of the data provider. It’s not intended to be meaningful. It follows no common conventions for how to describe a legal provision.
  • Wait a few years and try and follow that link again. Chances are, that link will now be broken. The data provider may have redesigned their site or it might not even exist anymore. You’re left with a meaningless link that points to nowhere.
  • Even if the data does exist, what’s the quality of the data at the other end of the link? Is the text official text, a copy, or even a derivative of the official text? Has the provision been amended? Has it been renumbered? Has it been repealed? What version of that provision are you looking at now? These questions are all hard to answer.
  • When you retrieve the data, what format does it come in? Is it a PDF? What if you want the underlying XML? If that is available, how do you get it?
The object of our efforts, both at the standards committee and within the projects we’re working on at Xcential, is to tackle this problem. The approach being taken involves properly designing meaningful URLs which are descriptive, unambiguous, and can last for a very long time – perhaps decades or longer. These URLs are independent of the current implementation – they may not reflect how the data is stored at all. The job of figuring out how to retrieve the data, using the current underlying content management system, is the job of a “resolver”. A resolver is simply an adapter that is attached to a web server. It intercepts the properly designed URL references and then transparently maps them into the crufty old URLs which the content management system requires. The data is retrieved from the content management system, formatted appropriately, and returned as if it really did exist at the properly designed URL which you see. As the years go by and technologies advance, the resolver can be adapted to handle new generations of content management systems. The references themselves will never need to change.
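
As a sketch of what a resolver might do, here is a hypothetical XQuery function for an eXist-based repository. Every name, path, and URL pattern in it is invented for illustration:

  declare function local:resolve($ref as xs:string) as node()? {
    (: map a designed reference such as /us/usc/t10/s1234 onto the
       repository's actual storage layout — all paths illustrative :)
    let $parts := tokenize($ref, "/")   (: "", "us", "usc", "t10", "s1234" :)
    let $title := substring-after($parts[4], "t")
    return doc(concat("/db/uscode/title", $title, ".xml"))
           //section[@id = $parts[5]]
  };

The designed reference stays stable for decades; only the body of a function like this changes when the underlying content management system is replaced.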

There are many more details to go into. I’ll leave those for future blogs. Some of the problems we are tackling involve mapping popular names into real citations, working through ambiguities (including ones created in the future), handling alternate data sources, and allowing citations to be retrieved at varying degrees of granularity.

I believe that solving the legal references problem is just about the most important progress we can make towards improving the legal informatics field. It’s an exciting time to be working in this field.

Legal Citations and XML Editing for Legislation

It’s been quite some time since my last blog post – almost six months. The reason is that I’ve been very busy. We are doing a lot of exciting development within Xcential. We are developing a number of quite challenging projects around the globe.

If you’ve been following my blog, you may remember that I was working on an HTML5-based XML editor. That development was two years ago now. We’ve come a long way since then. The basic editor has been stripped down, componentized, and is being rebuilt to be a far more robust, scalable, and adaptable solution. There are more details below, which I will expand upon as the editor rolls out over the next year.

    Legal Citations

It has been almost a year since the last Legislative Data and Transparency Conference in Washington D.C. (the next one is coming up). At that time, I spoke about the need for improved citation management in published XML documents. Well, we’ve come a long way since then. Earlier this year a Technical Committee was formed within OASIS to begin developing some standards. The Legal Citation Markup Technical Committee is now hard at work defining markup models for legal citations. I am a member of that TC.

The reference management part of our HTML5-based editor has been split out into its own project – a citation interpreter and reference resolver. In our development tests, it’s integrated with eXist as a local repository. We also draw documents from external sources such as LII.

We now have a few citation management projects underway, using our resolver technology. These are exciting projects which will be a huge step forward in improving how citations are managed. It’s premature to talk about this in any detail, so I’ll just leave this as a teaser of stuff to come.

    XML Editing for Legislation

The OASIS Legal Document ML Technical Committee is getting ready to make a large announcement. While this progress is being made, at Xcential we’ve been hard at work refining the state-of-the-art in XML editing.

If you recall the HTML5-based editor for Akoma Ntoso from a couple of years back, you may remember that it was based around all the new HTML5 technologies that have recently been incorporated into web browsers. We learned a lot from that effort – both good and bad. While we were able to get a reasonable tagging editor, using facilities that made editing far easier, we still faced difficulties when it came to basic XML editing and scalability.

So, we’ve taken a more ambitious approach to produce a very generalized XML editing platform. Using what we learned as the basis, our new editor is far more capable. Rather than relying on the mapping of XML into an equivalent HTML5 structure, we now directly use the XML facilities that are built into the browser. This approach is both far more robust and far more scalable. But the most exciting aspect is change tracking. We’re building change tracking directly into the basic editing engine – from the outset. This means that we can track all changes – whether the changes are in the text or in the structure. With all browsers now correctly implementing the standardized DOM Range model, our change tracking model has to be very sophisticated. While it’s hellishly complex, my experience in implementing change tracking technologies over many years is really coming in handy.

If you’ve used change tracking in XMetaL, you know the limitations of their technology. XMetaL’s range selection constrains what you can select, which limits the flexibility of deletion. This simplifies the problem for the XMetaL customizer, but at a serious usability price. It’s one of the biggest limiting factors of XMetaL. We’re dealing with this problem once and for all with our new approach – providing a great way to implement legislative redlining.

Take a look at the totally contrived example on the left. It’s admittedly not a real example – it comes from my stress testing of the change tracking facilities. But look at what it does. The red text is a complex deletion that spans elements with little regard to the structure. In our editor, this was done with a single “delete” operation. Try and do this with XMetaL – it takes many operations and is a real struggle – even with change tracking turned off. In fact, even Microsoft Word’s handling of this is less than satisfactory, especially in more recent versions. Behind the scenes, the editor is using the model, derived from the schema, to control this deletion process to ensure that a valid document is the result.

If you’re particularly familiar with XMetaL, you will notice something else too. That deletion cuts through the structure of a table!!!! XMetaL can only track changes within the text of table cells, not the structure. We’re making great strides towards proper legislative redlining technologies, and we are excited to work with our partners and clients to put them into practice.

Sheldon’s Roommate Agreement – in Akoma Ntoso

A Legal Open Document Hackathon was held yesterday at the University of Bologna in Italy – focused on Akoma Ntoso documents. You can learn more about it here:

https://plus.google.com/u/0/events/c03pd1llrcvg7d0t0fj5sh41cbk
http://codexml.cirsfid.unibo.it/material-for-legal-open-document-dec-10/

I wasn’t able to directly participate but I had my own mini-hackathon as well. But, rather than focusing on another boring piece of legislation that nobody wants to read, I thought I would have a little fun with it. If you know me, you’ll know that I’m a huge fan of The Big Bang Theory television show. You could say I have a few Sheldon-like tendencies of my own.

I’ve often thought that the complex roommate agreement that Sheldon had Leonard sign would make a great example for a legal document modeled in Akoma Ntoso. Of course, it’s not a piece of legislation, but surprisingly, it has many of the attributes of legislation. It is even, much like legislation, a bit of a chaotic mess. I had to make a number of extrapolations or “fixes” in order to get a reasonably consistent and workable document. I sure hope, if Sheldon’s desire to win a Nobel prize in physics is to be realized, that he is a better theoretical physicist than he is a legal drafter. Perhaps we should offer to give him a few pointers in document theory and the logical organization of ideas – he really needs them.

Nonetheless, the example afforded me an opportunity to show a number of features of Akoma Ntoso:

  1. In article 10, section 9, there is an example of conditional effectivity. The provision is only effective in the event that either roommate has a girlfriend. As Leonard has had a few on-again, off-again girlfriends, it was a bit of fun figuring out when this provision was in effect. I didn’t consider Amy to be Sheldon’s girlfriend as the pertinent issues have yet to arise.
  2. In season 5, episode 15, Sheldon wins back Leonard’s friendship by amending the agreement to add “Leonard’s Day”.
  3. There are a number of “addendums” to various articles. This isn’t something that is directly supported by Akoma Ntoso, so I used its extension facilities – generic hcontainer elements with @name attributes – to model the extensions I needed.
  4. The agreement is a complex document, made up of the main document and at least three appendices.

Sheldon’s Roommate Agreement

I present to you, Sheldon’s Roommate Agreement, as much as is known to date and with a few “corrections” on my part:

<?xml version="1.0" encoding="UTF-8"?>
<akomaNtoso xmlns="http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD05"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD05 akomantoso30.xsd">
   <act name="roommateAgreement" contains="multipleVersions">
      <meta>
         <identification source="#sheldon">
            <FRBRWork>
               <FRBRthis value="/us/cbs/bigBangTheory/roommateAgreement/main"/>
               <FRBRuri value="/us/usc/bigBangTheory/roommateAgreement"/>
               <FRBRdate date="2013-12-10" name="generation"/>
               <FRBRauthor href="#sheldon" as="#lessor"/>
               <FRBRcountry value="us"/>
            </FRBRWork>
            <FRBRExpression>
               <FRBRthis
                  value="/us/cbs/bigBangTheory/roommateAgreement/en@/main"/>
               <FRBRuri value="/us/cbs/bigBangTheory/roommateAgreement/en@"/>
               <FRBRdate date="2013-12-10" name="generation"/>
               <FRBRauthor href="#sheldon" as="#lessor"/>
               <FRBRlanguage language="en"/>
            </FRBRExpression>
            <FRBRManifestation>
               <FRBRthis
                  value="/us/cbs/bigBangTheory/roommateAgreement/en@/main.xml"/>
               <FRBRuri value="/us/cbs/bigBangTheory/roommateAgreement/en@.akn"/>
               <FRBRdate date="2013-12-10" name="generation"/>
               <FRBRauthor href="#vergottini" as="#marker"/>
            </FRBRManifestation>
         </identification>
         <publication name="roommateAgreement" date="2013-12-10"
            showAs="Sheldon's Roommate Agreement"/>
         <lifecycle source="#sheldon">
            <eventRef id="gen__s_1__ep_1" date="2007-09-24" 
               source="#sheldon" type="generation"/>
            <eventRef id="gen__s_1__ep_17" date="2008-05-19"
               source="#sheldon" type="generation"/>
            <eventRef id="gen__s_2__ep_1" date="2008-09-22"
               source="#sheldon" type="generation"/>
            <eventRef id="gen__s_3__ep_1" date="2009-09-21" 
               source="#sheldon" type="generation"/>
            <eventRef id="gen__s_3__ep_19" date="2010-04-12"
               source="#sheldon" type="generation"/>
            <eventRef id="gen__s_4__ep_24" date="2011-05-19"
               source="#sheldon" type="generation"/>
            <eventRef id="gen__s_5__ep_7" date="2011-10-27"
               source="#sheldon" type="generation"/>
            <eventRef id="gen__s_5__ep_14" date="2012-01-26"
               source="#sheldon" type="generation"/>
            <eventRef id="gen__s_5__ep_15" date="2012-02-02"
               source="#sheldon" type="generation"/>
            <eventRef id="amd__s_5__ep_15" date="2012-02-02"
               source="#sheldon" type="amendment"/>            
         </lifecycle>
         <analysis source="#sheldon">
            <passiveModifications>
               <textualMod type="insertion" id="adds_leonards_day" 
                  incomplete="false">
                  <source href="#s_5__ep_15"/>
                  <destination href="#add_1"/>
               </textualMod>
            </passiveModifications>
         </analysis>
         <temporalData source="#sheldon">
            <temporalGroup id="period_1">
               <timeInterval refersTo="#signed" 
                  start="#gen_s_1__ep_1"/>
            </temporalGroup>
            <temporalGroup id="period_102">
               <timeInterval refersTo="#proposed" 
                  start="#gen__s_5__ep_15"/>
            </temporalGroup>
            <temporalGroup id="period_roommate_has_girlfriend">
               <timeInterval refersTo="#datingPenny1"
                  start="#gen__s_1__ep_17" end="#airing__s_2__ep_1"/>
               <timeInterval refersTo="#datingPriya1" 
                  start="#gen__s_3__ep_1" end="#airing__s_3__ep_19"/>
               <timeInterval refersTo="#datingPriya2"
                  start="#gen__s_4__ep_24" end="#airing__s_5__ep_7"/>
               <timeInterval refersTo="#datingPenny2"
                  start="#gen__s_5__ep_14"/>
            </temporalGroup>
         </temporalData>
         <references source="#bigBangTheory">
            <TLCRole id="lessor" 
               href="/ontology/role/lessor" 
               showAs="Lessor"/>
            <TLCRole id="lessee" 
               href="/ontology/role/lessee" 
               showAs="Lessee"/>
            <TLCRole id="marker" 
               href="/ontology/role/marker" 
               showAs="Lessee"/>
            <TLCPerson id="sheldon" 
               href="/ontology/person/cast/sheldonCooper"
               showAs="Sheldon Cooper"/>
            <TLCPerson id="roommate" 
               href="/ontology/person/cast/roommate"
               showAs="Roommate"/>
            <TLCPerson id="vergottini" 
               href="/ontology/person/xcential/vergottini"
               showAs="Grant Vergottini"/>            
            <TLCEvent id="s_2__ep_6"
               href="/ontology/tvShow/bigBangTheory/season2/episode6"
               showAs="Season 2 Episode 6"/>
            <TLCEvent id="s_2__ep_10"
               href="/ontology/tvShow/bigBangTheory/season2/episode10"
               showAs="Season 2 Episode 6"/>
            <TLCEvent id="s_3__ep_15"
               href="/ontology/tvShow/bigBangTheory/season3/episode15"
               showAs="Season 3 Episode 10"/>
            <TLCEvent id="s_3__ep_21"
               href="/ontology/tvShow/bigBangTheory/season3/episode21"
               showAs="Season 3 Episode 21"/>
            <TLCEvent id="s_3__ep_22"
               href="/ontology/tvShow/bigBangTheory/season3/episode22"
               showAs="Season 3 Episode 22"/>
            <TLCEvent id="s_4__ep_2"
               href="/ontology/tvShow/bigBangTheory/season4/episode2"
               showAs="Season 4 Episode 2"/>
            <TLCEvent id="s_4__ep_21"
               href="/ontology/tvShow/bigBangTheory/season4/episode21"
               showAs="Season 4 Episode 21"/>
            <TLCEvent id="s_4__ep_24"
               href="/ontology/tvShow/bigBangTheory/season4/episode24"
               showAs="Season 4 Episode 24"/>
            <TLCEvent id="s_5__ep_15"
               href="/ontology/tvShow/bigBangTheory/season5/episode15"
               showAs="Season 5 Episode 15"/>
            <TLCEvent id="s_5__ep_18"
               href="/ontology/tvShow/bigBangTheory/season5/episode18"
               showAs="Season 5 Episode 18"/>
            <TLCEvent id="s_6__ep_15"
               href="/ontology/tvShow/bigBangTheory/season6/episode15"
               showAs="Season 6 Episode 15"/>
         </references>
      </meta>

      <preface>
         <block name="title" id="title">
            <docTitle>The Roommate Agreement</docTitle>
         </block>
      </preface>

      <body period="period__1">
         <article id="art_1">
            <num>1</num>
            <heading id="art_1__heading">Upon becoming a roommate</heading>
            <section id="art_1__sec_1">
               <num>1</num>
               <content id="art_1__content">
                  <p>A roommate gets an ID Card, a lapel pin, FAQ sheet and a
                     key. New roommates may be interested in the live webchat on
                     Tuesday nights called “Apartment Talk.”</p>
               </content>
            </section>
            <section id="art_1__sec_3">
               <num>3)</num>
               <content id="art_1__sec_3__content">
                  <p>(call for an emergency meeting)</p>
                  <p class="sourceNote">(<ref id="ref_3" href="#s_2__ep_10"
                        >Season 2 Episode 10</ref>)</p>
               </content>
            </section>
            <section id="art_1__sec_5">
               <num>5</num>
               <subsection id="art_1__sec_5__ssec_A">
                  <num>A</num>
                  <content id="art_1__sec_5__ssec_A__content">
                     <p>Roommate must drive Sheldon to and from work, the comic
                        book store, the barber shop, and the park for one hour
                        every other Sunday for fresh air.</p>
                     <p class="sourceNote">(<ref id="ref_4" href="#s_3__ep_15"
                           >Season 3 Episode 15</ref>)</p>
                  </content>
               </subsection>
               <subsection id="art_1__sec_5__ssec_B">
                  <num>B</num>
                  <content id="art_1__sec_5__ssec_B__content">
                      <p>Roommate is tasked to bring home all take-out
                         dinners. (Standard orders are located in Appendix B,
                         and are also downloadable from Sheldon’s FTP
                         server)</p>
                  </content>
               </subsection>
            </section>
            <section id="art_1__sec_9">
               <num>9</num>
               <heading id="art_1__sec_9__heading">Miscellany</heading>
               <paragraph id="art_1__sec_9__para_1">
                  <num>[1]</num>
                  <heading id="arc_1__sec_9__para_1__heading">Flag</heading>
                  <content id="art_1__sec_9__para_1__content">
                      <p>The apartment’s flag is “a gold lion rampant on a
                         field of azure” and should never fly upside
                         down—unless the apartment’s in distress.</p>
                     <p class="sourceNote">(<ref id="ref_1" href="#s_3__ep_22"
                            >Season 3 Episode 22</ref>)</p>
                  </content>
               </paragraph>
               <paragraph id="art_1__para_2">
                  <num>[2]</num>
                  <content id="art_1__para_2__content">
                     <p>If one of the roommates ever invents time travel, the
                        first stop has to aim exactly five seconds after this
                        clause of the Roommate Agreement was signed.</p>
                     <p class="sourceNote">(<ref id="ref_2" href="#s_3__ep_22"
                            >Season 3 Episode 22</ref>)</p>
                  </content>
               </paragraph>
            </section>
            <section id="art_1__sec_27">
               <num>27</num>
               <paragraph id="art_1__para_5">
                  <num>5</num>
                  <content id="art_1__para_5__content">
                     <p>The roommate agreement, like the American flag, cannot
                        touch the ground.</p>
                     <p class="sourceNote">(<ref id="ref_6" href="#s_6__ep_15"
                           >Season 6 Episode 15</ref>)</p>
                  </content>
               </paragraph>
            </section>
            <section id="art_1__sec_37">
               <num>37</num>
               <subsection id="art_1__sec_37__ssec_B">
                  <num>B</num>
                  <heading id="art_1__sec_37__ssec_B__heading">Miscellaneous
                     duties</heading>
                  <content id="art_1__sec_37__ssec_B__content">
                     <p>Roommate is obligated to drive Sheldon to his various
                        appointments, such as to the dentist. Roommate must also
                        provide a "confirmation sniff" to tell if questionable
                        dairy products are edible.</p>
                     <p class="sourceNote">(<ref id="ref_7" href="#s_3__ep_15"
                           >Season 3 Episode 15</ref>)</p>
                  </content>
               </subsection>
            </section>
            <section id="art_1__sec_209">
               <num>209</num>
               <content id="art_1__sec_209__content">
                  <p>Sheldon and roommate both have the option of nullifying
                     their roommate agreement, having no responsibilities or
                     obligations toward each other, other than paying rent and
                     sharing utilities.</p>
                  <p class="sourceNote">(<ref id="ref_8" href="#s_5__ep_15"
                        >Season 5 Episode 15</ref>)</p>
               </content>
            </section>
            <section id="art_1__sec_XX">
               <num>XX</num>
               <heading id="art_1__sec_XX__heading">Settling ties</heading>
               <content id="art_1__sec_XX__content">
                  <p>All ties will be settled by Sheldon.</p>
                  <p class="sourceNote">(<ref id="ref_9" href="#s_3__ep_22"
                        >Season 3 Episode 22</ref>)</p>
               </content>
            </section>
         </article>

         <article id="art_2">
            <num>2</num>
            <heading id="art_2__heading">Co-habitation</heading>
            <section id="art_2__sec_1">
               <num>1</num>
               <subsection id="art_2__sec_1__ssec_A">
                  <num>A</num>
                  <content id="art_2__sec_1__ssec_A__content">
                     <p>No "hootennanies", sing-alongs, raucous laughter,
                        clinking of glasses, celebratory gunfire, or barbershop
                        quartets after 10.p.m.</p>
                     <p class="sourceNote">(<ref id="ref_10" href="#s_5__ep_18"
                           >Season 5 Episode 18</ref>)</p>
                  </content>
               </subsection>
               <subsection id="art_2__sec_1__ssec_B">
                  <num>B</num>
                  <content id="art_2__sec_1__ssec_B__content">
                      <p>Roommate does not now nor does he intend to play
                         percussive or brass instruments.</p>
                  </content>
               </subsection>
               <subsection id="art_2__sec_1__ssec_C">
                  <num>C</num>
                  <heading id="art_2__sec_1__ssec_C__heading"
                     >Temperature</heading>
                  <content id="art_2__sec_1__ssec_C__content">
                     <p>The thermostat must be kept at 71 degrees at all
                        times.</p>
                     <p class="sourceNote">(<ref id="ref_11" href="#s_3__ep_22"
                           >Season 3 Episode 22</ref>)</p>
                  </content>
               </subsection>
            </section>
            <section id="art_2__sec_2">
               <num>2</num>
               <heading id="art_2__sec_2__heading">Television and
                  movies</heading>
               <subsection id="art_2__sec_2__ssec_B">
                  <num>B</num>
                  <content id="art_2__sec_2__ssec_B__content">
                     <p>Roommates agree that Friday nights shall be reserved for
                        watching Joss Whedon's brilliant new series Firefly.</p>
                     <p class="soruceNote">(<ref id="ref_12" href="#s_3__ep_15"
                           >Season 3 Episode 15</ref>)</p>
                  </content>
               </subsection>
            </section>
            <section id="art_2__sec_3">
               <num>3</num>
               <content id="art_2__sec_3__content">
                  <p>Roommate has the right “to allocate fifty percent of the
                     cubic footage of the common areas”, but only if Sheldon is
                     notified in advance by e-mail.</p>
                  <p class="sourceNote">(<ref id="ref_13" href="#s_3__ep_22"
                        >Season 3 Episode 22</ref>)</p>
               </content>
            </section>
            <section id="art_2__sec_4">
               <num>4</num>
               <heading id="art_2__sec_4__heading">[Pets]</heading>
               <content id="art_2__sec_4__content">
                  <p>Pets are banned under the roommate agreement, with the
                     exception of service animals, like cybernetically-enhanced
                     helper monkeys.</p>
                  <p class="sourceNote">(<ref id="ref_14" href="#s_3__ep_21"
                        >Season 3 Episode 21</ref>)</p>
               </content>
            </section>
            <section id="art_2__sec_5">
               <num>5</num>
               <heading id="art_2__sec_5__heading">[Take-out
                  restaurant]</heading>
               <content id="art_2__sec_5__content">
                  <p>The selection of a new take-out restaurant requires public
                     hearings and a 60-day comment period.</p>
                  <p class="sourceNote">(<ref id="ref_15" href="#s_4__ep_21"
                        >Season 4 Episode 21</ref>)</p>
               </content>
            </section>
            <hcontainer name="addendums" id="art_2_addendums">
               <hcontainer name="addendum" id="art_2_add_A">
                  <num>A</num>
                  <content id="art_d_add_A_content">
                      <p>Sheldon [will] ask at least once a day how roommate
                         is, even if he doesn't care.</p>
                     <p class="sourceNode">(<ref id="ref_16" href="#s_3__ep_15"
                           >Season 3 Episode 15</ref>)</p>
                  </content>
               </hcontainer>
               <hcontainer name="addendum" id="art_2__add_B">
                  <num>B</num>
                  <content id="art_2__add_B__content">
                     <p>Sheldon [will] no longer stage spontaneous biohazard
                        drills after 10 p.m. </p>
                     <p class="sourceNote">(<ref id="ref_17" href="#s_3__ep_15"
                           >Season 3 Episode 15</ref>)</p>
                  </content>
               </hcontainer>
               <hcontainer name="addendum" id="art_2__add_C">
                  <num>C</num>
                  <content id="art_2__add_C__content">
                     <p>Sheldon [will] abandon his goal to master Tuvan throat
                        singing. </p>
                     <p class="sourceNote">(<ref id="ref_18" href="#s_3__ep_15"
                           >Season 3 Episode 15</ref> )</p>
                  </content>
               </hcontainer>
            </hcontainer>
         </article>

         <article id="art_3">
            <num>3</num>
            <heading id="art_3__heading">The Bathroom</heading>
            <section id="art_3__sec_1">
               <num>1</num>
               <content id="art_3__sec_1__content">
                  <p>Roommates will acknowledge and use the two pieces of tape
                     in the bathroom designated for specific purposes:</p>
                  <blockList id="art_3__sec_1__content_ul_1">
                     <item id="art_3__sec_1__content_ul_1__item_1">
                        <p>Tape A: Located in front of the sink. Person must brush
                        and floss teeth behind the line.</p>
                     </item>
                     <item id="art_3__sec_1__content_ul_1__item_2">
                         <p>Tape B: Located in front of the toilet; those who
                         stand up to pee must stand in front of it.</p>
                     </item>
                  </blockList>
               </content>
            </section>
            <section id="art_3__sec_2">
               <num>2</num>
               <content id="art_3__sec_2__content">
                  <p>Before the use of a shower, the party agrees to wash his or
                     her feet in his or her designated bucket.</p>
               </content>
            </section>
            <section id="art_3__sec_7">
               <num>7</num>
               <intro id="art_3__sec_8__intro">
                  <p>The shower can have at most one occupant, except in the
                     event of an attack by water soluble aliens.</p>
               </intro>
               <subsection id="art_3__sec_8__ssec_B">
                  <num>B</num>
                  <paragraph id="art_3__sec_8__ssec_B__para_9">
                     <num>9</num>
                     <content id="art_3__sec_8__ssec_B__para_9__content">
                        <p>The right to bathroom privacy is suspended in the
                           event of force majeure.</p>
                        <p class="sourceNote">(<ref id="ref_19"
                              href="#s_4__ep_21">Season 4 Episode 21</ref>)</p>
                     </content>
                  </paragraph>
               </subsection>
            </section>
            <hcontainer name="addendums" id="art_3__addendums">
               <hcontainer name="addendum" id="art_3__add_J">
                  <num>J</num>
                  <content id="art_d__add_J__content">
                     <p>When Sheldon showers second, any and all measures shall
                        be taken to ensure an adequate supply of hot water.</p>
                     <p class="sourceNote">(<ref id="ref_20" href="#s_4__ep_21"
                           >Season 4 Episode 21</ref>)</p>
                  </content>
               </hcontainer>
            </hcontainer>
         </article>

         <article id="art_10">
            <num>10</num>
            <heading id="art_10__heading">Visitors</heading>
            <section id="art_10__sec_8">
               <num>8</num>
               <heading id="art_10__sec_8__heading">Over-night guests</heading>
               <intro id="art_10__sec_8__intro">
                   <p>There has to be a 24-hour notice if a non-related female
                      will stay overnight.</p>
                  <p>(<ref id="ref_21" href="#s_3__ep_21">Season 3 Episode
                        21</ref>)</p>
               </intro>
               <subsection id="art_10__sec_8__ssec_C">
                  <num>C</num>
                  <heading id="art_1-__sec_8__ssec_C__heading">Females</heading>
                  <paragraph id="art_10__sec_8__ssec_C__para_4">
                     <num>4</num>
                     <heading id="art_10__sec_8__ssec_C__para_4__heading"
                        >Coitus</heading>
                     <content id="art_10__sec_8__ssec_C__para_4__content">
                        <p>Roommates shall give each other 12 hours notice of
                           impending coitus.</p>
                        <p class="sourceNote">(<ref id="ref_22"
                              href="#s_3__ep_22">Season 3 Episode 22</ref>)</p>
                     </content>
                  </paragraph>
               </subsection>
            </section>
            <section id="art_10__sec_9" period="period__roommate_has_girlfriend">
               <num>9</num>
               <heading id="art_10__sec_9__heading">Cohabitation Rider</heading>
               <intro id="art_10__sec_9__content">
                  <p>[This clause is] activated when roommate starts "living
                     with" a girlfriend in the apartment.</p>
                  <p>A girlfriend shall be deemed "living with" roommate when
                     she has stayed over for A: ten consecutive nights or B: for
                     more than nine nights in a three-week period or C: all the
                     weekends of a given month plus three weeknights.</p>
                  <p class="sourceNote">(<ref id="ref_23" href="#s_2__ep_10"
                        >Season 2 Episode 10</ref>)</p>
               </intro>
               <subsection id="art_10__sec_9__ssec_A">
                  <num>A</num>
                  <content id="art_10__sec_9__ssec_A__content">
                     <p>Upon a live-in girlfriend, there shall be a change in
                        the distribution of shelves in the fridge.</p>
                     <p class="sourceNote">(<ref id="ref_24" href="#s_2__ep_10"
                           >Season 2 Episode 10</ref>)</p>
                  </content>
               </subsection>
               <subsection id="art_10__sec_9__ssec_B">
                  <num>B</num>
                  <content id="art_10__sec_9__ssec_B__content">
                     <p>Apartment vacuuming shall be increased from three to
                        four times a week to accommodate the increased
                        accumulation of dead skin cells.</p>
                     <p class="sourceNote">(<ref id="ref_25" href="#s_2__ep_10"
                           >Season 2 Episode 10</ref>)</p>
                  </content>
               </subsection>
               <subsection id="art_10__sec_9__ssec_C">
                  <num>C</num>
                  <content id="art_10__sec_9__ssec_C__content">
                     <p>A change in the bathroom schedule shall be
                        implemented.</p>
                     <p class="sourceNote">(<ref id="ref_26" href="#s_2__ep_10"
                           >Season 2 Episode 10</ref>)</p>
                  </content>
               </subsection>
               <subsection id="art_10__sec_9__ssec_D">
                  <num>D</num>
                  <content id="art_10__sec_9__ssec_D__content">
                     <p>Girlfriend does not now nor does she intend to play
                        percussive or brass instruments.</p>
                     <p class="sourceNote">(<ref id="ref_27" href="#s2__ep_10"
                           >Season 2 Episode 10</ref>)</p>
                  </content>
               </subsection>
            </section>
         </article>

         <section id="sec_XX">
            <num>XX</num>
            <heading id="sec_XX__heading">Durable power of attorney</heading>
            <content id="sec_XX__content">
                <p>The other roommate gets power of attorney over you, and may
                   make end-of-life decisions for you (reciprocal).</p>
               <p class="sourceNote">(<ref id="ref_28" href="#s_4__ep_24">Season
                     4 Episode 24</ref>)</p>
            </content>
         </section>

         <hcontainer name="addendums" id="addendums" period="period__102">
            <hcontainer name="addendum" id="add_1">
               <num>1</num>
               <heading id="add_1__heading">Leonard's Day</heading>
               <content id="add_1__content">
                  <p>Once a year, Leonard and Sheldon take one day to celebrate
                     the contributions Leonard gives to Sheldon's life, both
                     real and imaginary. Leonard does not get breakfast in bed,
                     the right to sit in Sheldon's spot, or permission to alter
                     the thermostat; the only thing that Leonard gets is a
                     thank-you card. This day is called "Leonard's Day."</p>
                  <p class="sourceNote">(<ref id="ref_29" href="#s_5__ep_15"
                        >Season 5 Episode 15</ref>)</p>
               </content>
            </hcontainer>
         </hcontainer>

      </body>
      <attachments>
         <doc name="appendix">
            <meta>
               <identification source="#sheldon">
                  <FRBRWork>
                     <FRBRthis value="/us/cbs/bigBangTheory/takeOutOrders/main"/>
                     <FRBRuri value="/us/cbs/bigBangTheory/takeOutOrders"/>
                     <FRBRdate date="2013-07-26T05:14:38" name="generation"/>
                     <FRBRauthor href="#sheldon" as="#lessor"/>
                     <FRBRcountry value="us"/>
                  </FRBRWork>
                  <FRBRExpression>
                     <FRBRthis
                        value="/us/cbs/bigBangTheory/takeOutOrders/en@/main"/>
                     <FRBRuri value="/us/cbs/bigBangTheory/takeOutOrders/en@"/>
                     <FRBRdate date="2013-07-26T05:14:38" name="generation"/>
                     <FRBRauthor href="#sheldon" as="#lessor"/>
                     <FRBRlanguage language="en"/>
                  </FRBRExpression>
                  <FRBRManifestation>
                     <FRBRthis
                        value="/us/cbs/bigBangTheory/takeOutOrders/en@/main.xml"/>
                     <FRBRuri
                        value="/us/cbs/bigBangTheory/takeOutOrders/en@.akn"/>
                     <FRBRdate date="2013-07-26T05:14:38" name="generation"/>
                     <FRBRauthor href="#vergottini" as="#marker"/>
                  </FRBRManifestation>
               </identification>
            </meta>
            <preface>
               <block name="title">
                  <docNumber>Appendix B</docNumber>
                  <docTitle>Standard take-out orders</docTitle>
               </block>
            </preface>
            <mainBody>
               <clause id="app_B__cls_1"> </clause>
            </mainBody>
         </doc>
         <doc name="appendix">
            <meta>
               <identification source="#sheldon">
                  <FRBRWork>
                     <FRBRthis
                        value="/us/cbs/bigBangTheory/futureCommitments/main"/>
                     <FRBRuri value="/us/cbs/bigBangTheory/futureCommitments"/>
                     <FRBRdate date="2013-07-26T05:14:38" name="generation"/>
                     <FRBRauthor href="#sheldon" as="#lessor"/>
                     <FRBRcountry value="us"/>
                  </FRBRWork>
                  <FRBRExpression>
                     <FRBRthis
                        value="/us/cbs/bigBangTheory/futureCommitments/en@/main"/>
                     <FRBRuri
                        value="/us/cbs/bigBangTheory/futureCommitments/en@"/>
                     <FRBRdate date="2013-07-26T05:14:38" name="generation"/>
                     <FRBRauthor href="#sheldon" as="#lessor"/>
                     <FRBRlanguage language="en"/>
                  </FRBRExpression>
                  <FRBRManifestation>
                     <FRBRthis
                        value="/us/cbs/bigBangTheory/futureCommitments/en@/main.xml"/>
                     <FRBRuri
                        value="/us/cbs/bigBangTheory/futureCommitments/en@.akn"/>
                     <FRBRdate date="2013-07-26T05:14:38" name="generation"/>
                     <FRBRauthor href="#vergottini" as="#marker"/>
                  </FRBRManifestation>
               </identification>
            </meta>
            <preface>
               <block name="title">
                  <docNumber>Appendix C</docNumber>
                  <docTitle>Future commitments</docTitle>
               </block>
            </preface>
            <mainBody>
               <clause id="app_c__cls_37">
                  <num>37</num>
                  <heading id="app_c__cls_37__heading">[Large Hadron
                     Collider]</heading>
                  <content id="app_c__cls_37__content">
                     <p>In the event one friend is ever invited to visit the
                        Large Hadron Collider, now under construction in
                        Switzerland, he shall invite the other friend to
                        accompany him.</p>
                     <p>(<ref id="app_c__ref_1" href="#s_3__ep_15">Season 3
                           Episode 15</ref>)</p>
                  </content>
               </clause>
               <clause id="app_c__cls_AA">
                  <num>[AA]</num>
                  <heading id="app_c__cls_AA__heading">[Super powers]</heading>
                  <content id="app_c__cls_AA__content">
                     <p>Specifies what happens if one friend gets super powers
                        (he will name the other one as his sidekick)</p>
                     <p class="sourceNote">(<ref id="app_c__ref_2"
                           href="#s_2__ep_10">Season 2 Episode 10</ref>)</p>
                     <p class="sourceNote">(<ref id="app_c__ref_3"
                           href="#s_3__ep_15">Season 3 Episode 15</ref>)</p>
                  </content>
               </clause>
               <clause id="app_c__cls_BB">
                  <num>[BB]</num>
                  <heading id="app_c__cls_BB__heading">[Zombies]</heading>
                  <content id="app_c__cls_BB__content">
                     <p>Specifies what happens if one friend is bitten by a
                        Zombie (the other can't kill him even if he turned)</p>
                     <p class="sourceNote">(<ref id="app_c__ref_4"
                           href="#s_3__ep_15">S3 Ep15</ref>)</p>
                  </content>
               </clause>
               <clause id="app_c__cls_CC">
                  <num>[CC]</num>
                  <heading id="app_c__cls_CC__heading">[MacArthur
                     grant]</heading>
                  <content id="app_c__cls_CC__content">
                     <p>Specifies what happens if one friend wins a MacArthur
                        grant </p>
                     <p class="sourceNote">(<ref id="app_c__ref_5"
                           href="#s_3__ep_15">Season 3 Episode 15</ref>)</p>
                  </content>
               </clause>
               <clause id="app_c__cls_DD">
                  <num>[DD]</num>
                  <heading id="app_c__cls_DD__heading">[Bill Gates]</heading>
                  <content id="app_c__cls_DD__content">
                      <p>Specifies what happens if one friend gets invited to go
                         swimming at Bill Gates' house (he will take the other
                         friend to accompany him)</p>
                     <p class="sourceNote">(<ref id="app_c__ref_6"
                           href="#s_3__ep_15">Season 3 Episode 15</ref>)</p>
                  </content>
               </clause>
               <clause id="app_c__cls_EE">
                  <num>[EE]</num>
                  <heading id="app_c__cls_EE__heading">[Skynet]</heading>
                  <content id="app_c__cls_EE__content">
                     <p>Specifies what happens if one friend needs help to
                        destroy an artificial intelligence he's created and
                        that's taking over Earth.</p>
                     <p class="sourceNote">(<ref id="app_c__ref_7"
                           href="#s_2__ep_6">Season 2 Episode 6</ref>)</p>
                  </content>
               </clause>
               <clause id="app_c__cls_FF">
                  <num>[FF]</num>
                  <heading id="app_c__cls_FF__heading">[Body
                     snatchers]</heading>
                  <content id="app_c__cls_FF__content">
                     <p>Specifies what happens if one friend needs help to
                        destroy someone they know who's been replaced with an
                        alien pod.</p>
                     <p class="sourceNote">(<ref id="app_c__ref_8"
                           href="#s_2__ep_6">Season 2 Episode 6</ref>)</p>
                  </content>
               </clause>
               <clause id="app_c__cls_GG">
                  <num>[GG]</num>
                  <heading id="app_c__cls_GG__heading">[Godzilla]</heading>
                  <content id="app_c__cls_GG__content">
                     <p>Specifies what happens if someone threatens to destroy
                        Tokyo.</p>
                     <p class="sourceNote">(<ref id="app_c__ref_9"
                           href="#s_2__ep_6">Season 2 Episode 6</ref>)</p>
                  </content>
               </clause>
               <clause id="app_c__cls_74">
                  <num>74</num>
                  <subclause id="app_c__cls_74__scls_C">
                     <num>C</num>
                     <heading id="app_c__cls_74__scls_C__heading"
                        >[Robots]</heading>
                     <content id="app_c__cls_74__scls_C__content">
                        <p>The various obligations and duties of the parties in
                           the event one of them becomes a robot.</p>
                        <p class="sourceNote">(<ref id="app_c__ref_10"
                              href="#s_4__ep_2">Season 4 Episode 2</ref>)</p>
                     </content>
                  </subclause>
               </clause>
            </mainBody>
         </doc>
      </attachments>
   </act>
</akomaNtoso>
Sheldon’s Roommate Agreement – in Akoma Ntoso

Is it time to rethink how we are governed?

We have seen the worst of our government in the past few weeks. Our politicians have seemingly forgotten that their mission is to solve problems. Instead, they’ve regressed to settling differences through tribal conflict. Isn’t that something we should have put behind us centuries ago?

Why is it that our politicians can never solve complex problems?

I have always been fascinated with complex problem solving. It’s why I found myself a job at the Boeing Company at the start of my career. My job was to find ways to use computer automation to help Boeing solve ever more complex problems. While at Boeing, I was introduced to the discipline of systems engineering.

In the 1940s, with the urgency of World War II as the impetus, large systems integrators like Boeing and AT&T had to find a way to eliminate the unpredictability of trial-and-error engineering. That way was systems engineering – which replaced the guessing game of early engineering efforts with a predictable engineering discipline that allowed new complex systems to be reliably brought online quickly.

The results speak for themselves. It’s that discipline in engineering that has given us the tremendous advances in aeronautics and electronics in the decades that have followed. Those supercomputers most people carry in their pockets would never have been possible were it not for the discipline of systems engineering.

Systems engineering imposes a rigorous problem-solving process: requirements are analyzed and quantified, alternatives are thoroughly studied, and the optimal solution is selected. Emotions are wrung out of the process as soon as possible. When a problem is too large or appears insurmountable, it is broken down into smaller problems that are solved individually. Each step along the way, and every decision, is exhaustively documented and reviewed by peers. It’s a scalable process that allows any problem, no matter how complex or difficult, to be tackled with a good probability of success.

Of course, it’s not a perfect process. There are plenty of strong opinions, politicking, and sometimes even special interests to deal with. However, engineers are able to handle this as they are trained to work through their differences to find the best answers. Engineers are taught to detect and avoid the pitfalls of relying on opinions and ideology. Instead, they must relentlessly seek true and indisputable facts. Being able to do this effectively is a condition of employment. Engineers who can’t follow the process must be let go – businesses simply cannot afford to keep underperformers.

The problems that systems engineers must tackle are many times more complex than anything that our politicians will ever have to address. While the results are never perfect, and challenges abound, when a new plane makes its way out to the runway for that first flight, it’s a certainty that it will fly. The discipline of the process almost guarantees it.

Contrast this with the way our politicians solve problems. In the unlikely event that their metaphorical plane ever finds its way out to a runway, chances are it will come to an ugly end, crumpling at the end of the runway into a pile of wishful thinking and intentional sabotage.

What’s the difference? Simply put, in systems engineering, opinions are suppressed and facts are emphasized; politicians seem to practice the exact opposite.

Why is it that we intuitively understand that the world’s most complex problems cannot be solved by people who rely on opinions and ideology, and yet that is exactly how we try to solve the world’s most important problems?

I am often asked what my vision is for legal informatics – the form of computer automation that targets legislative work. I’ve been pondering that question a lot over the past few weeks. Modern computing has revolutionized our lives. In the past twenty years alone, the way we interact with others, buy and sell products, keep ourselves entertained, and manage our lives has changed many times over thanks to computers and the Internet. Too often though, when I look at how we apply legal informatics, we’re simply computerizing outmoded nineteenth century processes – which, as we have seen in recent events, don’t work anymore.

I think it’s time that we rethink how we are governed – using the tools and technologies that have improved so many other aspects of our lives. Maybe then, we can have leaders who are problem solvers.

Is it time to rethink how we are governed?

The U.S. Code in Akoma Ntoso

I’m on my way to Italy this week for my annual pilgrimage to Ravenna and the LEX Summer School put on by the University of Bologna. This is my fourth trip to the class. I always find it so inspirational to be a part of the class and the activities that surround it. This year I will be talking about the many ongoing projects that we have underway, as well as speaking in depth about the HTML5 editor I built for Akoma Ntoso.

Before I get to Italy, I wanted to share something I’ve been working on. It should come as absolutely no surprise to anyone that I’ve been working on producing a version of the U.S. Code in Akoma Ntoso. A few weeks ago, the U.S. Office of the Law Revision Counsel released the full U.S. Code in XML. My company, Xcential, helped them to produce that release. Now I’ve taken the obvious next step and begun work on a transform to convert that XML into Akoma Ntoso – the format currently being standardized by the OASIS Legal Document ML technical committee. I am an active member of that TC.

U.S. Code

About 18 months ago, I learned of a version of the U.S. Code that had been made available in XML. While that XML release was quite far from complete, I used it to produce a representation in Akoma Ntoso as it stood back then. My latest effort is a replacement and update of that work. The new version of XML released by the OLRC is far more accurate and complete, and is a better basis for the transform than the earlier release was. And besides, I have a far better understanding of the new version – having had a role in its development.

My work is still very much a work-in-progress. I believe in openly sharing my work in the hope of inspiring others to dive into this subject – so I’m releasing a partial first step in order to get some feedback. Please note that this work is a personal effort – it is not a part of our work with the OLRC. At this point I’ve written a transform to produce Akoma Ntoso XML according to the most recent schema released a few weeks ago. The transform is not finished, but it gives a pretty good rendition of the U.S. Code in Akoma Ntoso. I’m using the transform as a vehicle to identify use cases and issues which I can bring up with the OASIS TC at our weekly meetings. As a result, there are a few open issues and the resulting XML does not fully validate.
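
To give a flavor of the transform’s shape, here is a drastically simplified sketch of one template – mapping a U.S. Code section into an Akoma Ntoso section. Treat it as illustrative: the USLM namespace and element names reflect my reading of the House release, and my actual transform does far more work (identifiers, notes, tables, and metadata).

<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:uslm="http://xml.house.gov/schemas/uslm/1.0"
   xmlns="http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD05">

   <!-- Map a USLM section onto an Akoma Ntoso section, carrying over
        the number and heading and recursing into the children. -->
   <xsl:template match="uslm:section">
      <section>
         <num><xsl:value-of select="uslm:num"/></num>
         <heading><xsl:value-of select="uslm:heading"/></heading>
         <xsl:apply-templates select="uslm:subsection | uslm:content"/>
      </section>
   </xsl:template>

</xsl:stylesheet>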

I’m making 8 Titles available now. They’re smaller Titles which are easier for me to work with as I refine the transform. Actually, I do have the first 25 Titles converted into Akoma Ntoso, but I’ll need to address some performance and space issues with my tired old development server before I can release the full set. Hopefully, over the next few months, I’ll be able to complete this work.

When you look at the XML, you will notice a “proposed” namespace prefix. This simply marks proposed aspects of Akoma Ntoso that have not yet been adopted. Keep in mind that this is all development work – do not assume that the transformation I am showing is the final result.

I’m looking for feedback. Monica, Fabio, Veronique, and anyone else – if you see anything I got wrong or could model better, please let me know. If anyone finds the way I modeled something troubling, please let me know. I’m doing this work to open up a conversation. By trying Akoma Ntoso out in different usage scenarios, we can only make it better.

Don’t forget the Library of Congress’ Legislative Data Challenge. Perhaps my transformation of the U.S. Code can inspire someone to participate in the challenge.

The U.S. Code in Akoma Ntoso

Web-Based XML Legislative Editor Update

It’s been quite a while since I gave an update on our web-based XML legislative editor – LegisProweb. But that doesn’t mean that nothing has been going on. Quite the contrary, this has been a busy year for the editor project.

Let me first recap what the editor is. It’s an XML editor, written entirely around HTML5 technologies. It was first developed last year as the centerpiece of a Hackathon that Ari Hershowitz and I staged in San Francisco and around the world. While it is designed as a general-purpose XML editor and can be configured to model any XML schema, it’s primarily configured to support Akoma Ntoso.

LegisProWeb

Since then, there has been a lot of continuing interest in the editor. If you attended the 2013 Legislative Data and Transparency Conference this past May in Washington DC, you may have noticed Jim Harper of the Cato Institute demonstrating their “Deepbills” project. The editor you saw is a heavily customized early version of LegisProweb, reconfigured to handle the XML format that the US Congress publishes legislation in.

And that’s not the only place where LegisProweb has been adopted. We’re in the finishing stages of a somewhat larger implementation for Chile. This is an Akoma Ntoso implementation – focused on debates and debate reports rather than on legislation. One interesting point worth noting – this implementation is in Spanish. LegisProweb is quite easily localized.

The common thread between these two implementations is the use case – they’re both focused on tagging metadata within pre-existing documents rather than on creating new documents from scratch. This was the focus of the Hackathon we staged back in 2012 – little did we know how much of a market would exist for an editor focused on annotation rather than document creation. And there’s more still to come – we’ve been quite surprised by the level of interest in this particular use case.

Of course, we’re not satisfied with an editor that can only annotate existing documents. We’ve been hard at work turning the editor into a full-featured legislative editor that works equally well at creating new documents as at annotating existing ones. In addition, we’ve made the editor very customizable and have added capabilities to manage the comments and discussions that might revolve around a document as it is being created and annotated.

Most recently, the editor has been upgraded to the latest version of Akoma Ntoso coming out of the OASIS legal document ML technical committee, where I am an active member. Along with that effort, the validator has been separated out to run as a standalone Akoma Ntoso validator. I talked about that in my blog last week. I’m busy using the validator as I work frantically to complete an Akoma Ntoso project this week. I’ll talk some more about this project next week.

So where do we go from here? Well, the first big effort is to modularize the technologies found within the editor. We now have a diverse set of customers and they can all benefit from the various bits and pieces that make up LegisProweb. By modularizing the pieces, we’ll be able to pick and choose which parts we use when and how. Separating out the validator was the first step. We’ll also be pulling out the reference resolver, attaching it to a native XML database, and partitioning the client side to allow the editing component to be used without the full editing environment offered by LegisProweb.

One challenge that remains is handling redlining – managing insertions and deletions. This is a very difficult subject – and one I tackled in the work I did implementing the XML editor used by the California legislature. I took a very different approach in trying to solve the problem with LegisProweb, but I’m not happy with the result. So, I’ll be returning to the proven approach we used way back when we built the original LegisPro editor on XMetaL.

As you can tell, we’ve got our work for the next year cut out for us.

Web-Based XML Legislative Editor Update

Free Akoma Ntoso Validator

How are people doing with the Library of Congress’ Akoma Ntoso Challenge? Hopefully, you’re making good progress, having fun doing it, and in so doing, learning a valuable new skill with this important emerging technology.

I decided to make it easy for someone without an XML Editor to validate their Akoma Ntoso documents for free. We all know how expensive XML Editors tend to be. If you’re like me, you’ve used up all the free trials you could get. I’ve separated the validation part of our LegisProweb editor from the editing base to allow it to be used as a standalone validator. Now, all you need to do is either provide a URL to your document or, even easier, drop the text into the text area provided and then click on the “Validate” button. You don’t even need to go find a copy of the Akoma Ntoso schema or figure out how to hook it up to your document – I do all that for you.

To use the validator, simply draft your Akoma Ntoso XML document, specifying the appropriate namespace using the @xmlns namespace declaration, and then paste a copy into the validator. I’ll go and find the schema and then validate your document for you. The validation results will be shown to you conveniently inline within your XML source to help you in making fixes. Don’t worry, we don’t record anything when you use the validator – it’s completely anonymous and we keep no record of your document.

You can validate either the 2.0 version of Akoma Ntoso or the latest 3.0 version, which reflects the work of the OASIS LegalDocumentML committee. Actually, there are quite a few other formats that the validator will also work with natively and, by using xsi:schemaLocation, you can point to any XML schema you wish.
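
If you just want something to paste in to see it work, a skeleton along these lines will do. It is deliberately minimal – the required metadata is omitted – so the validator will report errors, which conveniently also shows off the inline error reporting. The namespace here is the 3.0 CSD05 draft; use whichever version you are targeting.

<?xml version="1.0" encoding="UTF-8"?>
<!-- A deliberately incomplete Akoma Ntoso document: the meta block is
     missing, so expect the validator to flag it. -->
<akomaNtoso xmlns="http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD05">
   <doc name="example">
      <mainBody>
         <clause id="cls_1">
            <content>
               <p>Hello, Akoma Ntoso.</p>
            </content>
         </clause>
      </mainBody>
   </doc>
</akomaNtoso>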

Give the free Akoma Ntoso XML Validator a try. You can access it here. Please send me any feedback you might have.

Screenshots: the validator’s input form and validation results
Free Akoma Ntoso Validator

U.S. House of Representatives releases the U.S. Code in XML

This week marked a big milestone for us. The U.S. House of Representatives released the U.S. Code in XML. You can see the announcement by the Speaker of the House, John Boehner (R-Ohio), here. This is a big step forward towards a more transparent Congress. As many of you know, my company, Xcential, has worked closely with the Law Revision Counsel on this project. It has been an honor to provide our expertise as part of our on-going efforts with the U.S. House of Representatives.

This project has been a great opportunity for us to update the U.S. House of Representatives’ technology platform by introducing new XML schema techniques along with robust, high-performance conversion tools. Our eleven years in this field, working on an international scale, have given us valuable insights into XML techniques which we were able to bring to bear to ensure the success of this project.

The feedback has been very good.

As you might expect, members of the technical community have swiftly picked up on this release and are actively finding ways to use the data it provides. Josh Tauberer of GovTrack.us has already started – check out his work here. Why did I already know he would be the first to jump in? :)

Of course, if you know me, you’ll know that I also have something up my sleeve. I’ll be spending my weekends and evenings for the next few weeks preparing an Akoma Ntoso transform for release coincident with an upcoming OASIS LegalDocML announcement. Keep watching my blog for more info.

This is one of numerous projects we are working on right now. We have a very similar project underway in Asia and an Akoma Ntoso project nearing completion in South America using our HTML5-based editor, LegisProWeb. I’ll be providing an update on LegisProweb in the coming weeks.

U.S. House of Representatives releases the U.S. Code in XML

Akoma Ntoso Challenge by the Library of Congress

As many of you may have already read, the U.S. Library of Congress has announced a data challenge using Akoma Ntoso. The challenge lasts for three months and offers a $5,000 prize to the winner.

In this challenge, participants are asked to mark up four Congressional bills, provided as raw text, into Akoma Ntoso.

If you have the time to participate in this challenge and can fulfill all the eligibility rules, then I encourage you to step up to the challenge. This is a good opportunity to give Akoma Ntoso a try – to both learn the new model and to help us to identify any changes or adaptations that must be made to make Akoma Ntoso suitable for use with Congressional legislation.

You are asked, as part of your submission, to identify gaps in Akoma Ntoso’s design, along with documenting the methodology you used to construct your solution for the four bills. You’re also encouraged to use any of the open-source editors currently available for editing Akoma Ntoso and to provide feedback on their suitability to the task.

I would like to point out that I also provide an Akoma Ntoso editor at http://legisproweb.com. It is free to use on the web along with full access to all the information you need to customize the editor. However, while our customers do get an unrestricted internal license to the source code, our product is not open source. At the end of the day, I must still make a living. Nonetheless, I believe that you can use any editor you wish to create your four Akoma Ntoso documents – it’s just that the sponsors of the competition aren’t looking for feedback on commercial tools. If you do choose to use my editor, I’ll be there to provide any support you might need in terms of features and bug fixes to help speed you on your way.

Akoma Ntoso Challenge by the Library of Congress

Transparent legislation should be easy to read

Legislation is difficult to read and understand. So difficult that it largely goes unread. This is something I learned when I first started building bill drafting systems over a decade ago. It was quite a letdown. The people you would expect to read legislation don’t actually do that. Instead, they must rely on analyses, sometimes biased, performed by others, which omit many of the nuances found within the legislation itself.

Much of the problem is how legislation is written. Legislation is often written so as to concisely describe a set of changes to be made to existing law. The result is a document that is written to be executed by a law compilation team deep within the government rather than understood by lawmakers or the general public. This article, by Robert Potts, rather nicely sums up the problem.

Note: There is a technical error in the article by Robert Potts. The author states “These statutes are law, but since Congress has not written them directly to the Code, they are added to the Code as ‘notes,’ which are not law. So even when there is a positive law Title, because Congress has screwed it up, amendments must still be written to individual statutes.” This is not accurate. Statutory notes are law. This is explained in Part IV (E) of the DETAILED GUIDE TO THE CODE CONTENT AND FEATURES.

So how can legislation be made more readable and hence more transparent? The change must come in how amendments are written – with an intent to communicate the changes rather than just to describe them. Let’s start by looking at a few different ways that amendments can be written:

1) Cut-and-Bite Amendments

Many jurisdictions around the world use the cut-and-bite approach to amending, also known as amendments by reference. This includes Congress here in the U.S., and it is also common in most of the other jurisdictions I work with. Let’s take a look at a hypothetical cut-and-bite amendment:

SECTION 1. Section 1234 of the Labor Code is amended by repealing “$7.50” and substituting “$8.50”.

There is no context to this amendment. In order to understand it, someone is going to have to look up Section 1234 of the Labor Code and manually apply the change to see what it is all about. While this contrived example is simple, it already involves a fair amount of work. When you extrapolate this problem to a real bill and the sometimes convoluted state of the law, the effort to understand a piece of legislation quickly becomes mind-boggling. For a real bill, few people are going to have either the time or the resources to adequately research all the amendments to truly understand how they will affect the law.
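
To see what that “execution” actually entails, here is a minimal sketch in Python. The law_store table and its citation key are hypothetical stand-ins for a real compilation system:

# A minimal sketch of mechanically "executing" a cut-and-bite amendment.
# The law_store table and its citation key are hypothetical stand-ins
# for a real compilation system.
law_store = {
    "labor-code/sec1234": (
        "Notwithstanding any other provision of this part, the minimum "
        "wage for all industries shall be not less than $7.50 per hour."
    )
}

def apply_cut_and_bite(citation, old_text, new_text):
    """Repeal old_text and substitute new_text in the cited provision."""
    section = law_store[citation]
    if old_text not in section:
        raise ValueError("amendment does not apply cleanly")
    law_store[citation] = section.replace(old_text, new_text)

# SECTION 1. Section 1234 of the Labor Code is amended by repealing
# "$7.50" and substituting "$8.50".
apply_cut_and_bite("labor-code/sec1234", "$7.50", "$8.50")
print(law_store["labor-code/sec1234"])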

2) Amendments Set Out in Full

I’ve come to appreciate the way the California Legislature handles this problem. The cut-and-bite style of amending, as described above, is simply disallowed. Instead, all amendments must be set out in full – by re-enacting the section in full as amended. This is mandated by Article 4, section 9 of the California Constitution. What this means is that the amendment above must instead be written as:

Section 1. Section 1234 of the Labor Code is amended to read:

1234. Notwithstanding any other provision of this part, the minimum wage for all industries shall be not less than $8.50 per hour.

This is somewhat better. Now we can see that we’re affecting the minimum wage – we have the context. The wording of the section, as amended, is set out in full. It’s clear and much more transparent.

However, it’s still not perfect. While we can see how the amended law will read when enacted, we don’t actually know what changed. Admittedly, in California, if you paid attention to the bill redlining through its various stages, you could track the changes through the various versions to arrive at the net effect of the amendment. (See the note on redlining below.) Unfortunately, the redlining rules are a bit convoluted and not nearly as apparent as they might seem – they’re misleading to the uninitiated. What’s more, the resulting statute at the end of the process has no redlining, so the effect of the change is totally hidden in the enacted result.

Setting out amendments in full has been adopted by many states in addition to California. It is both more transparent and greatly eases the codification process. The codification process becomes simple because the new sections, set out in full, are essentially prefabricated blocks awaiting insertion into the law at enactment time. Any problems which may result from conflicting amendments are, by necessity, resolved earlier rather than later. (although this does bring along its own challenges)

3) Amendments in Context

There is an even better approach – adopted to varying degrees by a few legislatures. It builds on the approach of setting out sections in full, but adds a visible indication of what has changed using strike and insert notation. I’ll refer to this as Amendments in Context.

This problem is partially addressed, at the federal level, by the Ramseyer Rule, which requires that a separate document be published which essentially shows all amendments in context. The problem is that this second document isn’t generally available – and it’s yet another separate document.

Why not just write the legislation showing the amendments in context to begin with? I can think of no reason other than tradition why the law, as proposed and enacted, shouldn’t show all amendments in context. Let’s take a look at this approach:

Section 1. Section 1234 of the Labor Code is amended to read:

1234. Notwithstanding any other provision of this part, the minimum wage for all industries shall be not less than $7.50 $8.50 per hour. (Here “$7.50” would appear in strike notation and “$8.50” as an insertion.)

Isn’t this much clearer? At a glance we can see that the minimum wage is being raised a dollar. It’s obvious – and much more transparent.

At Xcential, we address this problem in California by providing an amendments in context view for all state legislation within our LegisWeb bill tracking service. We call this feature As Amends the Law™ and it is computed on-the-fly.
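
Conceptually, an amendments in context view is just a word-level comparison of the section before and after amendment, rendered with strike and insert notation. Here is a minimal sketch using Python’s difflib. This is an illustration only, not how LegisWeb actually computes the view:

import difflib

def amendments_in_context(old, new):
    """Render a word-level diff using strike (<del>) and insert (<ins>) markup."""
    old_words, new_words = old.split(), new.split()
    out = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(
            None, old_words, new_words).get_opcodes():
        if tag in ("replace", "delete"):
            out.append("<del>" + " ".join(old_words[i1:i2]) + "</del>")
        if tag in ("replace", "insert"):
            out.append("<ins>" + " ".join(new_words[j1:j2]) + "</ins>")
        if tag == "equal":
            out.append(" ".join(old_words[i1:i2]))
    return " ".join(out)

before = "the minimum wage for all industries shall be not less than $7.50 per hour."
after = "the minimum wage for all industries shall be not less than $8.50 per hour."
print(amendments_in_context(before, after))
# ... shall be not less than <del>$7.50</del> <ins>$8.50</ins> per hour.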

Governments are spending a lot of time, energy, and money on legislative transparency. The progress we see today is in making the data more accessible to computer analysis. Amendments in context would make the legislation not only more accessible to computer analysis – but also more readable and understandable to people.

Redlining Note: If redlining is a new term to you, it is similar to, but subtly different from, track changes in a word processor.

Transparent legislation should be easy to read

Legislative Data: The Book

Last week, as I was boarding the train at Admiralty station in Hong Kong to head back to the office, I learned that I am writing a book. +Ari made the announcement on his blog. It seems that Ari has found the key to getting me to commit to something – put me in a situation where not doing it is no longer an option. Oh well…

Nonetheless, there are many good reasons why now is a good time to write a book. In the past year we have experienced a marked increase in interest in the subject of legislative data. I think that a number of factors are driving this. First, there is renewed interest in driving towards a worldwide standard – especially the work being done by the OASIS LegalDocumentML technical committee. Second, the push for greater transparency, especially in the USA, is driving governments to investigate opening up their databases to the outside world. Third, many first-generation XML systems are now coming due for replacement or modernization.

I find myself in the somewhat fortuitous position of being able to view these developments from an excellent vantage point. From my base in San Diego, I get to work with and travel to legislatures around the world on a regular basis. This allows me to see the different ways people are solving the challenges of implementing modern legislative information management systems. What I also see is how many jurisdictions struggle to set aside obsolete paper-based models for how legislative data should be managed. In too many cases, the physical limitations of paper are used to define the criteria for how digital systems should work. Not only do these limitations hinder the implementation of modern designs, they also create barriers that will prevent fulfilling the expectations that come as people adapt to receiving their information online rather than on paper.

The purpose of our book will be to propose a vision for the future of legislative data. We will share some of our experiences around the world – focusing on the successes some legislatures have had as they’ve broken legacy models for how things must work. In some cases the changes simply involve better separating the physical limitations of the published form from the content and structure. In other cases, we’ll explain how different procedures and conventions can not only facilitate the legislative process, but also make it more open and transparent.

We hope that by producing a book on the subject, we can help clear the path for the development of a true industry to serve this somewhat laggard field. This will create the conditions that will allow a standard, such as Akoma Ntoso, to thrive, which, in turn, will allow interchangeable products to be built to serve legislatures around the world. Achieving this goal will reduce the costs and risks of implementing legislative information management systems and will allow the IT departments of legislatures to meet both the internal and external requirements being placed upon them.

Ari extended an open invitation to everyone to propose suggestions for topics for us to cover. We’ve already received a lot of good interest. Please keep your ideas coming.

Legislative Data: The Book

2013 Legislative Data and Transparency Conference

Last week I participated in the 2013 Legislative Data and Transparency Conference put on by the U.S. House of Representatives in Washington D.C.

It was a one-day event that featured numerous speakers both within the U.S. government and in the surrounding transparency community around D.C. My role, at the end of the day, was to speak as a panelist along with Josh Tauberer of GovTrack.us and Anne Washington of The George Washington University on Under-Digitized Legislative Data. It was a fun experience for me and allowed me to have a friendly debate with Josh on APIs versus bulk downloads of XML data. In the end, while we both fundamentally agree, he favors bulk downloads while I favor APIs. It’s a simple matter of how we each use the data.

The morning sessions were all about the government reporting the progress they have made over the past year relating to their transparency initiatives. There has been substantial progress this year and this was evident in the various talks. Particularly exciting was the progress that the Library of Congress is making in developing the new congress.gov website. Eventually this website will expand to replace THOMAS entirely.

The afternoon sessions were kicked off by Gherardo Casini of the UN-DESA Global Centre for ICT in Parliament in Rome, Italy. He gave an overview of the progress, or lack thereof, of XML in various parliaments and legislatures around the world. He also gave a brief mention of the progress in the LegalDocumentML Technical Committee at OASIS which is working towards the standardization of Akoma Ntoso. I am a member of that technical committee.

The next panel was a good discussion on extending XML. The panelists included Eric Mill of the Sunlight Foundation, who, among other things, talked about the HTML transformation work he has been exploring in recent weeks. I mentioned his efforts in my blog last week. Following him was Jim Harper of the Cato Institute, who talked about the Cato Institute’s Deepbills project. Finally, Daniel Bennett gave a talk on HTML and microdata. His interest in this subject was also mentioned in my blog last week.

One particularly fun aspect of the conference was walking in and noticing the Cato Institute’s Deepbills editor running on a table at the entrance. The reason it was fun for me is that their editor is actually a customization of an early version of the HTML5-based LegisPro Web editor which I have spent much of the past year developing. We have developed this editor to be an open and customizable platform for legislative editing. The Cato project is one of four different implementations which now exist – two are Akoma Ntoso based and two are not. More news will come on this development in the not-too-distant future. I had not expected the Cato Institute to be demonstrating anything and it was quite a nice surprise to see software I had written up on display.

If there was any recurring theme throughout the day, it was the call for better linked data. While there has been significant progress over the past year towards getting the data out there, now it is time to start linking it all together. Luckily for me, this was the topic I had chosen to focus on in my talk at the end of the day. It will be interesting to see the progress that is made towards this objective this time next year.

All in all, it was a very successful and productive day. I didn’t have a single moment to myself all day. There were so many interesting people to meet that I didn’t get a chance to chat with nearly as many as I would have liked to.

For an amusing yet still informative take on the conference, check out Ari Hershowitz’s Tabulaw blog. He reveals a little bit more about some of the many projects we have been up to over the past year.

https://cha.house.gov/2013-legislative-data-and-transparency-conference

2013 Legislative Data and Transparency Conference

XML, HTML, JSON – Choosing the Right Format for Legislative Text

I find I’m often talking about an information model and XML as if they’re the same thing. However, there is no reason to tie these two things together as one. Instead, we should look at the information model in terms of the information it represents and let the manner in which we express that information be a separate concern. In the last few weeks I have found myself discussing alternative forms of representing legislative information with three people – chatting with Eric Mill at the Sunlight Foundation about HTML microformats (look for a blog from him on this topic soon), Daniel Bennett regarding microdata, and Ari Hershowitz regarding JSON.

I thought I would try and open up a discussion on this topic by shedding some light on it. If we can strip away the discussion of the information model and instead focus on the representation, perhaps we can agree on which formats are better for which applications. Is a format a good storage format, a good transport format, a good analysis/programming format, or a good all-around format?

1) XML:

I’ll start with a simple example of a bill section using Akoma Ntoso:

<section xmlns="http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03"
       id="{GUID}" evolvingId="s1">
    <num>§1.</num>
    <heading>Commencement</heading>
    <content> <p>This act will go into effect on
       <date name="effectiveDate" date="2013-01-01">January 1, 2013</date>.
    </p> </content>
</section>

Of course, I am partial to XML. It’s a good all-around format. It’s clear, concise, and well supported. It works well as a storage format, a transport format, and a format for analysis and other uses. But it does bring with it a lot of complexity that is quite unnecessary for many uses.

2) HTML as Plain Text

For developers looking to parse out legislative text, plain text embedded in HTML using a <pre> element has long been the most useful format.

   <pre>
   §1. Commencement
   This act will go into effect on January 1, 2013.
   </pre>

It is a simple and flexible representation. Even when a more highly decorated HTML representation is provided, I have invariably removed the decorations to leave behind this format.

However, in recent years, as governments open up their internal XML formats as part of their transparency initiatives, it’s becoming less necessary to write your own parsers. Still, raw text is a very useful base format.
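
To illustrate just how approachable this format is, here is a minimal parsing sketch in Python. Real bill text is far messier than this, and the section-sign pattern is an assumption:

import re

raw = """§1. Commencement
This act will go into effect on January 1, 2013."""

sections = []
for line in raw.splitlines():
    # A line starting with a section sign begins a new section.
    m = re.match(r"§(\d+)\.\s*(.*)", line)
    if m:
        sections.append({"num": m.group(1), "heading": m.group(2), "text": ""})
    elif sections:
        sections[-1]["text"] += line.strip() + " "

print(sections)
# [{'num': '1', 'heading': 'Commencement',
#   'text': 'This act will go into effect on January 1, 2013. '}]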

3) HTML/HTML5 using microformats:

<div class="section" id="{GUID}" data-evolvingId="s1">
   <div>
      <span class="num">§1.</span>
      <span class="heading">Commencement</span>
   </div>
   <div class="content"><p>This act will go into effect on
   <time name="effectiveDate" datetime="2013-01-01">January 1, 2013</time>.
   </p></div>
</div>

As you can see, using HTML with microformats is a simple way of mapping XML into HTML. Currently, many legislative data sources that offer HTML content either offer bill text as plain text as I showed in the previous example or they decorate it in a way that masks much of the semantic meaning. This is largely because web developers are building the output to an appearance specification rather than to an information specification. The result is class names that better describe the appearance of the text than the underlying semantics. Using microformats preserves much of the semantic meaning through the use of the class attribute and other key attributes.

I personally think that using HTML with microformats is a good way to transport legislative data to consumers that don’t need the full capabilities of the XML representation and are more interested in presenting the data rather than analyzing or processing it. A simple transform could be used to take the stored XML and to then translate it into this form for delivery to a requestor seeking an easy-to-consume solution.
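
As an illustration, here is a minimal sketch of such a transform using Python’s standard library. In practice an XSLT stylesheet would be the more conventional choice, and this sketch handles only the fragment shown earlier:

import xml.etree.ElementTree as ET

AKN_NS = {"akn": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03"}

akn_xml = """<section xmlns="http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03"
    id="{GUID}" evolvingId="s1">
  <num>§1.</num>
  <heading>Commencement</heading>
  <content><p>This act will go into effect on
    <date name="effectiveDate" date="2013-01-01">January 1, 2013</date>.
  </p></content>
</section>"""

def section_to_microformat(xml_text):
    """Map an Akoma Ntoso <section> to the microformat HTML shown above."""
    sec = ET.fromstring(xml_text)
    num = sec.find("akn:num", AKN_NS).text
    heading = sec.find("akn:heading", AKN_NS).text.strip()
    # Flatten the content for brevity; a real transform would walk the
    # tree and map inline elements such as <date> to HTML <time>.
    content = " ".join("".join(sec.find("akn:content", AKN_NS).itertext()).split())
    return (
        '<div class="section" id="{}" data-evolvingId="{}">\n'
        '   <div><span class="num">{}</span> <span class="heading">{}</span></div>\n'
        '   <div class="content"><p>{}</p></div>\n'
        '</div>'
    ).format(sec.get("id"), sec.get("evolvingId"), num, heading, content)

print(section_to_microformat(akn_xml))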

[Note: HTML5 now offers a <section> element as well as an <article> element. However, they’re not a perfect match to the legislative semantics of a section and an article so I prefer not to use them.]

4) HTML5 Microdata:

<div itemscope
      itemtype="http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#section"
      itemid="urn:xcential:guid:{GUID}">
   <data itemprop="evolvingId" value="s1"></data>
   <div>
      <span itemprop="num">§1.</span>
      <span itemprop="heading">Commencement</span>
   </div>
   <div itemprop="content"> <p>This act will go into effect on
      <time itemprop="effectiveDate" datetime="2013-01-01">January 1, 2013</time>.
   </p> </div>
</div>

Using microdata, we see more formalization of the annotation convention than microformats offer – which brings along additional complexity and requires some sort of naming authority, which I can’t say I really understand or see coming about. But it’s a more formalized approach and is part of the HTML5 umbrella. I doubt that microdata is a good way to represent a full document. Rather, I see microdata better fitting the role of annotating specific parts of a document with metadata. Much like microformats, microdata is a good solution as a transport format for a consumer not interested in dealing with the full XML representation. The result is a format that is rich in semantic information and is also easily rendered to the user. However, it strikes me that the effort to more robustly handle naming only reinvents one of XML’s more confusing aspects, namely namespaces, in a different way.

5) JSON

{
   "type": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#section",
   "id": "{GUID}",
   "evolvingId": "s1",
   "num": {
      "type": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#num",
      "text": "§1."
   },
   "heading": {
      "type": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#heading",
      "text": "Commencement"
   },
   "content": {
      "type": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#content",
      "text1": "This act will go into effect on ",
      "date": {
         "type": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#date",
         "date": "2013-01-01",
         "text": "January 1, 2013"
      },
      "text2": "."
   }
}

Quite obviously, JSON is great if you’re looking to easily load the information into your programmatic data structures and aren’t looking to present the information as-is to the user. This is primarily a programmatic format. Representing the full document in JSON might be overkill. Perhaps JSON is better suited to key parts of extracted metadata than to the full document.
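
For instance, a consumer can get from JSON to useful values in just a few lines. This uses a trimmed-down version of the structure above, which is itself my own sketch rather than any standard:

import json

json_text = """{
  "evolvingId": "s1",
  "num": {"text": "§1."},
  "heading": {"text": "Commencement"},
  "content": {"date": {"date": "2013-01-01", "text": "January 1, 2013"}}
}"""

section = json.loads(json_text)
print(section["num"]["text"])               # §1.
print(section["content"]["date"]["date"])   # 2013-01-01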

There are still other formats I could have brought up, like RDFa, but I think my point has been made. There are many different ways of representing the same legislative model – each with its own strengths and weaknesses. Different consumers have different needs. While XML is a good all-around format, it also brings with it a degree of sophistication and complexity that many information consumers simply don’t need to tackle. It should be possible, as a consumer, to specify the form of the information that most closely fits your needs and have the legislative data source deliver it in that format.

[Note: In Akoma Ntoso, the format is called the “manifestation” and is specified as part of the referencing specification.]

What do you think?

XML, HTML, JSON – Choosing the Right Format for Legislative Text

Legal Reference Resolvers

After my last blog post I received a lot of feedback. Thanks to everyone who contacted me with questions and comments. After all the interest in the subject, I think I will devote a few more blog posts to the subject of legal references. It is quite possibly the most important subject that needs to be tackled anyway. (And yes, Harlan, I will try and blog more often.)

Many of the questions I received asked how I envision the resolver working. I thought I would dive into this aspect some more by defining the role of the resolver:

The role of a reference resolver is to receive a reference to a document or a fragment thereof and to do whatever it takes to resolve it, returning the requested data to the requestor.

That definition defines the role of a resolver in pretty broad terms. Let’s break the role down into some discrete functions:

  1. Simple Redirection – Perhaps the most basic service to provide will be that of a reference redirector. This service will convert a standardized virtual reference into a non-standard URL that is understood by a proprietary repository available elsewhere on the web that can supply the data for the request. The redirection service allows a legacy repository to provide access to documents following its own proprietary referencing mechanism without having to adopt the standard referencing nomenclature. In this case, the reference redirector will serve as a front to the legacy repository, mapping standard references into non-standard ones. (A minimal sketch of this function and the next follows the list.)

  2. Reference Canonicalization – There are often a number of different ways in which a reference to a legal document can be composed. This is partly because the manner in which legal documents are typically structured sometimes encourages both a flat and a hierarchical view of the same data. For instance, one tends to think of sections in a flat model because sections are usually numbered sequentially. Often, however, those sections are arranged in a hierarchical structure which allows an alternate hierarchical model to also be valid. Another reason for alternate references is the simple fact that there are all sorts of different ways of abbreviating the same thing – and it is impossible to get everyone around the world to standardize on abbreviations. So “section1”, “sec1”, “s1”, and the even more exotic “§1” need to be treated synonymously. Also, let’s not forget about time. The requestor might be interested in the law as it existed on a particular date. The resulting reference starts being more of a document query than a document identifier. For instance, imagine a version of a section that became operational January 1, 2013. A request for the section in operation on February 1, 2013 will return that January 1 version if it was still in operation on February 1, even though the version’s operational date is not February 1. (Akoma Ntoso calls the query case a virtual expression and differentiates it from the case where the date is part of the identifier.)

    The canonicalization service will take any reference, perhaps vague or malformed, and will return one or more standardized references that precisely represent the documents that could be identified by the original reference – possibly along with a measure of confidence. I would imagine that official data services, providing authoritative legal documents, will most likely provide the canonicalization service.

  3. Repository Service – A legal library might provide both access to a document repository and an accompanying resolution service through which to access the repository. When this is the case, the resolver acts as an HTTP interface to the library, converting a virtual URL to an address of sorts in the document repository. This could simply involve converting the URL to a file path or it could involve something more exotic, requiring document extraction from a database or something similar.

    There are two separate use cases I can think of for the repository. The basic case is the repository as a read-only library. In this case, references are simply resolved, returning documents or fragments as requested. The second case is somewhat more complex and will exist within organizations tasked with developing legal resources – such as the organizations that draft legislation within the government. In this case, a more sophisticated read/write mechanism will require the resolver to work with technologies such as WebDAV that front the database. This is a more advanced version of the solution we developed for internal use by the State of California.

  4. Resolver Routing – The most complex, and perhaps most difficult to achieve aspect, will be resolver routing. There is never going to exist a single resolver that can resolve every single legal reference in the world. There are simply too many jurisdictions to cover – in every country, state/province, county/parish, city/town, and every other body that produces legal documents. What if, instead, there was a way for resolvers to work together to return the document requested? While a resolver might handle some subset of all the references it receives on its own, for the cases it doesn’t know about, it might have some means to negotiate or pass on the request to other resolvers it knows about in order to return the requested data.
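
Here is the promised minimal sketch of the first two functions in Python. The mapping table, the abbreviation patterns, and the legacy URL are all hypothetical:

import re

# 1. Simple redirection: map a standard reference onto a legacy,
#    proprietary URL (the mapping and the legacy URL are made up).
LEGACY_MAP = {
    "/us-ca/codes/gov/sec500":
        "http://legacy.example.gov/cgi-bin/fetch?code=GOV&sec=500",
}

def redirect(reference):
    return LEGACY_MAP.get(reference)

# 2. Canonicalization: reduce the many ways of citing a section
#    ("section 500", "sec. 500", "s500", "§500") to one standard form.
SECTION_FORMS = re.compile(r"(?:section|sec\.?|s|§)\s*(\d+)", re.IGNORECASE)

def canonicalize(reference):
    return SECTION_FORMS.sub(lambda m: "sec" + m.group(1), reference)

print(canonicalize("/us-ca/codes/gov/§500"))        # /us-ca/codes/gov/sec500
print(canonicalize("/us-ca/codes/gov/Section 500")) # /us-ca/codes/gov/sec500
print(redirect(canonicalize("/us-ca/codes/gov/§500")))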

Not all resolvers will necessarily provide all the functions listed. How resolvers are discovered, how they reveal the functions they support, and how resolvers are tied together are all topics which will take efforts far larger than my simple blog to work out. But just imagine how many problems could be resolved if we could implement a resolving protocol that would allow legal references around the world to be resolved in a uniform way.

In my next blog, I’m going to return to the reference itself and take a look at the various different referencing mechanisms and services I have discovered in recent weeks. Some of the services implement some of the functions I have described above. I also want to discuss the difference between an absolute reference (including the domain name) and a relative reference (omitting the domain name) and why it is important that references stored in the document be relative.

Legal Reference Resolvers

Automating Legal References in Legislation

This is a blog I have wanted to write for quite some time. It addresses what I believe to be the single most important issue when modeling information for legal informatics. It is also, I believe, the most urgent aspect that we need to agree upon in order to promote legal informatics as a real emerging industry. Today, most jurisdictions are simply cobbling together short term solutions without much consideration to the big picture. With something this important, we need to look at the big picture first and come up with a lasting solution.

Citations, references, or links are a very important aspect of the law. Laws are inherently a web of interconnections and interdependencies. Correctly resolving those connections allows us to correctly interpret the law. Mistakes or ambiguities in how those connections are made are completely unacceptable.

Between my projects around the world and my work on the OASIS LegalDocumentML technical committee, I travel to the four corners of the Earth, and I am starting to see more clearly how this problem can be solved in a clean and extensible manner.

There are, of course, already many proposals to address this. The two I have looked at the most are both from Italy:
A Uniform Resource Name (URN) Namespace for Sources of Law (LEX)
Akoma Ntoso References (in the process of being standardized by OASIS)

My thoughts derive from these two approaches, both of which I have implemented in one way or another with varying degrees of success. My earliest ideas were quite similar to the LEX-URN proposal in being based around URNs. However, with time, Fabio Vitali at the University of Bologna has convinced me that the approach he and Monica Palmirani put forth with Akoma Ntoso, using URLs, is more practical. While URNs have their appeal, they have never really achieved the critical mass of adoption needed to be practical. Also, the general reaction I have gotten to LEX-URN encoded references has not been positive. There is just too much special encoding going on within them for them to be readable by the uninitiated.

Requirements

Before diving into this subject too deep, let’s define some basic requirements. In order to be effective, a reference must:
• Be unambiguous.
• Be predictable.
• Be adaptable to all jurisdictions, legal systems, and all the quirks that arise.
• Be universal in application and reach.
• Be implementable with current tools and technologies.
• Be long lasting and not tied to any specific implementation.
• Be understandable to mere mortals like myself.

URI/IRI

URIs (Uniform Resource Identifiers) give us a way to identify resources in a computing system. We’re all familiar with URLs that allow us to retrieve pages across the web using hierarchical locations. Less well known are URNs which allow us to identify resources using a structured name which presumably will then be located using some form of a service to map the name to a location. The problem is, a well-established locating service has never come about. As a result, URNs have languished as an idea more than a tool. Both URLs and URNs are forms of URIs.

IRIs are a generalization of URIs that allows characters outside of the ASCII character set supported by normal URIs. This is important in jurisdictions that use more complex characters than ASCII supports.

Given the current state of the art in software technology, basing references on URIs/IRIs makes a lot of sense. Using the URL/IRL variant is the safer and more universally accepted approach.

FRBR

FRBR is the Functional Requirements for Bibliographic Records. It is a conceptual entity-relationship model developed by librarians for modeling bibliographic information in databases. In recent years it has received a fair amount of attention for use as the basis for legal references. In fact, both the LEX-URN and the Akoma Ntoso models are based, somewhat loosely, on it. At times, there is some controversy as to whether this model is appropriate or not. My intent is not to debate the merits of FRBR. Instead, I simply want to acknowledge that it provides a good overall model for thinking about how a legal reference should be constructed. In FRBR, there are four main entities:
1. Work – The work is the “what”, allowing us to specify what it is that we are referring to, independent of which version or format we are interested in.
2. Expression – The expression answers the “from when” question, allowing us to specify, in some manner, which version, variant, or time frame we are interested in.
3. Manifestation – The manifestation is the “which format” part, where we specify the format that we would like the information returned as.
4. Item – The item finally allows us to specify the “from where” part, when multiple sources of the information are available, that we want the information to come from.

That’s all I want to mention about FRBR. I want to pick up the four concepts and work from them.

What do we want?

Picking up the Akoma Ntoso model for specifying a reference as a URL, and mindful of our basic requirements, a useful model to reference a resource is as a hierarchical URL, starting by specifying the jurisdiction and then working hierarchically down to the item in question.

This brings me to the biggest hurdle I have come across when working with the existing proposals. It’s not terribly clear what a reference should look like when the item being referenced is a sub-part of a resource being modeled as an XML document. For instance, how would I refer to section 500 of the California Government Code? Without putting in too much thought, the answer might be something like /us-ca/codes/gov.xml#sec500, using a URL to identify the Government Code followed by a fragment identifier specifying section 500. The LEX URN proposal actually suggests using the # fragment identifier, referring to the fragment as a partition. There are two problems with this solution though. First, any browser will interpret such a reference as two parts – the part before the # identifying the resource to be retrieved from the server, and the part after it as an “id” of the item to scroll to. Retrieving the huge Government Code when all we want is the one sentence in Section 500 is a terrible solution. The second problem is that it fixes, possibly for all time, how a large document happens to have been constructed out of sub-documents. For example, is the US Code one very large document, does it consist of documents made out of the Titles, or, as it is quite often modeled, is every section a different document? It would be better if references did not capture any part of this implementation decision. A better approach is to allow the “what” part of a reference to be specified as a virtual URL all the way down to whatever is wanted, even when the “what” is found deep inside an XML document in a current implementation. For example, the reference would better be specified as /us-ca/codes/gov/sec500. We’re not exposing in the reference where the document boundaries currently exist.

On to the next issue, what happens when there is more than one possible way to reference the same item? For example, the sections in California’s codes, as is usually the case, are numbered sequentially with little regard to the heading hierarchy above the sections. So a reference specified as /us-ca/codes/gov/sec500 is clear, concise, and unambiguous. It follows the manner in which sections are cited in the text. But /us-ca/codes/gov/title1/div3/chap6/sec500 is simply another way to identify the exact same section. This happens in other places too. /us-ca/statutes/2012/chap5 is the same document as /us-ca/bills/2011/sb730. So two paths identify the same document. Do we allow two identities? Do we declare one as the canonical reference and the other as an alternate? It’s not clear to me.

What about ambiguity? Mistakes happen and odd situations arise. Take a look at both Chapter 14s that exist in Division 6 of Title 1 of the California Government Code. There are many reasons why this happens. Sometimes it’s just a mistake and sometimes it’s quite deliberate. We have to be able to support this. In California, we disambiguate by using “qualifying language” which we embed somehow into the reference. The qualifying language specifies the last statute to create or amend the item needing disambiguation.

From when do we want it?

A hierarchical path identifies, with some disambiguation, what it is we want. But chances are that what we want has varied over time. We need a way to specify the version we’re looking for or to ask for the version that was valid at a specific point in time. Both the LEX URN and the Akoma Ntoso proposals suggest using an “@” sign followed by some nomenclature that identifies a version or date. (The Akoma Ntoso proposal adds the “:” sign as well.)

A problem does arise with this approach though. Sometimes we find that multiple versions exist at a particular date. These versions are all in effect, but based on some conditional logic, only one might be operational at a particular time. How one deals with operational logic can be a bit tricky at times. That’s an open issue to me still.

Which Format do we want?

I find specifying the format to be relatively uncontroversial. The question is whether we specify the format using well-established extensions such as .pdf, .odt, .docx, .xml, and .html or whether we instead try to be more precise by embedding or encoding the MIME type into the reference. Personally, I think that simple extensions, while less rigorous and subject to unfortunate variations and overlaps, offer an approach far more likely to be adopted than somehow using the MIME type. Simple generally wins over rigorous but more complex solutions.

From where should it come?

This last part, the “from where should it come” part, is something that is often omitted from the discussion. However, in a world where multiple libraries offering the same resource will quite likely exist, this is really important. Let’s take a look at the primary example once more. We want section 500 of the California Government Code. The reference is encoded as /us-ca/codes/gov/sec500. Where is this information to come from? Without a domain specified, our URL is a local URL, so the presumption is that it will be locally resolved – the local system will find it, somehow. What if we don’t want to rely on a local resolution function? What if there are numerous sources of this data and we want to refer to one of them in particular? When we prepend the domain, aren’t we specifying where we want the information to come from? So if we say http://leginfo.ca.gov/us-ca/codes/gov/sec500, aren’t we now very precisely specifying the source of the information to be the official California source? Now, say the US Library of Congress decides to extend Thomas to offer state legislation. If we want to specify that copy, we would simply construct a reference as http://thomas.loc.gov/us-ca/codes/gov/sec500. It’s the same URL after the domain is specified. If we leave the URL as simply /us-ca/codes/gov/sec500, we have a general reference and we leave it to the local system to provide the resolution service for retrieving and formatting the information. We probably want to save references in a general fashion without a domain, but we certainly will need to refer to specific copies within the tools that we build.
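
In code, this is just ordinary URL resolution against a chosen base. A small sketch, using the domains from the discussion above:

from urllib.parse import urljoin

reference = "/us-ca/codes/gov/sec500"   # the general, stored form

# The local system decides which library to resolve against.
print(urljoin("http://leginfo.ca.gov/", reference))
# http://leginfo.ca.gov/us-ca/codes/gov/sec500
print(urljoin("http://thomas.loc.gov/", reference))
# http://thomas.loc.gov/us-ca/codes/gov/sec500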

Resolvers

The key to making this all work is having resolvers that can interpret standardized references and find a way to provide the correct response. It is important to realize that these URLs are all virtual URLs. They do not necessarily resolve to files that exist. It is the job of the resolving service to either construct a valid response, possibly by digging into databases and files, or to negotiate with other resolvers that might do all or part of the job of providing a response. For example, imagine that Cornell University offers a resolver at http://lii.cornell.edu. It might, behind the scenes, work with the official data source at http://leginfo.ca.gov to source California legislation. Anyone around the world could use the Cornell resolver and be unaware of the work it is doing to source information from resolvers at the official sources around the world. So the local system would be pointed to the Cornell service and, when the reference /us-ca/codes/gov/sec500 arose, the local system would defer to the LII service for resolution, which in turn would defer to California’s official resolver. In this way, the resolvers would bear the burden of knowing where all the official data sources around the world are located.
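
A minimal sketch of that delegation, where the resolver names and the prefix-based routing rule are my own assumptions:

class Resolver:
    """Resolve locally when possible; otherwise defer to upstream resolvers."""

    def __init__(self, name, local_prefixes, upstream=()):
        self.name = name
        self.local_prefixes = local_prefixes
        self.upstream = list(upstream)

    def resolve(self, reference):
        if any(reference.startswith(p) for p in self.local_prefixes):
            return "{} served {}".format(self.name, reference)
        for other in self.upstream:
            result = other.resolve(reference)
            if result is not None:
                return result
        return None

california = Resolver("leginfo.ca.gov", ["/us-ca/"])
lii = Resolver("lii.cornell.edu", [], upstream=[california])
print(lii.resolve("/us-ca/codes/gov/sec500"))
# leginfo.ca.gov served /us-ca/codes/gov/sec500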

Examples

So to end, I would like to sum up with some examples:

[Note that the links are proposals, using a modified and simplified form of the Akoma Ntoso proposal, rather than working links at this point]

/us-ca/codes/gov/sec500
– Get section 500 of the California Government Code. It’s up to the local service to decide where and how to resolve the reference.

http://leginfo.ca.gov/us-ca/codes/gov/sec500
– Get Section 500 of the California Government Code from the official source in California.

http://lii.cornell.edu/us-ca/codes/gov/sec500
– Get Section 500 of the California Government Code from Cornell’s LII and have them figure out where to get the data from

/us-ca/codes/gov/sec500@2012-01-01
– Get Section 500 of the California Government Code as it existed on January 1, 2012

/us-ca/codes/gov/sec500@2012-01-01.pdf
– Get Section 500 of the California Government Code as it existed on January 1, 2012, in a PDF format

/us-ca/codes/gov/title1/div3/chap6/sec500
– Get Section 500 of the California Government Code, but with the full hierarchy specified

My blog has gotten very long and I have only just started to scratch the surface. I haven’t addressed multilingual issues, alternate character sets, and a host of other issues at all. It should already be apparent that this is all simply a natural extension of the URLs we already use, but with sophisticated services underneath resolving to items other than simple files. Imagine for a moment how the field of legal informatics could advance if we could all agree to something this simple and comprehensive soon.

What do you think? Are there any other proposals, solutions, or prototypes out there that addresses this? How does the OASIS legal document ML work factor into this?

Automating Legal References in Legislation

Legal Informatics Glossary of Terms

I work with people from around the world on matters relating to legal informatics. One common issue we constantly face is terminology. We use many of the same terms, but the subtleties of their definitions end up causing no end of confusion. To try and address this problem, I’ve proposed a number of times that we band together to define a common vocabulary – and, when we can’t arrive at one, at least come to understand the differences that exist amongst us.

To get the ball rolling, I have started a wiki on GitHub and populated it with many of the terms I use in my various roles. Their definitions are a work-in-progress at this point. I am refining them as I find the time. However, rather than trying to build my own private vocabulary, I would like this to be a collaborative effort. To that end, I am inviting anyone with an interest in this to help build out the vocabulary by adding your own terms with definitions to the list and improving the ones I have started.

My legal informatics glossary of terms can be found in my public Legal-Informatics project at:

https://github.com/grantcv1/Legal-Informatics/wiki/Glossary

The wiki is a public project on GitHub. Right now, anyone can contribute. We’ll see how well this model works out. In order to contribute, you need to sign up for a free GitHub account and master the basics of GitHub. For the purposes of managing a vocabulary, it’s quite simple. You will need to understand the markdown format of the text file that is behind the list. The built-in editor in GitHub makes editing the markdown quite simple. If you are so inclined, you can learn more about markdown at http://daringfireball.net/projects/markdown/syntax. GitHub will take care of all the versioning issues, so feel free to edit the terminology file.

Eventually I would like to gather enough terms that common terms or clusters of terms can be identified. This will allow us to develop clearer and more understandable standards, tools, and documentation in the emerging areas of legal informatics.

Legal Informatics Glossary of Terms

“A Bill is a Bill is a Bill!”

I remember overhearing someone exclaim “A Bill is a Bill is a Bill” many years ago. What she meant is that a bill should be treated as a bill in the same way regardless of the system it flows through.

I’m going to borrow her exclamation to refer to something a bit different by rephrasing it slightly. Is a bill a Bill and an akn:bill? Lately I’ve been working in a number of different jurisdictions and, through my participation in OASIS and the LEX Summer School, with many other people from even more jurisdictions. It sometimes seems that everything bogs down when it comes to terminology. In the story of Babel, God is said to have divided the world into different languages to thwart the people’s plans to build the tower of Babel. Our problem isn’t that we’re all speaking different languages. Rather, it’s that we all think we’re speaking the same language but our definitions are all different.

The way the human brain works means that we learn incrementally. I remember someone once telling me that you can only learn that which you almost knew before. When we try to learn a new thing, we cling to the parts we think we understand and our brain filters out the things we don’t. Ever learn a new word and then realize that it’s being used all around you every day – but you had seemingly never heard it before? So what happens when we recognize terms that are familiar, but fail to notice the details that would tell us that the usage isn’t quite what we expect? We misinterpret what we learn and then we make mistakes. Usually, through a slow process of trial and error, we learn from our mistakes and eventually we master the new subject. This takes time, limiting our effectiveness until our competency rises to an adequate level.

Let’s get back to the notion of a bill. Unfortunately in legislation, there is often a precise definition and an imprecise definition for a term and we don’t always know which is being used. What’s worse, if we’re not indoctrinated in the terminology, we might never realize that there is ambiguity in what we are hearing. For instance, I know of three different definitions for the word “bill”:

  1. The first usage is a very precise definition and it describes a document, introduced in a legislative body, that proposes to modify existing law. I will call this document a Bill (with a capital B). In California, all legislative documents which modify law are Bills. Subdivision (b) of Section 8 of Article 4 of the California Constitution defines it this way. At the federal level, the same definition of a bill exists, except that another document, the Joint Resolution, has similar powers to enact law, but this document is not a Bill.
  2. The second definition is the much looser definition that applies the word to any official documents acted upon by a legislature. I will use the term bill (with a lower-case b) when referring to this usage. In California, the precise term that is synonymous with bill is Measure. At the federal level, the precise term used is either Measure or Resolution. Of course, this opens up more confusion. In California, Measures are either Bills (and will affect law when enacted) or are Resolutions which express a statement from the legislature or house without directly affecting the law. So now the Federal Resolution is a synonym for a Measure while the California Resolution is a subclass of it. The Federal equivalent of a California Resolution is a Non-binding Resolution.
  3. The third definition is the Akoma Ntoso definition of a bill, which I will refer to as the akn:bill. At first glance, it appears to equate with the precise definition of a Bill. It is defined as a document that affects the law upon enactment. But this definition breaks down. The akn:bill applies more broadly than the precise definition of a Bill but not as broadly as the imprecise definition of a bill. So an akn:bill covers the federal Joint Resolution, Constitutional Amendments, and California Initiatives, along with the precise notion of a Bill.

I can summarize all this by saying that all Bills are akn:bills, and all akn:bills are bills, but not all bills are akn:bills, and not all akn:bills are Bills.

As if this isn’t confusing enough, other terms are even more overloaded. As I already alluded to, the term resolution is quite overloaded and the term amendment is even worse. Even the distinction between a bill and an act is unclear. Apparently a Bill, when it has passed one house, technically becomes an Act even before it passes the opposite house, but the imprecise term bill generally continues to be used.

To try and untangle all this confusion and allow us to communicate with one another more effectively, I have started a spreadsheet to collect all the terms and definitions I come across during my journey through this subject. My goal is to try and find the root concepts that are hidden underneath the vague and/or overloaded terminology we use and hopefully find some neutral terms which can be used to disambiguate communication between people coming from different legislative traditions. The beginning of my effort can be found at:

https://docs.google.com/spreadsheet/ccc?key=0ApeIHP2TOckZdHRPZ1pIeVRUcVpJZzZQT1BCSHFqd0E

Please feel free to contribute. Send me any deviations and additions you may have. And if you note an error in my understanding, please let me know. I am still learning this myself.

“A Bill is a Bill is a Bill!”

What is Transparency?

I’ve been thinking a lot about transparency lately. The disappearance of Malaysia Airlines Flight 370 (MH370) provided an interesting case to look at – and some important lessons. Releasing data which requires great expertise to decipher isn’t transparency.

My boss, when I worked on process research at the Boeing Company many years ago, used to drill into me the difference between information and data. To him, data was raw – and meaningless unless you knew how to interpret it. Information, on the other hand, had the meaning applied so you could understand it – information, to him, was meaningful.

Let’s recall some of the details of the MH370 incident. The plane disappeared without a trace – for reasons that remain a mystery. The only useful information, after radar contact was lost, was a series of pings received by Inmarsat’s satellite. Using some very clever mathematics involving Doppler shifts, Inmarsat was able to use that data to plot a course for the lost plane. That course was revealed to the world and the search progressed. However, when that course failed to turn up the missing plane, there were increasingly angry calls for more transparency from Inmarsat – to reveal the raw data. Inmarsat’s response was that they had released the information, in the form of a plotted course, to the public and to the appropriate authorities. However, they chose to withhold the underlying data, claiming it wouldn’t be useful. The demands persisted, primarily from the press and the victims’ families. Eventually Inmarsat gave in and agreed to release the data. With great excitement, the press reported this as “Breaking News”. Then a bewildered look seemed to come across everyone, and the story quickly faded away. Inmarsat had provided the transparency in the form it was demanded, releasing the raw data along with a brief overview and the relevant data highlighted, but it still wasn’t particularly useful. We’re still waiting to hear if anyone will ever be able to find any new insights into whatever happened to MH370 using this data. Most likely, though, that story has run its course – you simply need Inmarsat’s expertise to understand the data.

There is an important lesson to be learned – for better or worse. Raw data can be released, but without the tools and expertise necessary to interpret it, it’s meaningless. Is that transparency? Alternatively, raw data can be interpreted into meaningful information, but that opens up questions as to the honesty and accuracy of the interpretation. Is that transparency? It’s very easy to hide the facts in plain sight – by delivering them in a convoluted and indecipherable data format or by selectively interpreting them to tell an incomplete story. How do we manage transparency to achieve the objective of providing the public with an open, honest, and useful view of government activities?

Next week, I want to describe my vision for how government information should be made public. I want to tackle the conflicting needs of providing information that is both unfiltered yet comprehensible. While I don’t have the answers, I do want to start the process of clarifying what better transparency is really going to achieve.

What is Transparency?