The Sun is Rising on Akoma Ntoso — and LegisPro too!

Two great pieces of news this week! First, the documentation for Akoma Ntoso has now been officially released by OASIS. Second, we’re announcing the latest version of our LegisPro drafting platform for Akoma Ntoso, codenamed “Sunrise”.

After several years of hard work, we’ve made a giant step towards our goal of setting an international XML standard for legal documents. You can find the documents at the OASIS LegalDocML website. A special thanks to Monica Palmirani and Fabio Vitali at the University of Bologna for their leadership in this endeavour.

Later this week, Xcential will be announcing and showing the latest “Sunrise” version of LegisPro, at both NALIT in Annapolis, Maryland and at the LEX Summer School in Ravenna, Italy. This new version represents a long-planned change to Xcential’s business model. While we have a thriving enterprise business, we’re now focusing on also providing more affordable solutions for smaller governments.

Part of our plan is to foster an open community of providers around the Akoma Ntoso standard for legislative XML. With Akoma Ntoso now in place as a standard, we’re looking for ways to provide open interfaces such that cooperative tools and technologies can be developed. One of my goals at this year’s summer school in Ravenna is to begin outlining the open APIs that will enable this vision.

LegisPro

The new edition of LegisPro will be all about providing the very best options:

  1. It will provide the word-processing-like drafting capability that your drafters demand — along with the real capabilities you need:
    • We’re not talking about merely providing a way to style a word processing document to look like legislation.
    • We’re talking about providing easy ways to define the constructs you need for your legislative traditions, such as–
      • a configurable hierarchy,
      • configurable tagging of important information,
      • configurable numbering rules,
      • configurable metadata,
      • oh, and configurable styles too.
    • We’re talking about truly understanding your amending traditions and providing the mechanisms to support them, such as–
      • configurable track changes, because we understand that a word processor’s track changes are not enough,
      • as-published page and line markers, because we understand your real need for page and line numbers and that a word processor’s page and line numbering is not that,
      • robust typography, because we know there’s quite a difference between the casual correspondence a word processor is geared for and the precision demanded in documents that represent laws and regulations.
  2. It will be as capable as we can make it — for real-world use rather than just a good demo:
    • We’re not talking about trying to sell you a cobbled together suite of tools we built for other customers.
    • We’re talking about working with specialists in all the sub-fields of legal informatics to provide best-of-breed options that work with our tools.
    • We’re talking about making as many options available to you as possible, because we know there is no one-size-fits-all answer in this field.
    • We’re talking about an extensible architecture that will support on-board plug-ins as well as server-side web-services.
    • We’re talking about providing a platform of choices rather than a box of pieces.
  3. It will be as affordable as we can possibly make it:
    • We’re talking about developing technologies that have been designed to be easily configured to meet a wide variety of needs.
    • We’re talking about using a carefully chosen set of technologies to minimize both your upfront cost and downstream support challenges.
    • We’re talking about providing a range of purchasing options to meet your budgetary constraints as best we can.
    • We’re talking about finding a business model that allows us to remain profitable — and spreads the costs of developing the complex technologies required by this field as widely and fairly as possible.
  4. It is as future-proof as we can possibly make it:
    • We’re not talking about trying to sell you on a proprietary office suite.
    • We’re talking about using a carefully curated set of technologies that have been selected as they represent the future of application development — not the past — including:
      • GitHub’s Electron which allows us to provide both a desktop and a web-based option. (This is the same technology used by Slack, WordPress, Microsoft’s Visual Studio Code, and hundreds of other modern applications.)
      • Node.js which allows us to unify client-side and server-side application development with “isomorphic JavaScript,”
      • JavaScript 6 (ECMAScript 2015) which allows us to provide a truly modern, unified, and object-oriented programming environment,
      • Angular and other application frameworks that allow us to focus on the pieces and not how they will work together,
      • CSS3 and LESS, which allow us to provide state-of-the-art styling technologies for the presentation of XML documents,
      • the entire XML technology stack that is critical for enabling an information-centric rather than document-centric system as is appropriate for the 21st century,
      • and, of course, using the Akoma Ntoso schema for legislative XML to provide the best model for sharing data, information, tools, and other technologies. It’s truly a platform to build an industry on.
  5. It is as open as we can possibly make it:
    • We’re not talking about a vendor attempting to create a perception of openness merely by publishing an API with “open” in the name.
    • We’re talking about building on a full suite of open source tools and technologies coming from vendors such as Google, GitHub, and even Microsoft.
    • We’re talking about using non-proprietary protocols such as HTTP and WebDAV.
    • We’re talking about providing an open API to our tools that will also work with tools of other vendors that support Akoma Ntoso.
    • And, while we must continue to be a profitable product vendor, we will still provide the option of open access to our GitHub repositories to our customers and partners. (We’ll even accept pull requests.)

Our goal is to be the very best vendor in the legislative and regulatory space, providing modern software that helps make government more efficient, more transparent and more responsive. We want to provide you with options that are affordable, capable, and planned for the future. We want to do whatever we can to allay your fears of vendor lock-in by supporting open standards, open APIs, and open technologies. We want to foster an Akoma Ntoso-based industry of cooperative tools and technologies as we know that doing so will be in the best interests of everyone — customers, product vendors, service providers, and the people who support them. As someone once told me many years ago, if you focus on making the pie as large as you can, the crumbs left on the knife will be plenty enough for you.

Either come by our table at NALIT in Annapolis or join us for the Akoma Ntoso Developer’s Conference in Ravenna at the conclusion of the LEX Summer School to learn more. If neither of these options will work for you, you can always learn more at Xcential.com or by sending email to info@xcential.com.


Can GitHub be used to manage legislation?

Every so often, someone suggests that GitHub would be a great way to manage legislation. Usually, we roll our eyes at the naïve suggestion and that is that.

However, there are a good many similarities that do deserve consideration. What if the amending process were supported by a tool that, while maybe not GitHub, worked on the same principles?

My company, Xcential, built the amending solution for the California Legislature, using a process we like to call Amendments in Context. With this process, a proposed revision of a bill is drafted and then the amendments necessary to produce that revision are extracted as an amendment document. That amendment document, which really becomes an enumeration of proposed changes in a report, is then submitted to the committee for approval. If approved, the revised document that was drafted earlier then becomes the next official version of the bill. This process differs from the traditional process in which an amendment document is drafted, itemizing changes to be made. When the committee approves the amendments, there is a mad rush, usually overnight, to implement (or execute) those amendments to the last version in order to produce the next version. Our Amendments in Context automated approach is more accurate and largely eliminates the overnight bottleneck of having to execute approved amendments before the start of business the following day.

Since implementing this system for California, we’ve been involved in a number of other jurisdictions and efforts that deal with the amending process. This has given us quite a good perspective on the various ways in which bill amendments get handled.

As software developers ourselves, we’ve often been struck by how similar the bill amendment process is to the software development process — the very thing that invariably leads to the suggestion that GitHub could be a great repository for legislation. With this all in mind, let’s compare and contrast the bill amending process with the software development process using GitHub.

(We’ll make suitable procedural simplifications to keep the example clear.)

BILL AMENDING PROCESS vs. SOFTWARE ENHANCEMENT PROCESS

  1. Begin a proposed amendment / Begin a proposed enhancement
    • Bill: Create a copy of the last version of a bill. In the U.S. and other parts of the world that still use page and line numbers, cleverly annotated page and line number information from the last publication must be included. This copy will be modified to reflect the proposed changes.
    • Software: Create a new software branch. This branch will be modified to implement the proposed enhancement.
  2. Make the proposed changes / Make the proposed changes
    • Bill: Make the proposed changes using redlining, showing the changes as insertions and deletions. Carefully craft the changes to obey the drafting rules and any political sensitivities regarding how the changes are shown.
    • Software: Make the proposed changes to the software — testing and debugging as needed.
  3. Generate the amendment / Prepare to commit
    • Bill: The amendment generator examines the redlining (insertions and deletions), carefully grouping changes together to produce a minimized set of amendments. These amendments are expressed in the familiar, at least in the U.S., “on page X, line Y, strike ‘this’ and replace with ‘that’” form, or something along those lines. (For jurisdictions that don’t use an amendment generator, a manually written amendment document, enumerating the amendments, is the starting point.)
    • Software: A differencing engine compares the source code with the prior version, carefully grouping changes together to produce a minimized set of hunks. If you use a tool such as SourceTree by Atlassian, these hunks are shown as source code with lines to be removed and lines to be inserted.
  4. Save the amendment document alongside the revised bill with redlining / Commit the changes to GitHub
  5. Vote on the amendments / Submit for review
    • Bill: The amendment document goes to committee where it is proposed and then either adopted or rejected. The procedures here may differ, depending on the jurisdiction. In California, multiple competing amendment documents (known as instruction amendments) may be proposed at any one time, but only one can be adopted and it is adopted in whole. Other jurisdictions allow multiple amendment documents to be adopted and individual amendments within any amendment document to be adopted or rejected.
    • Software: The review board considers the proposed enhancement and decides whether or not to incorporate it into the next release. It may choose to adopt the entire enhancement or only certain aspects of it.
  6. Execute the amendment / Merge into mainline
    • Bill: In California, because only single whole amendments can be adopted, executing an adopted amendment is quite easy — the redlined version of the bill simply becomes the next version. However, in most jurisdictions, this isn’t so easy. Instead, each amendment must be applied to a new copy of the bill, destined to become the next version. Conflicts that arise must be resolved following a prescribed set of procedures.
    • Software: Incorporating an enhancement into the mainline involves a merge of the enhancement branch into the mainline. If an enhancement is not adopted in whole, then approved changes may be cherry-picked. When conflicts between different sets of approved enhancements occur, GitHub requires manual intervention to resolve the issues. This process is generally a lot less formal than resolving conflicts in legislation.
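
To make the parallel concrete, here is a minimal sketch, in JavaScript, of the generation step in row 3. The flat page/line redline model is deliberately simplified and purely illustrative; a real amendment generator works from the redlined XML itself, not from a flat list.

```javascript
// A minimal, illustrative redline model: each record notes the
// as-published page and line that a change touches.
const changes = [
  { page: 2, line: 5,  strike: 'thirty days', insert: 'sixty days' },
  { page: 2, line: 6,  strike: 'Secretary',   insert: 'Director' },
  { page: 7, line: 12, strike: null,          insert: 'and for other purposes' }
];

// Phrase each recorded change in the familiar "on page X, line Y" style.
// A real generator would also merge adjacent changes into a minimized set
// of amendments, much as a differencing engine merges nearby edits into hunks.
function generateAmendments(changes) {
  return changes.map((c) => {
    if (c.strike && c.insert) {
      return `On page ${c.page}, line ${c.line}, strike "${c.strike}" and insert "${c.insert}".`;
    } else if (c.insert) {
      return `On page ${c.page}, line ${c.line}, insert "${c.insert}".`;
    }
    return `On page ${c.page}, line ${c.line}, strike "${c.strike}".`;
  });
}

console.log(generateAmendments(changes).join('\n'));
```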

So, as you can see, there are a lot of similarities between amending a bill and implementing a software enhancement. The basic process is essentially identical. However, the differences lie in the details.

Git is designed specifically for the software development process. The legislative process has quite a different set of requirements and traditions which must be met. It simply isn’t possible to bend and distort the legislative process to fit the model prescribed by Git. However, that doesn’t mean that something like GitHub is out of the question. What if there were a GitHub for Legislation — a tool with an associated repository, modeled after Git and GitHub, specifically designed for managing legislation?

This example shows the power of adopting XML for drafting legislation. With properly designed XML, legislation becomes a vast store of machine-readable information that can meet the 21st century challenges of accuracy, efficiency, and transparency. We’re not just printing paper anymore — we’re managing digital information.


Coming soon!!! A new web-based editor for Akoma Ntoso

I’ve been working hard for a long time — building an all-new web-based editor for Akoma Ntoso. We will be showing it for the first time at the upcoming Akoma Ntoso LEX Summer School in Washington D.C.

Unlike our earlier AKN/Editor, this editor is a pure XML editor designed from the ground up using the XML capabilities that modern browsers possess. This editor is much more robust, more precise, and far more scalable.

Basic Features

  1. Configurable XML models — including Akoma Ntoso and USLM
  2. Edit full documents or portions of large documents
  3. Flexible selection and editing regardless of XML structure
  4. Built-in redlining (change tracking) supporting textual AND structural changes
  5. Browse document sources with drag-and-drop
  6. Full undo & redo
  7. Customizable attribute editor
  8. Search and replace
  9. Modular architecture to allow for extensive customization

Underlying Technology

  1. XML-based editing component
    • DOM 4 support
    • XPath Support
    • CSS Styling
    • Sophisticated event model
  2. HTTP-based resolver architecture for retrieving documents (a minimal sketch follows this list)
    • Interpret citations
    • Dereference URLs
    • WebDAV adaptors to document repositories
    • Query repositories with XQuery or databases with SQL
  3. AngularJS-based User Interface using HTML5
    • Component modules for easy customization
  4. XML repository for storing documents
    • Integrate any XML repository
    • Built-in support for eXist-db
  5. Validation & Publishing
    • XML Schema validator
    • XSL-FO publishing
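
As a rough illustration of the resolver architecture in item 2 above, here is a minimal sketch using only Node’s built-in http module. The reference path and repository location are hypothetical; a real resolver would also interpret citations and front WebDAV or XQuery interfaces to the repository.

```javascript
// A minimal sketch of an HTTP-based resolver. The path and repository
// location below are hypothetical placeholders.
const http = require('http');

// Hypothetical lookup table. In practice the resolver would query a
// repository (e.g., eXist-db) rather than consult an in-memory map.
const locations = {
  '/akn/us/act/2014/123': 'http://localhost:8080/exist/rest/db/acts/2014-123.xml'
};

http.createServer((req, res) => {
  const location = locations[req.url];
  if (location) {
    // Redirect the requester to the concrete document resource.
    res.writeHead(302, { Location: location });
    res.end();
  } else {
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Unresolvable reference: ' + req.url);
  }
}).listen(3000);
```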

We’ll reveal a lot more at the LEX Summer School later this month! If you’re interested in our QuickStart beta program, drop me a note at grant.vergottini@xcential.com.


Akoma Ntoso (LegalDocML) is now available for public review

It’s been many years in the making, but the standardised version of Akoma Ntoso is now finally in public review. You can find the official announcement here. The public review started May 7th and will end on June 5th — which is quite a short time for something so complex.

I would like to encourage everyone to take part in this review process, as short as it is. It’s important that we get good coverage from around the world to ensure that any use cases we missed get due consideration. Instructions for how to comment can be found here.

Akoma Ntoso is a complex standard and it has many parts. If you’re new to Akoma Ntoso, you will probably find it quite overwhelming. To try and cut through that complexity, I’m going to try and give a bit of an overview of what the documentation covers, and what to be looking for.

There are four primary documents:

  1. Akoma Ntoso Version 1.0 Part 1: XML Vocabulary — This document is the best place to start. It’s an overview of Akoma Ntoso and describes what all the pieces are and how they fit together.
  2. Akoma Ntoso Version 1.0 Part 2: Specifications — This is the reference material. When you want to know something specific about an Akoma Ntoso XML element or attribute, this is the document to go to. It contains very detailed information derived from the schema itself. Also included with this are the XML schema (or DTD if you’re still inclined to use DTDs) and a good set of examples from around the world.
  3. Akoma Ntoso Naming Convention Version 1.0 — This document describes two very interrelated and important aspects of the proposed standard — how identifiers are assigned to elements and how IRI-based (or URI-based) references are formed. There is a lot of complexity in this topic and it was the subject of numerous meetings and an interesting debate at the Coco Loco restaurant in Ravenna, Italy, one evening while being eaten by mosquitoes.
  4. Akoma Ntoso Media Type Version 1.0 — This fourth document describes a proposed new media type that will be used when transmitting Akoma Ntoso documents.

This is a lot of information to read and digest in a very short amount of time. In my opinion, the best way to try and evaluate Akoma Ntoso’s applicability to your jurisdiction is as follows:

  • First, look at the basic set of tags used to define the document hierarchy. Is this set of tags adequate? Keep in mind that the terminology might not always perfectly align with your terminology. We had to find a neutral terminology that would allow us to define a super-set of the concepts found throughout the world.
  • If you do find that specific elements you need are missing, consider whether or not that concept is perhaps specific to your jurisdiction. If that is the case, take a look at the basic Akoma Ntoso building blocks that are provided. While we tried to provide a comprehensive set of elements and attributes, there are many situations which are simply too esoteric to justify the additional tag bloat in the basic standard. Can the building blocks be used to model those concepts?
  • Take a look at the identifiers and the referencing specification. These parts are intended to work together to allow you to identify and access any provision in an Akoma Ntoso document (a small parsing sketch follows this list). Are all your possible needs met with this? Implicit in this design is a resolver architecture — a component that parses IRI references (think of them as URLs) and maps them to specific provisions. Is this approach workable?
  • Take a look at the basic metadata requirements. Akoma Ntoso has a sophisticated metadata methodology behind it and this involves quite a bit of indirection at times. Understand what the basic metadata needs are and how you would model your jurisdiction’s metadata using this.
  • Finally, if you have time, take a look at the more advanced aspects of Akoma Ntoso. Consider how information related to the document’s lifecycle and workflow might be modeled within the metadata. Consider your change management needs and whether or not the change management capabilities of Akoma Ntoso could be adapted to fit. If you work with complex composite documents, take a look at the mechanisms Akoma Ntoso provides to assemble composite documents.
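
To make the identifier and referencing discussion concrete, below is a minimal sketch that parses a simplified Akoma Ntoso-style IRI. The real grammar defined in the Naming Convention document is considerably richer; this reduced form, and the example IRI, are for illustration only.

```javascript
// A minimal sketch of parsing a simplified Akoma Ntoso-style IRI of the
// form /akn/{country}/{docType}/{date}/{number}[/{language}@[{version}]].
// The actual grammar in the Naming Convention document is richer than this.
function parseAknIri(iri) {
  const match = iri.match(
    /^\/akn\/([a-z]{2})\/(\w+)\/([\d-]+)\/([\w-]+)(?:\/(\w{3})@([\d-]*))?/
  );
  if (!match) return null;
  const [, country, docType, date, number, language, version] = match;
  return {
    work: { country, docType, date, number },
    // The expression level is present only when a language is given.
    expression: language ? { language, version: version || 'current' } : null
  };
}

console.log(parseAknIri('/akn/it/act/2014-09-12/33/ita@2015-03-01'));
// work: Italy, act, 2014-09-12, no. 33; expression: Italian, as of 2015-03-01
```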

Yes, there is a lot to digest in just a few weeks. Please provide whatever feedback you can.

We’re also now in the planning stages for a US LEX Summer School. If you’ve followed my blog over the years, you’ll know that I am a huge fan of the LEX Summer School in Ravenna, Italy — I’ve been every year for the past five years. This year, Kirsten Gullikson and I convinced Monica and Fabio to bring the Summer School to Washington D.C. as well. The summer school will be held the last week of July 2015 at George Mason University. The class size will be limited to just 30, so be sure to register early once registration opens. If you want to hear me rattle on at length about this subject, this is the place to go — I’ll be one of the teachers. The Summer School will conclude with a one day Akoma Ntoso Conference on the Saturday. We’ll be looking for papers. I’ll send out a blog with additional information as soon as it’s finalized.

You may have noticed that I’ve been blogging a lot less lately. Well, that’s because I’ve been heads down for quite some time. We’ll soon be in a position to announce our first full Akoma Ntoso product. It’s an all new web-based XML editor that builds on our experiences with the HTML5 based AKN/Editor (LegisPro Web) that we built before.

This editor is composed of four main parts.

  1. First, there is a full XML editing component that works with pure XML — allowing it to be quite scalable and very XML precise. It implements complex track changes capabilities along with full redo/undo. I’m quite thrilled how it has turned out. I’ve battled for years with XMetaL’s limitations and this was my opportunity to properly engineer a modern XML editor.
  2. Second, there is a sophisticated resolver technology which acts as the middleware, implementing the URI scheme I mentioned earlier — and interfacing with local and remote document resources. All local document resources are managed within an eXist-db repository.
  3. Third, there is the Akoma Ntoso model. The XML editing component is quite schema/model independent. This allows it to be used with a wide variety of structured documents. The Akoma Ntoso model adapts the editor for use with Akoma Ntoso documents.
  4. And finally, there is a very componentised application which ties all the pieces together. This application is written as an AngularJS-based single page application (SPA). In an upcoming blog I’ll detail the trials and tribulations of learning AngularJS. While learning AngularJS has left me thinking I’m quite stupid at times, the goal has been to build an application that can easily be extended to fit a wide variety of structured editing needs. It’s important that all the pieces be defined as modules that can either be swapped out for bespoke implementations or complemented with additional capabilities.

Our current aim is to have the beta version of this new editor available in time for the Summer School and Akoma Ntoso conference — so I’ll be very heads down through most of the summer.

Akoma Ntoso (LegalDocML) is now available for public review

Achieving Five Star Open Data

A couple weeks ago, I was in Ravenna, Italy at the LEX Summer School and follow-on Developer’s Workshop. There, the topic of a semantic web came up a lot. Despite cooling in the popular press in recent years, I’m still a big believer in the idea. The problem with the semantic web is that few people actually get it. At this point, it’s such an abstract idea that people invariably jump to the closest analog available today and mistake it for that.

Tim Berners-Lee (@timberners_lee), the inventor of the web and a big proponent of linked data, has suggested a five star deployment scheme for achieving open data — and what ultimately will be a semantic web. His chart can be thought of as a roadmap for how to get there.

Take a look at today’s Data.gov website. Everybody knows the problem with it — it’s a pretty wrapper around a dumping ground of open data. There are thousands and thousands of data sets available on a wide range of interesting topics. But, there is no unifying data model behind all these data dumps. Sometimes you’re directed to another pretty website that, while well-intentioned, hides the real information behind the decorations. Sometimes you can get a simple text file. If you’re lucky, you might even find the information in some more structured format such as a spreadsheet or XML file. Without any unifying model and with much of the data intended as downloads rather than as an information service, this is really still Tim’s first star of open data — even though some of the data is provided as spreadsheets or open data formats. It’s a good start, but there’s an awful long way to go.

So let’s imagine that a better solution is desired, providing information services, but keeping it all modest by using off-the-shelf technology that everyone is familiar with. Imagine that someone with the authority to do so, takes the initiative to mandate that henceforth, all government data will be produced as Excel spreadsheets. Every memo, report, regulation, piece of legislation, form that citizens fill out, and even the U.S. Code will be kept in Excel spreadsheets. Yes, you need to suspend disbelief to imagine this — the complications that would result would be incredibly tough to solve. But, imagine that all those hurdles were magically overcome.

What would it mean if all government information was stored as spreadsheets? What would be possible if all that information was available throughout the government in predictable and permanent locations? Let’s call the system that would result the Government Information Storehouse – a giant information repository for information regularized as Excel spreadsheets. (BTW, this would be the future of government publishing once paper and PDFs have become relics of the past.)

How would this information look? Think about a piece of legislation, for instance. Each section of the bill might be modeled as a single row in the spreadsheet. Every provision in that section would be its own spreadsheet cell (ignoring hierarchical considerations, etc.). Citations would turn into cell references or cell range references. Amending formulas, such as “Section 1234 of Title 10 is amended by…” could be expressed as a literal formula — a spreadsheet formula. It would refer to the specific cell in the appropriate U.S. Code Title and contain programmatic instructions for how to perform the amendment. In short, lots of once complex operations could be automated very efficiently and very precisely. Having the power to turn all government information into a giant spreadsheet has a certain appeal — even if it requires quite a stretch of the imagination.
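
To make the analogy concrete, here is a purely illustrative sketch of the thought experiment. The cell addresses and the amend() “formula” are invented for illustration; they are not taken from any real system.

```javascript
// A purely illustrative model: provisions as spreadsheet-style "cells"
// and an amending instruction as a formula over them.
const usCode = {
  'Title10!S1234': 'The Secretary shall report within thirty days.'
};

// amend(cell, strike, insert): apply a textual amendment to one cell,
// returning a new "spreadsheet" rather than mutating the old one.
function amend(code, cellRef, strike, insert) {
  const next = Object.assign({}, code);
  next[cellRef] = code[cellRef].replace(strike, insert);
  return next;
}

// "Section 1234 of Title 10 is amended by striking 'thirty' and inserting
// 'sixty'." Executed mechanically, like recalculating a sheet:
const amended = amend(usCode, 'Title10!S1234', 'thirty', 'sixty');
console.log(amended['Title10!S1234']);
// The Secretary shall report within sixty days.
```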

Now imagine what it would mean if selected parts of this information were available to the public as these spreadsheets – in a regularized and permanent way — say Data.gov 2.0 or perhaps, more accurately, as Info.gov. Think of all the spreadsheet applications that would be built to tease out knowledge from the information that the government is providing through their information portal. Having the ability to programmatically monitor the government without having to resort to complex measures to extract the information would truly enable transparency.

At this point, the linkages and information services give us some of the attributes of Tim’s four and five star open data solutions, but our focus on spreadsheet technology has left us with a less than desirable two star system. Besides, we all know that having the government publish everything as Excel spreadsheets is absurd. Not everything fits conveniently into a spreadsheet table to say nothing of the scalability problems that would result. I wouldn’t even want to try putting Title 42 of the U.S. Code into an Excel spreadsheet. So how do we really go about achieving this sort of open data and the efficiencies it enables — both inside and outside of government?

In order to realize true four and five star solutions, we need to quickly move on to fulfilling all the parts of Tim’s five star chart. In his chart, a three star solution replaces Excel spreadsheets with an open data format such as a comma separated file. I don’t actually care for this ordering because it sacrifices much to achieve the goal of having neutral file formats — so let’s move on to full four and five star solutions. To get there, we need to become proficient in the open standards that exist and we must strive to create ones where they’re missing. That’s why we work so hard on the OASIS efforts to develop Akoma Ntoso and citations into standards for legal documents. And when we start producing real information services, we must ensure that the linkages in the information (those links and formulas I wrote about earlier) exist to the best extent possible. It shouldn’t be up to the consumer to figure out how a provision in a bill relates to a line item in some budget somewhere else — that linkage should be established from the get-go.

We’re working on a number of core pieces of technology to enable this vision and get to full five star open data. We’re integrating XML repositories and SQL databases into our architectures to give us the information storehouse I mentioned earlier. We’re building resolver technology that allows us to create and manage permanent linkages. These linkages can be as simple as citation references or as complex as instructions to extract from or make modifications to other information sources. Think of our resolver technology as akin to the engine in Excel that handles cell or range references, arithmetic formulas, and database lookups. And finally, we’re building editors that will resemble word processors in usage, but will allow complex sets of information to be authored and later modified. These editors will have many of the sophisticated capabilities such as track changes that you might see in a modern word processor, but underneath you will find a complex structured model rather than the ad hoc data structures of a word processor.

Building truly open data is going to be a challenging but exciting journey. The solutions that are in place today are a very primitive first step. Many new standards and technologies still need to be developed. But, we’re well on our way.


2014 LEX Summer School & Developer’s Workshop

This week I attended the 2014 LEX Summer School and the follow-on Developer’s Workshop put on by the University of Bologna in Ravenna, Italy. This is the fifth year that I have participated and the third year that we have had the developer’s extension.

It’s always interesting to me to see how the summer school has evolved from the last year and who attends. As always, the primary participation comes from Europe – as one would expect. But this year’s participants also came from as far away as the U.S., Chile, Taiwan, and Kenya. The United States had a participant from the U.S. House of Representatives this year, aside from me. In past years, we have also had U.S. participation from the Library of Congress, LexisNexis, and, of course, Xcential. But, I’m always disappointed that there isn’t greater U.S. participation. Why is this? It seems that this is a field where the U.S. chooses to lag behind. Perhaps most jurisdictions in the U.S. are still hoping that Open Office or Microsoft Office will be a good solution. In Europe, the legal informatics field is looking beyond office productivity tools towards all the other capabilities enabled by drafting in XML — and looking forward to a standardized model as a basis for a more cost effective and innovative industry.

As I already mentioned, this was our third developer’s workshop. It immediately followed the summer school. This year the developer’s workshop was quite excellent. The closest thing I can think of in the U.S. is NALIT, which I find to be more of a marketing-oriented show and tell. This, by comparison, is a far more cozy venue. We sit around, in a classroom setting, and have a very open and frank share and discuss meeting. Perhaps it’s because we’ve come to know one another through the years, but the discussion this year was very good and helpful.

We had presentations from the University of Bologna, the Italian Senate, the European Parliament, the European Commission, the UK National Archives, the US House of Representatives, and myself representing the work we are doing both in general and for the US House of Representatives. We closed out the session with a remote presentation from Jim Mangiafico on the work he is doing translating to Akoma Ntoso for the UK National Archives. (Jim, if you don’t already know, was the winner of the Library of Congress’ Akoma Ntoso challenge earlier this year.)

What struck me this year is how our shared experiences are influencing all our projects. There has been a marked convergence in our various projects over the last year. We all now talk about URI referencing schemes, resolvers to handle them, and web-based editors to draft legislation. And, much to my delight, this was the first year that I’m not the only one looking into change tracking. Everybody is learning that differencing isn’t always the best way to compute amendments – often you need to better craft how the changes are recorded.

I can’t wait to see the progress we make by this time next year. By then, I’m hoping that Akoma Ntoso will be well established as a standard and the first generation of tools will have started to mature. Hopefully our discussion will have evolved from how to build tools towards how to achieve higher levels of compliance with the standard.

I also hope that we will have greater participation from the U.S.


Building a browser-based XML Editor

Don’t forget the 2014 U.S. House Legislative Data and Transparency Conference this week.

I’m now hard at work on our second generation web-based XML editor. In my blog last week, I talked about the need for and complexities of change tracking in a legislative editor. In this blog, I want to describe more of the overall motivation.

A couple years ago, we built an HTML5-based legislative editor for Akoma Ntoso. We learned a lot from the effort and had some success with a couple customers whose needs matched the capabilities of the editor. The editor was built to use and exploit, to the fullest extent, many of the new APIs added to modern browsers to support HTML5. We found that, by focusing on HTML5, a lot of the complexities of dealing with browser quirks and incompatibilities were a thing of the past – allowing us to focus on building the editing functions.

The editor worked by transforming the XML document into a close representation of the XML, expressed as HTML5 tags. Using HTML5 features such as the @contenteditable attribute along with modern CSS, the browser DOM, selection ranges, drag and drop, and a WebDAV repository API, we were able to implement a fairly sophisticated web-based legislative editor.
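
As a rough sketch of what that transform amounted to (a simplification, not our actual transform), each XML node was mirrored as an HTML element that records its XML identity and is made editable:

```javascript
// A rough sketch of the first-generation approach: mirror each XML node
// as an HTML element that records its XML name in a data attribute (so it
// can be transformed back) and is made editable. Attribute handling and
// styling are omitted here.
function xmlToEditableHtml(xmlNode, htmlDoc) {
  if (xmlNode.nodeType === Node.TEXT_NODE) {
    return htmlDoc.createTextNode(xmlNode.nodeValue);
  }
  const el = htmlDoc.createElement('div');
  el.setAttribute('data-xml-name', xmlNode.nodeName);
  el.setAttribute('contenteditable', 'true');
  for (const child of xmlNode.childNodes) {
    el.appendChild(xmlToEditableHtml(child, htmlDoc));
  }
  return el;
}

// Usage: parse an XML fragment and mount its editable HTML mirror.
const xml = new DOMParser().parseFromString(
  '<section><num>1.</num><content>Text of the section.</content></section>',
  'application/xml'
);
document.body.appendChild(xmlToEditableHtml(xml.documentElement, document));
```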

But, not everything went smoothly. The first problem involved the complexity of mapping all the intricacies of XML into an HTML5 representation, and then maintaining that representation in the browser. Part of the difficulty stems from the fact that HTML5 is not specifically an XML dialect – and browsers tend to do HTML5 things that aren’t always XML friendly. The HTML5 DOM is deliberately rather loose and forgiving (it’s a big part of why HTML was successful in the first place) while XML demands a very precise and rigid DOM.

The second problem we faced was scalability. While the HTML5 representation wasn’t all that heavyweight, the bigger problem was the transformation cost going back and forth between HTML5 and XML. We sometimes deal with very large legislation and laws. In our bigger cases, the cost of transformation was simply unreasonable.

So what is the solution? Well, early last year we started experimenting with using a browser to render XML documents with CSS directly – without any transform into HTML. Most modern browsers now do this very well. For the most part, we were able to achieve an acceptable rendition in the browser without any transformation.

There were a few drawbacks to this approach. For one, links were dead – they didn’t inherently do anything. Likewise, implementing something like the HTML @style attribute didn’t just naturally work. Before we could entertain the notion of a pure XML-based editor built within the XML infrastructure in the browser, we had to find a solution that would allow us to enrich the XML sufficiently to allow it to behave like an HTML page.
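
To give a flavor of the enrichment involved, here is a minimal sketch that restores link behavior to browser-rendered XML. It assumes references carry a plain href attribute; Akoma Ntoso defines its own linking attributes, so treat the attribute name as a placeholder.

```javascript
// A minimal sketch of enriching browser-rendered XML so links work again:
// the browser styles the XML with CSS but gives reference elements no
// behavior, so we restore it ourselves. The href name is a placeholder.
document.addEventListener('click', (event) => {
  let node = event.target;
  // Walk up from the clicked node to the nearest element carrying a link.
  while (node && node.nodeType === Node.ELEMENT_NODE) {
    const href = node.getAttribute('href');
    if (href) {
      window.location.href = href; // hand the reference off (e.g., to a resolver)
      return;
    }
    node = node.parentNode;
  }
});
```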

Another problem arose in that our prior web-based editor relied upon the @contenteditable feature of HTML. That is an HTML feature rather than a browser feature. Using XML as our base environment, we no longer had access to this facility. This wasn’t a total loss as our need for a rich change tracking environment required us to find a better approach than @contenteditable offered anyway.

With solutions to the major problems behind us, we started to take a look at the other goals for the editor:

  • Track Changes – This was the subject of my blog last week. For us, track changes is crucial in any editor targeted at legislation – and it must work at both the structural and textual level equally well. We use the feature for two things – redlining changes as is common in the U.S. and the automatic generation of amendment documents (amendments in context). Differencing can get you part way there – but it excludes the ability to adequately craft the changes in a way that deals with political sensitivities. Track Changes is a very complex feature which must be built into the very core of the editor – tacking it on later will be very difficult, if not impossible.
  • Scalability – Scalability is very important to our applications. We need to support very large documents. Even when we deal with document fragments, we need to allow those fragments to be very large. Our approach is to create editing islands within a large document loaded into the browser. This amounts to only building the editing superstructure around the parts of the document being edited rather than the whole document. It’s like building the scaffolding around only the floors being worked on in a skyscraper rather than trying to envelop the entire building in scaffolding.
  • Modularity – We’re building a number of very different applications currently – all of which require XML editing. To allow this variability, our new XML editor is written as a web-based component rather than a full-fledged application. Despite its complexity, on the surface it’s deceptively simple. It has no user interface at all aside from the editing canvas. It’s completely driven by a well thought out JavaScript API. Adding the editor to a document is very simple. A single link, added to the bottom of the XML document, adds the editor to the document. With this component, we’re able to include it within all of the applications we are building.
  • Configurability – We need to support a number of different models – not just Akoma Ntoso. To achieve this, an XML-based configuration file is used to define the behaviors for any XML model. Elements can be defined as read-only, templates can be defined (or derived), and even the track changes behavior can be configured for individual elements. The sophistication defined within the configuration files allows us to model all the variants of legislative models we have encountered without the need for extensive programming-level customization (a small sketch of the idea follows this list).
  • Browser Support – We’re pushing the envelope when it comes to browser support. Our current focus is on Google’s Chrome browser. Support for all the browsers aside from Internet Explorer should be relatively easy. Our experience has shown that the browsers are now quite similar. Internet Explorer is the one exception – in this particular area. Years ago, IE was the best browser when it came to XML support. While IE had many other compatibility issues, particularly with CSS, it led the way in supporting XML. However, while Microsoft has made tremendous strides moving forward to match the other browsers and modern standards, they’ve neglected XML. Their circa 1999 legacy capabilities for XML do not match modern standards and are quite deficient. Hopefully, this is something that will soon be rectified.
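
As a small illustration of the configurability point above, the sketch below shows configuration-driven behavior in miniature. The configuration format is hypothetical, not our actual file format; the point is that per-element behavior is data consulted at edit time rather than hard-wired code.

```javascript
// A sketch of configuration-driven editing behavior. The configuration
// format here is hypothetical; real configurations cover templates,
// numbering, and more.
const configXml = `
  <editorConfig>
    <element name="num"     readOnly="true"  trackChanges="none"/>
    <element name="content" readOnly="false" trackChanges="textual"/>
  </editorConfig>`;

const rules = new Map();
const doc = new DOMParser().parseFromString(configXml, 'application/xml');
for (const el of doc.querySelectorAll('element')) {
  rules.set(el.getAttribute('name'), {
    readOnly: el.getAttribute('readOnly') === 'true',
    trackChanges: el.getAttribute('trackChanges')
  });
}

// The editor consults the table before permitting an edit.
function mayEdit(elementName) {
  const rule = rules.get(elementName);
  return rule ? !rule.readOnly : true;
}

console.log(mayEdit('num'));     // false: numbers are managed, not typed
console.log(mayEdit('content')); // true
```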

It’s not all smooth sailing. I have been finding a number of surprising issues with Google Chrome. For instance, whitespace management is a bit fudged at times. Chrome thinks nothing of adding the occasional non-breaking space to maintain whitespace when editing the DOM. What’s worse – it will inexplicably convert this into a text node that reads “&nbsp;” after a while. This is a character entity that is not defined in XML. I have to work hard to constantly reverse this odd behavior.
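
Roughly, the reversal looks like the sketch below: before the document goes back out as XML, visit every text node and replace the stray non-breaking spaces with ordinary spaces.

```javascript
// A sketch of the cleanup involved: before serializing the DOM back to
// XML, visit every text node and replace the non-breaking spaces (\u00A0)
// that Chrome introduces while editing with ordinary spaces.
function normalizeWhitespace(root) {
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  let node;
  while ((node = walker.nextNode())) {
    if (node.nodeValue.indexOf('\u00A0') !== -1) {
      node.nodeValue = node.nodeValue.replace(/\u00A0/g, ' ');
    }
  }
}

// Usage: normalizeWhitespace(editorRootElement) before saving.
```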

All in all, I’m excited by this new approach to building a web-based XML editor. It’s a substantial increase in sophistication over our prior web-based XML editor. This editor will be far more robust, scalable, and configurable in comparison to our prior editor and other editors we have worked on. While we still have a way to go in our development, we’ve found solutions to all the risky issues. It’s a future-looking approach – support can only get better. It doesn’t rely on compatibility modes or any other remnants of prior eras in web technology. This approach is really working out quite nicely for us.
