Akoma Ntoso (LegalDocML) is now available for public review

It’s been many years in the making, but the standardized version of Akoma Ntoso is now finally in public review. You can find the official announcement here. The public review started May 7th and will end on June 5th — which is quite a short time for something so complex.

I would like to encourage everyone to take part in this review process, as short as it is. It’s important that we get good coverage from around the world to ensure that any use cases we missed get due consideration. Instructions for how to comment can be found here.

Akoma Ntoso is a complex standard with many parts. If you’re new to Akoma Ntoso, it will probably seem quite overwhelming. To cut through that complexity, I’m going to give a bit of an overview of what the documentation covers and what to look for.

There are four primary documents:

  1. Akoma Ntoso Version 1.0 Part 1: XML Vocabulary — This document is the best place to start. It’s an overview of Akoma Ntoso and describes what all the pieces are and how they fit together.
  2. Akoma Ntoso Version 1.0 Part 2: Specifications — This is the reference material. When you want to know something specific about an Akoma Ntoso XML element or attribute, this is the document to go to. It contains very detailed information derived from the schema itself. Also included with this is the XML schema (or DTD if you’re still inclined to use DTDs), along with a good set of examples from around the world.
  3. Akoma Ntoso Naming Convention Version 1.0 — This document describes two very interrelated and important aspects of the proposed standard — how identifiers are assigned to elements and how IRI-based (or URI-based) references are formed. There is a lot of complexity in this topic and it was the subject of numerous meetings and an interesting debate at the Coco Loco restaurant in Ravenna, Italy, one evening while being eaten by mosquitoes.
  4. Akoma Ntoso Media Type Version 1.0 — This fourth document describes a proposed new media type that will be used when transmitting Akoma Ntoso documents.

This is a lot of information to read and digest in a very short amount of time. In my opinion, the best way to try and evaluate Akoma Ntoso’s applicability to your jurisdiction is as follows:

  • First, look at the basic set of tags used to define the document hierarchy. Is this set of tags adequate? Keep in mind that the terminology might not always align perfectly with your own. We had to find a neutral terminology that would allow us to define a superset of the concepts found throughout the world.
  • If you do find that specific elements you need are missing, consider whether that concept is specific to your jurisdiction. If so, take a look at the basic Akoma Ntoso building blocks that are provided. While we tried to provide a comprehensive set of elements and attributes, there are many situations which are simply too esoteric to justify the additional tag bloat in the basic standard. Can the building blocks be used to model those concepts?
  • Take a look at the identifiers and the referencing specification. These parts are intended to work together to allow you to identify and access any provision in an Akoma Ntoso document. Are all your possible needs met with this? Implicit in this design is a resolver architecture — a component that parses IRI references (think of them as URLs) and maps them to specific provisions (see the sketch after this list). Is this approach workable?
  • Take a look at the basic metadata requirements. Akoma Ntoso has a sophisticated metadata methodology behind it and this involves quite a bit of indirection at times. Understand what the basic metadata needs are and how you would model your jurisdiction’s metadata using this.
  • Finally, if you have time, take a look at the more advanced aspects of Akoma Ntoso. Consider how information related to a document’s lifecycle and workflow might be modeled within the metadata. Consider your change management needs and whether the change management capabilities of Akoma Ntoso could be adapted to fit. If you work with complex composite documents, take a look at the mechanisms Akoma Ntoso provides to assemble composite documents.
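
To make the resolver idea a little more concrete, here is a minimal sketch in Python of the kind of component I mean: something that parses an IRI reference and maps it to a work and a provision. The IRI pattern and the returned fields are simplifications invented for illustration; the actual naming convention is far richer.

    import re

    # Hypothetical, simplified IRI pattern: /country/doctype/year/number[/provision]
    IRI_PATTERN = re.compile(
        r"^/(?P<country>[a-z]{2})/(?P<doctype>\w+)/(?P<year>\d{4})"
        r"/(?P<num>\w+)(?:/(?P<provision>[\w.]+))?$"
    )

    def resolve(iri):
        """Parse an IRI reference into the pieces a repository lookup needs:
        the work being cited and, optionally, a provision within it."""
        match = IRI_PATTERN.match(iri)
        if match is None:
            raise ValueError("Unrecognized IRI reference: " + iri)
        parts = match.groupdict()
        work = "/{country}/{doctype}/{year}/{num}".format(**parts)
        return work, parts["provision"]  # provision is None for the whole work

    print(resolve("/us/act/2014/123/sec_2"))
    # ('/us/act/2014/123', 'sec_2')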

Yes, there is a lot to digest in just a few weeks. Please provide whatever feedback you can.

We’re also now in the planning stages for a US LEX Summer School. If you’ve followed my blog over the years, you’ll know that I am a huge fan of the LEX Summer School in Ravenna, Italy — I’ve been every year for the past five years. This year, Kirsten Gullikson and I convinced Monica and Fabio to bring the Summer School to Washington D.C. as well. The summer school will be held the last week of July 2015 at George Mason University. The class size will be limited to just 30, so be sure to register early once registration opens. If you want to hear me rattle on at length about this subject, this is the place to go — I’ll be one of the teachers. The Summer School will conclude with a one-day Akoma Ntoso Conference on the Saturday. We’ll be looking for papers. I’ll send out a blog with additional information as soon as it’s finalized.

You may have noticed that I’ve been blogging a lot less lately. Well, that’s because I’ve been heads down for quite some time. We’ll soon be in a position to announce our first full Akoma Ntoso product. It’s an all-new web-based XML editor that builds on our experiences with the HTML5-based AKN/Editor (LegisPro Web) that we built before.

This editor is composed of four main parts:

  1. First, there is a full XML editing component that works with pure XML — allowing it to be quite scalable and very XML-precise. It implements complex track changes capabilities along with full undo/redo. I’m quite thrilled with how it has turned out. I’ve battled for years with XMetaL’s limitations and this was my opportunity to properly engineer a modern XML editor.
  2. Second, there is a sophisticated resolver technology which acts as the middleware, implementing the URI scheme I mentioned earlier and interfacing with local and remote document resources. All local document resources are managed within an eXist-db repository (a sketch of this step follows the list).
  3. Third, there is the Akoma Ntoso model. The XML editing component is quite schema/model independent. This allows it to be used with a wide variety of structured documents. The Akoma Ntoso model adapts the editor for use with Akoma Ntoso documents.
  4. And finally, there is a very componentized application which ties all the pieces together. This application is written as an AngularJS-based single page application (SPA). In an upcoming blog I’ll detail the trials and tribulations of learning AngularJS. While learning AngularJS has left me thinking I’m quite stupid at times, the goal has been to build an application that can easily be extended to fit a wide variety of structured editing needs. It’s important that all the pieces be defined as modules that can either be swapped out for bespoke implementations or complemented with additional capabilities.
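
To give a feel for how the resolver and the repository fit together, here is a rough sketch of the middleware step: once a reference has been resolved to a repository path, the document is fetched over eXist-db’s REST interface. The host, port, and collection layout are assumptions for illustration, not a description of our actual deployment.

    from urllib.parse import quote
    from urllib.request import urlopen

    # Assumed eXist-db REST endpoint and collection layout
    EXIST_REST = "http://localhost:8080/exist/rest/db/akn"

    def fetch_document(work):
        """Fetch the XML for a resolved work (e.g. /us/act/2014/123)
        from a local eXist-db repository via its REST interface."""
        url = EXIST_REST + quote(work) + ".xml"
        with urlopen(url) as response:
            return response.read()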

Our current aim is to have the beta version of this new editor available in time for the Summer School and Akoma Ntoso conference — so I’ll be very heads down through most of the summer.


Achieving Five Star Open Data

A couple of weeks ago, I was in Ravenna, Italy at the LEX Summer School and follow-on Developer’s Workshop. There, the topic of the semantic web came up a lot. Despite cooling interest in the popular press in recent years, I’m still a big believer in the idea. The problem with the semantic web is that few people actually get it. At this point, it’s such an abstract idea that people invariably jump to the closest analog available today and mistake it for that.

Tim Berners-Lee (@timberners_lee), the inventor of the web and a big proponent of linked data, has suggested a five star deployment scheme for achieving open data — and what ultimately will be a semantic web. His chart can be thought of as a roadmap for how to get there.

Take a look at today’s Data.gov website. Everybody knows the problem with it — it’s a pretty wrapper around a dumping ground of open data. There are thousands and thousands of data sets available on a wide range of interesting topics. But, there is no unifying data model behind all these data dumps. Sometimes you’re directed to another pretty website that, while well-intentioned, hides the real information behind the decorations. Sometimes you can get a simple text file. If you’re lucky, you might even find the information in some more structured format such as a spreadsheet or XML file. Without any unifying model and with much of the data intended as downloads rather than as an information service, this is really still Tim’s first star of open data — even though some of the data is provided as spreadsheets or open data formats. It’s a good start, but there’s an awful long way to go.

So let’s imagine that a better solution is desired, providing information services, but keeping it all modest by using off-the-shelf technology that everyone is familiar with. Imagine that someone with the authority to do so takes the initiative to mandate that henceforth, all government data will be produced as Excel spreadsheets. Every memo, report, regulation, piece of legislation, form that citizens fill out, and even the U.S. Code will be kept in Excel spreadsheets. Yes, you need to suspend disbelief to imagine this — the complications that would result would be incredibly tough to solve. But, imagine that all those hurdles were magically overcome.

What would it mean if all government information was stored as spreadsheets? What would be possible if all that information was available throughout the government in predictable and permanent locations? Let’s call the resulting system the Government Information Storehouse – a giant repository of information regularized as Excel spreadsheets. (BTW, this would be the future of government publishing once paper and PDFs have become relics of the past.)

How would this information look? Think about a piece of legislation, for instance. Each section of the bill might be modeled as a single row in the spreadsheet. Every provision in that section would be its own spreadsheet cell (ignoring hierarchical considerations, etc.). Citations would turn into cell references or cell range references. Amending formulas, such as “Section 1234 of Title 10 is amended by…” could be expressed as a literal formula — a spreadsheet formula. It would refer to the specific cell in the appropriate U.S. Code Title and contain programmatic instructions for how to perform the amendment. In short, lots of once complex operations could be automated very efficiently and very precisely. Having the power to turn all government information into a giant spreadsheet has a certain appeal — even if it requires quite a stretch of the imagination.
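
To see why this appeals to me, here is a tiny sketch of an amending “formula” acting on a cell. The workbook, the cell address, and the operation are all invented for the sake of the thought experiment.

    # One "sheet" of the imagined storehouse: Title 10, one provision per cell
    usc_title_10 = {"A1234": "The Secretary shall submit an annual report."}

    def amend_strike_insert(workbook, cell, strike, insert):
        """Apply 'is amended by striking X and inserting Y' to one cell,
        the way a spreadsheet formula would reference and rewrite it."""
        workbook[cell] = workbook[cell].replace(strike, insert)

    # "Section 1234 of Title 10 is amended by striking 'an annual report'
    # and inserting 'a quarterly report'."
    amend_strike_insert(usc_title_10, "A1234", "an annual report", "a quarterly report")
    print(usc_title_10["A1234"])  # The Secretary shall submit a quarterly report.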

Now imagine what it would mean if selected parts of this information were available to the public as these spreadsheets – in a regularized and permanent way — say Data.gov 2.0 or perhaps, more accurately, as Info.gov. Think of all the spreadsheet applications that would be built to tease out knowledge from the information that the government is providing through their information portal. Having the ability to programmatically monitor the government without having to resort to complex measures to extract the information would truly enable transparency.

At this point, the linkages and information services give us some of the attributes of Tim’s four and five star open data solutions, but our focus on spreadsheet technology has left us with a less than desirable two star system. Besides, we all know that having the government publish everything as Excel spreadsheets is absurd. Not everything fits conveniently into a spreadsheet table, to say nothing of the scalability problems that would result. I wouldn’t even want to try putting Title 42 of the U.S. Code into an Excel spreadsheet. So how do we really go about achieving this sort of open data and the efficiencies it enables — both inside and outside of government?

In order to realize true four and five star solutions, we need to quickly move on to fulfilling all the parts of Tim’s five star chart. In his chart, a three star solution replaces Excel spreadsheets with an open data format such as a comma-separated file. I don’t actually care for this ordering because it sacrifices much to achieve the goal of having neutral file formats — so let’s move on to full four and five star solutions. To get there, we need to become proficient in the open standards that exist and we must strive to create ones where they’re missing. That’s why we work so hard on the OASIS efforts to develop Akoma Ntoso and citations into standards for legal documents. And when we start producing real information services, we must ensure that the linkages in the information (those links and formulas I wrote about earlier) exist to the fullest extent possible. It shouldn’t be up to the consumer to figure out how a provision in a bill relates to a line item in some budget somewhere else — that linkage should be established from the get-go.

We’re working on a number of core pieces of technology to enable this vision and get to full five star open data. We’re integrating XML repositories and SQL databases into our architectures to give us the information storehouse I mentioned earlier. We’re building resolver technology that allows us to create and manage permanent linkages. These linkages can be as simple as citation references or as complex as instructions to extract from or make modifications to other information sources. Think of our resolver technology as akin to the engine in Excel that handles cell or range references, arithmetic formulas, and database lookups. And finally, we’re building editors that will resemble word processors in usage, but will allow complex sets of information to be authored and later modified. These editors will have many of the sophisticated capabilities, such as track changes, that you might see in a modern word processor, but underneath you will find a complex structured model rather than the ad hoc data structures of a word processor.

Building truly open data is going to be a challenging but exciting journey. The solutions that are in place today are a very primitive first step. Many new standards and technologies still need to be developed. But, we’re well on our way.


2014 LEX Summer School & Developer’s Workshop

This week I attended the 2014 LEX Summer School and the follow-on Developer’s Workshop put on by the University of Bologna in Ravenna, Italy. This is the fifth year that I have participated and the third year that we have had the developer’s extension.

It’s always interesting to me to see how the summer school has evolved from the last year and who attends. As always, the primary participation comes from Europe – as one would expect. But this year’s participants also came from as far away as the U.S., Chile, Taiwan, and Kenya. The United States had a participant from the U.S. House of Representatives this year, aside from me. In past years, we have also had U.S. participation from the Library of Congress, LexisNexis, and, of course, Xcential. But, I’m always disappointed that there isn’t greater U.S. participation. Why is this? It seems that this is a field where the U.S. chooses to lag behind. Perhaps most jurisdictions in the U.S. are still hoping that Open Office or Microsoft Office will be a good solution. In Europe, the legal informatics field is looking beyond office productivity tools towards all the other capabilities enabled by drafting in XML — and looking forward to a standardized model as a basis for a more cost effective and innovative industry.

As I already mentioned, this was our third developer’s workshop. It immediately followed the summer school. This year the developer’s workshop was quite excellent. The closest thing I can think of in the U.S. is NALIT, which I find to be more of a marketing-oriented show and tell. This, by comparison, is a far more cozy venue. We sit around, in a classroom setting, and have a very open and frank share-and-discuss meeting. Perhaps it’s because we’ve come to know one another through the years, but the discussion this year was very good and helpful.

We had presentations from the University of Bologna, the Italian Senate, the European Parliament, the European Commission, the UK National Archives, the US House of Representatives, and myself representing the work we are doing both in general and for the US House of Representatives. We closed out the session with a remote presentation from Jim Mangiafico on the work he is doing translating to Akoma Ntoso for the UK National Archives. (Jim, if you don’t already know, was the winner of the Library of Congress’ Akoma Ntoso challenge earlier this year.)

What struck me this year is how our shared experiences are influencing all our projects. There has been a marked convergence in our various projects over the last year. We all now talk about URI referencing schemes, resolvers to handle them, and web-based editors to draft legislation. And, much to my delight, this was the first year that I’m not the only one looking into change tracking. Everybody is learning that differencing isn’t always the best way to compute amendments – often you need to better craft how the changes are recorded.

I can’t wait to see the progress we make by this time next year. By then, I’m hoping that Akoma Ntoso will be well established as a standard and the first generation of tools will have started to mature. Hopefully our discussion will have evolved from how to build tools towards how to achieve higher levels of compliance with the standard.

I also hope that we will have greater participation from the U.S.


Look how far legal informatics has come – in just a few years

Back in 2001 when I started in the legal informatics field, it seemed we were all alone. Certainly, we weren’t – there were many similar efforts underway around the country and around the world. But, we felt alone. All the efforts were working in isolation – making similar decisions and learning similar lessons. This was the state of the field, for the most part, for the next 6 to 8 years. Lots of isolated progress, but few opportunities to share what we had learned and build on what others knew.

In 2010, I visited the LEX Summer School, put on by the University of Bologna in Ravenna, Italy. What became apparent to me was just how isolated the various pockets of innovation had become around the world. There was lots of progress, especially in Europe, but legal informatics, as an industry, was still in a fledgling state – it was more of an academic field than a commercial industry. In fact, outside of academic circles, the term legal informatics was all but meaningless. When I wrote my first blog in 2011, I looked forward to the day when there might be a true Legal Informatics industry.

Now, just a few years later, it’s stunning how far we have come. Certainly, we still have far to travel, but now we’re all working together towards common goals rather than working alone towards the same, but isolated, goals. I thought I would spend this week’s blog to review just how far we have come.

  1. Working together
    We have come together in a number of important dimensions:

    • First of all, consider geography. This is a small field, but around the world we’re now all very much connected together. We routinely meet, share ideas, share lessons, and share expertise – no matter which continent we work and reside on.
    • Secondly, consider our viewpoints. There was once a real tension between the transparency camp, government, external industry, and academia. If you participated in the 2014 Legislative Data and Transparency conference a few weeks ago in Washington D.C., one of the striking things was how little tension remains between these various viewpoints. We’re all now working towards a common set of goals.
  2. Technology
    • I remember when we used to question whether XML was the right technology. The alternatives were to use Open Office or Microsoft Office, basing the legislative framework around office productivity tools. Others even proposed using relational database technology along with a forms-based interface. Those ideas have now generally faded away – XML is the clear choice. And faking XML by relying on the fact that the Open Document Format (ODF) or Office Open XML formats are based on XML just isn’t credible anymore. XML means more than just relying on an internal file format that your tools happen to use – it means designing information models specifically to solve the challenges of legal informatics.
    • I remember when we used to debate how references should be managed. Should we use file paths? Should we use URNs? Should we use URLs? Today the answer is clear – we’re all settling around logical URLs with resolvers, sometimes federated, to stitch together a web of interconnected references. Along with this decision has been the basic assumption that web-based solutions are the future – desktop applications no longer have a place in a modern solution.
    • Consider database technology. We used to have three choices – use the file system, try and adapt mature but ill-fitting relational databases, or take a risk with emerging XML databases. Clearly XML databases were the future – but was it too early? Not anymore! XML database technology, along with XQuery, has come a long way in the past few years.
  3. Standards
    Standards are what will create an industry. Without them, there is little opportunity for re-use – a necessary part of allowing cost-effective products to be built. After a few false starts over the years, we’re now on the cusp of having industry standards to work with. The OASIS LegalDocML (Akoma Ntoso) and LegalCiteM technical committees are hard at work on developing those standards. Certainly, it will be a number of years before we will see all the benefits of these standards, but as they come to fruition, a real industry can emerge.
  4. Driving Forces
    Ten years ago, the motivation for using XML was to replace outdated drafting systems, often cobbled together on obsolete mainframes, that sorely needed replacement. The needs were all internal. Now, that has all changed. The end result is no longer a paper document which can be ordered from the “Bill Room” found in the basement of the Capitol building. It’s often not even a PDF rendition of that document. The new end result is information which needs to be shared in a timely and open way in order to achieve the modern transparency objectives, like the DATA Act, that have been mandated. This change in expectations is going to revolutionize how the public works with their representatives to ensure fair and open government.

In the past dozen years, things sure have changed. Credit must be given to Monica Palmirani (@MonicaPalmirani) and Fabio Vitali at the University of Bologna – an awful lot of the progress pivots around their initiatives. However, we’ve all played a part in creating an open, creative, and cooperative environment for Legal Informatics to thrive as more than just an academic curiosity – as a true industry with many participants working collaboratively and competitively to innovate and solve the challenges ahead.


Imagining Government Data in the 21st Century

After the 2014 Legislative Data and Transparency conference, I came away both encouraged and a little worried. I’m encouraged by the vast amount of progress we have seen in the past year, but at the same time a little concerned by how disjointed some of the initiatives seem to be. I would rather see new mandates forcing existing systems to be rethought rather than causing additional systems to be created – which can get very costly over time. But, it’s all still the Wild Wild West of computing.

What I want to do with my blog this week is try and define what I believe transparency is all about:

  1. The data must be available. First and foremost, the most important thing is that the data be provided at the very least – somehow, anyhow.
  2. The data must be provided in such a way that it is accessible and understandable by the widest possible audience. This means providing data formats that can be read by ubiquitous tools and ensuring the coding necessary to support all types of readers, including those with disabilities.
  3. The data must be provided in such a way that it is easy for a computer to digest and analyze. This means using data formats that are easily parsed by a computer (not PDF, please!!!) and using data models that are comprehensible to the widest possible audience of data analysts. Data formats that are difficult to parse or complex to understand should be discouraged. A transparent data format should not limit the visibility of the data to only those with very specialized tools or expertise.
  4. The data provided must be useful. This means that the most important characteristics of the data must be described in ways that allow it to be interpreted by a computer without too much work. For instance, important entities described by the data should be marked in ways that are easily found and characterized – preferably using broadly accepted open standards.
  5. The data must be easy to find. This means that the location at which data resides should be predictable, understandable, permanent, and reliable. It should reflect the nature of the data rather than the implementation of the website serving the data. URLs should be designed rather than simply falling out of the implementation (see the example after this list).
  6. The data should be as raw as possible – but still comprehensible. This means that the data should have undergone as little processing as possible. The more that data is transformed, interpreted, or rearranged, the less like the original data it becomes. Processing data invariably damages its integrity, whether intentionally or unintentionally. There will always be some degree of healthy mistrust in data that has been over-processed.
  7. The data should be interactive. This means that it should be possible to search the data at its source – through both simple text search and more sophisticated data queries. It also means that whenever data is published, there should be an opportunity for the consumer to respond back – be it simple feedback, a formal request for change, or some other type of two way interaction.
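
To make the fifth point concrete, here is a small contrast between a URL that simply falls out of an implementation and one that is designed around the citation itself. Both URLs are hypothetical.

    # Fallout from the implementation: opaque and tied to one system's internal ids
    bad_url = "https://example.gov/app/docview.aspx?docid=83712&fmt=3"

    # Designed: predictable from the citation, so it can outlive any
    # particular website that happens to serve it
    def citation_url(title, section):
        return "https://example.gov/uscode/title{}/section{}".format(title, section)

    print(citation_url(42, "1983"))  # https://example.gov/uscode/title42/section1983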

How can this all be achieved for legislative data? This is the problem we are working to solve. We’re taking a holistic approach by designing data models that are easy to understand and can be applied throughout the data life cycle. We’re striving to limit data transformations by designing our data models to present data in ways that are both understandable to humans and computers alike. We are defining URL schemes that are well thought out and could last for as long as URLs are how we find data in the digital era. We’re defining database solutions that allow data to not only be downloaded, but also searched and queried in place. We’re building tools that will allow the data to not only be created but also interacted with later. And finally, we’re working with standards bodies such as the LegalDocML and LegalCiteM technical committees at OASIS to ensure well-thought-out worldwide standards such as Akoma Ntoso.

Take a look at Title 1 of the U.S. Code. If you’re using a reasonably modern web browser, you will notice that this data is very readable and understandable – it’s meant to be read by a human. Right-click with the mouse and view the source. This is the USLM format that was released a year ago. If you’re familiar with the structure of the U.S. Code and you’re reasonably XML savvy, you should feel at ease with the data format. It’s meant to be understandable to both humans and to computer programs trying to analyze it. The objective here is to provide a single simple data model that is used from initial drafting all the way through publishing and beyond. Rather than transforming the XML into PDF and HTML forms, the XML format can be rendered into a readable form using Cascading Style Sheets (CSS). Modern XML repositories such as eXist allow documents such as this to be queried as easily as you would query a table in a relational database – using a query language called XQuery.
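
As a quick illustration of what “understandable to computer programs” buys you, here is a sketch that lists the section numbers and headings of Title 1 using nothing but the Python standard library. The local file name is an assumption, and the element names reflect my reading of the USLM schema, so verify them against the published schema.

    import xml.etree.ElementTree as ET

    USLM = "http://xml.house.gov/schemas/uslm/1.0"  # USLM namespace

    # Assumes Title 1 of the U.S. Code has been downloaded locally as usc01.xml
    tree = ET.parse("usc01.xml")
    for section in tree.iter("{%s}section" % USLM):
        num = section.find("{%s}num" % USLM)
        heading = section.find("{%s}heading" % USLM)
        if num is not None and heading is not None:
            print("".join(num.itertext()).strip(),
                  "".join(heading.itertext()).strip())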

This is what we are doing – within the umbrella of legislative data. It’s a start, but ultimately there is a need for a broader solution. My hope is that government agencies will be able to come together under a common vision for how our information should be created, published, and disseminated – in order to fulfill their evolving transparency mandates efficiently. As government agencies replace old systems with new systems, they should design around a common open framework for transparent data rather than building new systems in the exact same footprint as the old systems that they demolish. The digital era and the transparency mandates that have come with it demand new thinking far different from the thinking of the paper era, which is now drawing to a close. If this can be done, then true data transparency can be achieved.


What is Transparency?

I’ve been thinking a lot about transparency lately. The disappearance of Malaysia Airlines Flight 370 (MH370) provided an interesting case to look at – and some important lessons. Releasing data which requires great expertise to decipher isn’t transparency.

My boss, when I worked on process research at the Boeing Company many years ago, used to drill into me the difference between information and data. To him, data was raw – and meaningless unless you knew how to interpret it. Information, on the other hand, had the meaning applied so you could understand it – information, to him, was meaningful.

Let’s recall some of the details of the MH370 incident. The plane disappeared without a trace – for reasons that remain a mystery. The only useful information, after radar contact was lost, was a series of pings received by Inmarsat’s satellite. Using some very clever mathematics involving Doppler shifts, Inmarsat was able to use that data to plot a course for the lost plane. That course was revealed to the world and the search progressed. However, when that course failed to turn up the missing plane, there were increasingly angry calls for more transparency from Inmarsat – to reveal the raw data. Inmarsat’s response was that they had released the information, in the form of a plotted course, to the public and to the appropriate authorities. However, they chose to withhold the underlying data, claiming it wouldn’t be useful. The demands persisted, primarily from the press and the victims’ families. Eventually Inmarsat gave in and agreed to release the data. With great excitement, the press reported this as “Breaking News”. Then, a bewildered look seemed to come across everyone and the story quickly faded away. Inmarsat had provided the transparency in the form it was demanded, releasing the raw data along with a brief overview and the relevant data highlighted, but it still wasn’t particularly useful. We’re still waiting to hear if anyone will ever be able to find any new insights into whatever happened to MH370 using this data. Most likely though, that story has run its course – you simply need Inmarsat’s expertise to understand the data.

There is an important lesson to be learned – for better or worse. Raw data can be released, but without the tools and expertise necessary to interpret it, it’s meaningless. Is that transparency? Alternatively, raw data can be interpreted into meaningful information, but that opens up questions as to the honesty and accuracy of the interpretation. Is that transparency? It’s very easy to hide the facts in plain sight – by delivering them in a convoluted and indecipherable data format or by selectively interpreting them to tell an incomplete story. How do we manage transparency to achieve the objective of providing the public with an open, honest, and useful view of government activities?

Next week, I want to describe my vision for how government information should be made public. I want to tackle the conflicting needs of providing information that is both unfiltered yet comprehensible. While I don’t have the answers, I do want to start the process of clarifying what better transparency is really going to achieve.


2014 Legislative Data and Transparency Conference

Last week, I attended the 2014 Legislative Data and Transparency Conference in Washington D.C. This one-day conference was put on by the U.S. House of Representatives and was held at the U.S. Capitol.

The conference was quite gratifying for me. A good part of the presentations related to projects my company, Xcential, and I are working on. This included Ralph Seep’s talk on the new USLM format for the U.S. Code and the Phase 2 Codification System, Sandy Strokoff’s mention of Legislative Lookup and Linking and the new Amendment Impact Program (which was demonstrated by Harlan Yu (@harlanyu)), Daniel Bennett’s (@citizencontact) talk on the legal citation technical committee, and finally a talk by Monica Palmirani (@MonicaPalmirani) and Fabio Vitali, from the University of Bologna, about Akoma Ntoso and LegalDocML. We’ve made an awful lot of progress over the past few years.

Other updates included:

  • Andrew Weber (@atweber) gave an update on the progress being made by the Congress.gov website over at the Library of Congress. While still in beta, this site has now essentially replaced the older Thomas site.
  • There was an update on modernization plans over at the Government Printing Office including increased reliance on XML technologies. While it is good to see the improvements planned, their plans didn’t seem to be well integrated with all the other initiatives underway. Perhaps this will be clearer when more details are revealed.
  • Kirsten Gullickson (@GullicksonK) gave an update on the developments over at the Office of the Clerk of the House.

One very interesting talk was the winner of the Library of Congress Data Challenge, Jim Mangiafico (@mangiafico), describing his work building a transform from the existing Bill XML into Akoma Ntoso.

After lunch, there was a series of flash talks on various topics.

Later in the afternoon, Ali Ahmad (@aliahmad) chaired a panel discussion/update on the DATA Act and its implications for the Legislative Branch. Perhaps it’s still too early to really understand what the effect of the DATA Act will be – it looks like it is going to have a slow rollout over the next few years.

Anne Washington chaired a panel discussion on bringing the benefits of paper retrieval to electronic records. This discussion centered around a familiar theme, making data useful to someone who is searching for it – so it can be found, and when it is found, used.

All in all, it was a very useful day spent at the nation’s Capitol.
