This week, I joined three outside experts to co-author a paper addressing a unique opportunity in the federal government, titled Data Powered Leadership Reform: A Business Case for Federal Operational Improvements Enabled by Quality Data. Senior federal leaders are currently responding to a rare policy opportunity to address persistent structural management challenges in federal agencies.
Responding to the President’s Government-wide Agency Reorganization Plan
This March, the President issued the Executive Order on a Comprehensive Plan for Reorganizing the Executive Branch. The government-wide executive order provides much-needed political cover to tackle fundamental challenges and calls for a plan to bring “efficiency, effectiveness, and accountability” to executive agencies. The resulting directive from the Office of Management and Budget (OMB) (Comprehensive Plan for Reforming the Federal Government and Reducing the Federal Civilian Workforce, M-17-22) presents the roadmap for ambitious federal leaders to dramatically alter how business is conducted across the federal government.
In this management order, OMB requires each agency to assess internal business line functions by considering factors like duplication, essentiality, appropriateness of federal ownership (vs. State, local, or private-sector), cost-benefit considerations, efficiency and effectiveness, and customer service goals (see the Table on page 6 of M-17-22). Each agency then owes OMB reorganization plans this fall as part of its FY 2019 budget request. We will start to see the results detailed publicly when the President releases the FY 2019 budget request in February 2018. According to OMB:
The Government-wide Reform Plan will encompass agency-specific reforms, the President’s Management Agenda and Cross-Agency Priority Goals, and other crosscutting reforms. The final reforms included in the Government-wide Reform Plan and the President’s FY 2019 Budget should be reflected in agency strategic plans, human capital operating plans, and IT strategic plan. Agencies will begin implementing some reforms immediately while others will require Congressional action. (see item 7 on page 5 of M-17-22)
If your organization provides products, services, or solutions to the federal government, then you need to be tracking this process. The following graphic breaks down the timeline in detail.
Focusing on Quality Operational Data is the First Step
Our paper, summarized on Nextgov, highlights a fundamental challenge in leading complex, human-powered bureaucratic systems – inadequate operational or material data. We believe that such considerations need to be a fundamental part of this government-wide reorganization process.
Our paper starts by defining the business case for such reforms, and puts this in context of senior agency officials’ daily workflows. We walk through ten specific management challenges such as structural complexity, management feedback loops, the importance of citizen engagement, and the crucial role of political oversight.
Common to all of these business cases is the problem of poor data, both operational (i.e., mission-agnostic data that represent the resources, decisions, transactions, outputs, and outcomes of work) and material (i.e., mission-specific data that represent persons, places, and things).
Of course, both operational and material data must also be of high quality to be useful, which means accurate, consistent, and controlled (see this 2016 White House open data roundtable briefing paper as well as the CIO.gov Open Data Principles).
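What “accurate, consistent, and controlled” checks might look like in practice can be sketched in a few lines of code. This is purely illustrative – the field names, account codes, and rules below are invented, not drawn from any real agency schema:

```python
# Illustrative sketch (not any agency's real schema): simple quality checks
# for "accurate, consistent, controlled" operational records.

records = [
    {"account": "075-0512", "fiscal_year": 2017, "obligated": 1_250_000.0},
    {"account": "075-9999", "fiscal_year": 2017, "obligated": -50.0},
]

# "Controlled" means values come from an agreed vocabulary (hypothetical here).
VALID_ACCOUNTS = {"075-0512", "075-0520"}

def quality_issues(record):
    """Return a list of rule violations for one record."""
    issues = []
    if record["account"] not in VALID_ACCOUNTS:
        issues.append("account code not in controlled list")
    if record["obligated"] < 0:
        issues.append("negative obligation amount")
    return issues

for r in records:
    for issue in quality_issues(r):
        print(f"{r['account']}: {issue}")
```

The point is not the specific rules but that rules like these can only be automated once everyone reports against the same controlled structure.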
For example, the DATA Act represents an incredibly valuable government-wide operational data set.
Funding is a factor common to every federal program. If we can see how money flows through the federal agencies – accurately, consistently, and comprehensively – then we can illuminate an accurate picture of how the government functions (more here). This is the true value of the DATA Act.
If we focus on building out accurate, consistent, and controlled data, we can start to fix the structural conditions and help federal leaders champion tangible reforms.
Specific Recommendations for this Administration That Don’t Require Legislation
This Administration is providing the policy environment to accomplish this. But it will require diligence, ingenuity, and coordinated political willpower to achieve any success.
That is why we encourage primary reliance on high-quality data in government-wide management. It is something leaders can immediately agree on while leveraging existing efforts.
Our paper provides the following recommendations:
1. OMB should adopt the DATA Act Information Model Schema (DAIMS) as the primary government-wide operational data format to align various agency business functions. With over 400 unique data elements, the DAIMS represents the most comprehensive and unified schema of federal operations in US history. The DAIMS links budget, accounting, procurement, and financial assistance datasets that were previously segmented across agency systems and databases.
2. OMB should adopt and seek to codify the governance body of the National Information Exchange Model (NIEM) and encourage the schema’s use as the primary government-wide material data format to facilitate inter-agency and state-local records exchange around shared missions.
3. OMB’s initiative to adopt a government-wide Technology Business Management (TBM) taxonomy, to enable standardized federal technology investment data, should be celebrated.
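The kind of alignment the first recommendation describes can be sketched concretely. The field names below are invented for illustration – they are not the real DAIMS elements – but the mechanism is the same: a shared key (here, a Treasury Account Symbol) lets previously siloed datasets be joined into one view:

```python
# Illustrative sketch only: field names are invented, not the real DAIMS
# elements. A shared account key joins previously siloed budget and
# award datasets into one unified view.

budget = [
    {"tas": "075-0512", "budget_authority": 5_000_000.0},
    {"tas": "075-0520", "budget_authority": 2_000_000.0},
]
awards = [
    {"tas": "075-0512", "award_id": "GRANT-001", "obligated": 1_200_000.0},
    {"tas": "075-0512", "award_id": "CONT-042", "obligated": 800_000.0},
    {"tas": "075-0520", "award_id": "GRANT-007", "obligated": 500_000.0},
]

def awards_by_account(budget_rows, award_rows):
    """Group awards under the budget account that funded them."""
    view = {b["tas"]: {"budget_authority": b["budget_authority"], "awards": []}
            for b in budget_rows}
    for a in award_rows:
        view[a["tas"]]["awards"].append(a["award_id"])
    return view

unified = awards_by_account(budget, awards)
print(unified["075-0512"]["awards"])  # ['GRANT-001', 'CONT-042']
```

Before a common schema, each of these datasets lived in a different system with no shared key, so a join like this was impossible.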
The outlined recommendations are just a starting point for how the Administration, Congress, and federal agencies can truly modernize! I strongly encourage all stakeholders to get behind these crucial data initiatives.
On June 26th, our DATA Act Summit set records: our best-attended event ever (738 registrations), with the highest number of speakers we’ve ever featured (66, including six Members of Congress) and the most exhibitors we’ve ever hosted (25).
But the really important number is $3.85 trillion – the amount of U.S. federal dollars spent in 2016 and now tracked and published as open data. The DATA Act’s main deadline finally arrived last May: every federal agency began reporting spending using the same government-wide data format, and the Treasury Department combined their submissions into a single, unified open data set, for the first time in history.
At this fourth annual DATA Act Summit, we no longer had to point to the future and predict the ways open spending data would benefit government and society. The future had come and the benefits were all around us – a world of new ways to visualize, analyze, and automate information about how taxpayers’ money is used.
But we are never going to do this again.
Last month’s DATA Act Summit, presented by Booz Allen Hamilton, was the final one.
Here’s what we learned, and why we will never host another DATA Act Summit.
For government management, this new data set is the center of everything.
Who’s using the new data set? SBA CFO Tim Gribben, acting DHS CFO Stacy Marcott, NRC CFO Maureen Wylie, and HHS Deputy Assistant Secretary Sheila Conley, to name a few.
Congress may have passed the DATA Act unanimously out of a desire to deliver transparency to American taxpayers. But the real beneficiaries of the law’s mandate for agencies to standardize and publish their spending information are the agencies themselves.
Under the DATA Act, the Treasury Department created a single data structure, the DATA Act Information Model Schema, or DAIMS, that brings budget actions, account balances, grants, contracts, and loans into a single view. The DAIMS is the first, and only, multi-agency, multi-function data structure in the entire government and currently tracks over 400 unique data elements.
Now that they’ve invested the time and effort to translate their disparate, far-flung spending compilations into the DAIMS, agencies can visualize and analyze their finances in new ways.
In just one day, we learned that Department of Homeland Security leaders intend to use the new data set to target which areas of the vast agency need the most human capital investment – because, for the first time, they can see salary spending by sub-agency and by account. The Nuclear Regulatory Commission will use the new data set to compile its Congressional budget request. And a Health and Human Services IT executive predicted that her department will be able to immediately understand the full scope of resources devoted to combating a large-scale event, like an epidemic.
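The DHS-style question – which sub-agency spends the most on personnel? – becomes a trivial aggregation once the data is standardized. The component names and dollar figures below are made up for illustration:

```python
# Hedged illustration: invented amounts and hypothetical component rows,
# sketching the question "which sub-agency spends the most on salaries?"

from collections import defaultdict

rows = [
    {"component": "CBP",  "object_class": "Personnel", "amount": 4_100.0},
    {"component": "TSA",  "object_class": "Personnel", "amount": 3_300.0},
    {"component": "CBP",  "object_class": "Equipment", "amount": 900.0},
    {"component": "FEMA", "object_class": "Personnel", "amount": 1_200.0},
]

def salary_by_component(spending_rows):
    """Sum personnel spending for each sub-agency component."""
    totals = defaultdict(float)
    for row in spending_rows:
        if row["object_class"] == "Personnel":
            totals[row["component"]] += row["amount"]
    return dict(totals)

print(salary_by_component(rows))
```

Before standardization, answering this required a data call to every component; afterward, it is one pass over one data set.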
For the first time, governmentwide executives at the White House have a single, unified view of the scores of billions of dollars spent on software and systems. “The importance of being able to describe that cost cannot be overstated,” said acting federal Chief Information Officer Margie Graves.
Inspectors general at every agency are revving up their data analytics operations – because the new data set gives them a new source for indicators of waste, fraud, and abuse. So is Congress, said Sen. Rob Portman, chairman of the Senate Permanent Subcommittee on Investigations: “We’ll use this data. We’re happy to have it.”
And more uses are coming! The winners of last April’s DATA Act Hackathon showed how the new data set can be used for evidence-based policymaking, tracking the localized impact of grants, scrutinizing procurement, and groundbreaking analytics.
The more the DAIMS is expanded, the more data is put into this unified view, and the more useful the data set will become. Congress must amend the DATA Act to require the DAIMS to go into more granular detail about grants and contracts – right now, only summaries of each award are part of the structure.
The Treasury Department has shown us the best way to run government-wide projects.
The Treasury Department’s all-female DATA Act implementation team, led by Deputy Assistant Secretary Christina Ho (not pictured), delivered the first-ever government-wide picture of federal spending – on time and under budget.
Presidential administrations as far back as Jefferson's have been demanding a single, “consolidated” view of the federal government’s finances – so that “every member of Congress, and every man of any mind in the Union, should be able to comprehend them, to investigate abuses, and consequently to control them.”
The DATA Act provided a mandate for the creation of this single view, using a government-wide data standard and a requirement for every agency to follow it.
But it fell to a small team at the Treasury Department, led by Deputy Assistant Secretary Christina Ho, to design the DAIMS, educate CFOs’ offices on how to translate disparate spending information into that common standard, and help all of them meet the May 2017 deadline – mostly without any extra funding.
Ms. Ho and her team succeeded beyond expectation. The project “was on time, it was under budget, and it delivered on its promise. Not many government projects can say that,” said GSA Technology Transformation Service commissioner Rob Cook.
How did they do it? The Treasury team, assisted by specialists from the General Services Administration’s 18F tech development group, conducted the first-ever government-wide agile project. Instead of designing the DAIMS all at once, Treasury "produce[d] successive versions of the schema that incorporate[d] regular feedback from experts across the various communities.”
Fiscal Assistant Secretary David Lebryk, to whom Ms. Ho reports, compared the DATA Act project favorably to Treasury’s 2013 roll-out of the Governmentwide Treasury Account Symbol (GTAS) account reporting system. “We were able to do something in six months that took us four years using a traditional design process—at a fraction of the cost,” Lebryk said.
Next to be transformed? Grantee reporting.
This is the future of federal grantee reporting.
The first version of the DATA Act introduced in Congress in 2011 was bolder than what finally became law in 2014.
The original bill would not just have standardized federal agencies’ spending information. It would have transformed the whole ecosystem of federal grantee reporting, too. The 2011 proposal would have set up a governmentwide data structure to modernize reporting by grant recipients.
The final law stepped back from this vision – and, instead, set up a pilot program to test whether standardized data might help grantees reduce their compliance burden. The pilot program, conducted by the Department of Health and Human Services, ended last May, and the White House Office of Management and Budget is going to issue a report to Congress next month to say whether data standardization is a good idea.
At the DATA Act Summit, three panels of experts on grantee and nonprofit compliance told our audience that the grantee reporting ecosystem needs governmentwide data standards.
Kerry Neal, deputy director of the Environmental Protection Agency’s grants office, shared a vision of “seamless integration” from grant application, to award, to disbursement, to performance reporting. Today, federal grantees are subject to a hailstorm of duplicative reporting requirements, each involving expensive manual compliance processes. If the government adopted a common data structure for all those reports, software could automate this burden.
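The “report once, reuse everywhere” idea behind that vision can be sketched simply. Everything here is hypothetical – the record fields, the form names, and the award ID are invented – but it shows how one standardized grantee record could feed several duplicative reports automatically:

```python
# Sketch of "report once, reuse everywhere": one standardized grantee
# record feeds several agency report formats automatically. All field
# names, form names, and identifiers here are hypothetical.

grantee_record = {
    "recipient": "Springfield Water Authority",
    "award_id": "EPA-2017-0042",
    "period": "2017-Q1",
    "expended": 84_500.0,
}

def federal_financial_report(rec):
    # One of several duplicative reports that could be auto-filled.
    return {"form": "FFR", "award": rec["award_id"],
            "period": rec["period"], "outlays": rec["expended"]}

def performance_report(rec):
    return {"form": "Performance", "award": rec["award_id"],
            "recipient": rec["recipient"], "period": rec["period"]}

# The grantee enters the data once; software emits each required report.
reports = [federal_financial_report(grantee_record),
           performance_report(grantee_record)]
for r in reports:
    print(r["form"], r["award"])
```

Today each of those reports is compiled by hand against a different format; with a common data structure, the translation is a function, not a compliance staff.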
Grantee reporting is the next frontier of data standardization – and the discussions at the Summit laid the foundation we’ll need to get it done.
The DATA Act Summit will never happen again.
Booz Allen Hamilton Vice President Bryce Pippert, lead sponsor, closes the Summit.
So why end a good thing?
Because the DATA Act is done. Thanks to the hard work of advocates in Congress and visionaries in the executive branch, standardized and open data is now the centerpiece of the federal government’s financial management.
The work of data standardization is not done. The DAIMS must be expanded to include more of the government’s management information, beyond the basic spending operations it currently covers so well.
As we campaign for future legislative reforms to expand the DAIMS, our programming will expand, too. Expect events focusing on spending and performance data, spending and grant reporting data, and spending and oversight data.
At the Data Coalition, we've got to keep on moving! Thank you for joining us on this journey.
Today, for the first time in history, the U.S. federal government’s spending information is one single, unified data set.
Under a deadline set by the DATA Act of 2014, today every federal agency must begin reporting spending to the Treasury Department using a common data format. And Treasury has published it all online, in one piece, offering a single electronic view of the world’s largest organization.
Until today, different types of federal spending information were all tracked in different ways and reported to different places. Agencies reported their account balances to Treasury, budget actions to the White House, contracts to GSA, and grants to the Awards Data System.
But today, these agencies are reporting all of this information to a new database at Treasury, and Treasury is reporting it to you.
Until today, if you wanted to view the federal government’s account balances, you would have to file a Freedom of Information Act request with every agency. Even if you did that, you wouldn’t be able to figure out which grants and contracts were paid from which accounts.
But today, every agency is linking its accounts, budget actions, grants, and contracts together, showing which grants and contracts are paid from where. Here's an interactive picture of it all. And here's the data set, ready to download. Try it!
Why does this matter?
In 1804, President Thomas Jefferson wrote to his Treasury secretary, Albert Gallatin, that the government’s finances had become too complex for Congress to understand – allowing spending and debt to rise out of control.
Jefferson hoped that the scattered “scraps & fragments” of Treasury accounts could be brought into “one consolidated mass,” easier to understand, so that Congress and the people could “comprehend them … investigate abuses, and consequently … control them.”
Jefferson’s goal was never fully realized – not until today.
This is what Thomas Jefferson told his Treasury Secretary to create.
Congress and the White House continued to track spending by appropriation and budget, while federal agencies developed their own complex accounting methods. In 1990, federal agencies began publishing regular financial statements, summarizing all their accounts, but not providing detail. In 2006, then-Senator Barack Obama and Senator Tom Coburn sponsored a law requiring a summary of every federal grant and contract to be published online.
Even after the reforms of 1990 and 2006, these records of accounts, budgets, grants, and contracts all remained segregated from one another, and could not be connected into “one consolidated mass” – not until today.
Today’s data set brings all that information together in one piece, and links it. We can see how budget actions, account balances, and grant and contract awards all relate to each other.
Starting today, we can finally run data analytics across the whole government, all agencies, to illuminate waste and fraud. (In Washington, federal leaders got a first taste of this at the first-ever DATA Act hackathon, two weeks ago.)
Starting today, we can track the economic impact of Congress’ spending decisions, because we can finally match laws Congress passes to the grants and contracts that are awarded under those laws.
Starting today, the federal government can operate as one enterprise, the way private-sector companies do, because its dozens of agencies’ thousands of financial systems are all speaking the same language.
Last month, former Microsoft CEO Steve Ballmer announced that he had invested $10 million and years of effort into USAFacts.org, a new attempt to create one picture of government spending. Ballmer’s team had to combine – manually – budget information from the White House, financial statements from the Federal Reserve, and state and local sources. USAFacts.org didn’t even try to integrate grant and contract details; there was no way to link them.
If Ballmer had just waited a month, his team would have found much of their work – at least the federal part – already done, in the new data set.
The data set isn’t perfect (much more on that later), but it really is “one consolidated mass.”
How did this happen?
Six years of legislating, lobbying, courage, coding, and cajoling – that’s how.
First came the legislating. In June 2011, Congressman Darrell Issa and Senator Mark Warner introduced the DATA Act. Their goal? “Standardizing the way this information is reported, and then centralizing the way it’s publicly disclosed,” said Warner.
Issa and Warner were right: data standards were, and are, the key to transforming the chaos of federal spending into “one consolidated mass.” If federal agencies all used the same data format to report their different kinds of spending information, then it could all be brought into one picture.
But the data format didn’t exist. Issa and Warner proposed to require the executive branch to create one.
The DATA Act earned early support in the House, where Issa chaired the Oversight Committee, but went nowhere in the Senate. Data standardization was not the first issue on most Senators’ minds.
Then came the lobbying. In 2012, I resigned from Rep. Issa’s Oversight Committee staff to start what was then called the Data Transparency Coalition, the first, and still only, open data trade association. Our first mission: rally tech companies to support the DATA Act.
Tech companies have plenty of self-interest to support reforms like the DATA Act. As the government starts publishing its information in standardized formats, analytics software gets a lot more valuable.
Still, the Coalition didn’t grow very fast. The payoff for our efforts – a unified data set covering all federal spending – was years in the future (today!), and so were most of the business opportunities. Our member companies were signing up to support a long-term vision, which isn’t a natural use for marketing budgets.
We hosted our first DATA Act Demo Day, then our second. Sarah Joy Hays came on board and pulled off a spectacular first-ever open data trade show, Data Transparency 2013, with credentials and keynotes and exhibit booths and everything – then four more.
Thanks to Warner’s persistence, support from the Sunlight Foundation and civil society, and our new tech-industry push, things began to happen in the Senate. Sen. Rob Portman signed on as a cosponsor and the crucial Homeland Security and Governmental Affairs Committee started to get interested in data standardization.
But courage would be required, especially Warner’s.
Behind the scenes, the Obama White House did its best to sink the bill. This was surprising. President Obama was a strong public supporter of open data in government. His Open Data Policy directed all federal agencies to standardize and publish all their information as open data.
But his White House Office of Management and Budget wasn’t on board. OMB didn’t want the challenge of standardizing all spending information, nor did OMB want anyone else to do the job. OMB recommended changes to the DATA Act that used nice words but would have gutted its mandate.
But Warner stood up to the White House. He rejected the proposed changes and kept the bill strong.
A few months later, both chambers of Congress unanimously passed the DATA Act. And on May 9, 2014, three years ago today, President Obama signed it into law, very quietly.
With the law on the books, a coding countdown began. The Treasury Department had one year to come up with a common data format for government spending information – the chaotic, fractured financial, grant, and contract details spread across thousands of systems that had never before been coordinated.
Treasury also had to figure out how, exactly, agencies would deliver their data using that common format. Nobody had ever before created a system like what was needed.
Most government management laws die like this: Congress passes a law and issues some celebratory press releases. The White House, or GSA, or Treasury sets up committees and procedures to do the work. But the work turns out to be hard and complicated, and nobody in the administration really wants to do it – they’re acting because Congress told them to. As soon as Congress’ attention moves on to other topics, the bureaucrats write reports pretending the work has been done. Or, better yet, the project is combined with another one, it changes ownership several times, and the law’s original goals are gradually forgotten.
The DATA Act avoided this fate – largely because of one person.
At Treasury, Deputy Assistant Secretary Christina Ho had already been trying to standardize spending data. (Christina was the first to find the Jefferson letter I quoted earlier, in fact.)
Once the DATA Act became law, she was put in charge of implementing it, and she made up her mind that this time would be different.
Christina assembled a team that shared her ambition and understood why we needed a unified data set covering all spending. They got to work.
Christina’s team created the data format: the DATA Act Information Model Schema, or DAIMS, which defines the common data fields of federal spending and shows how they relate to one another.
They did this work in the open, in public, using the GitHub coding platform to take suggestions from the whole world and show their choices. Nothing like this had been done in government before.
They announced the DAIMS on May 8, 2015, one day before the deadline. That triggered a second countdown: all agencies had to report spending data by May 9, 2017.
And to help agencies deliver their information, Christina recruited the 18F technology development center at the General Services Administration. 18F built the DATA Act Broker, a piece of open-source software that collects and validates spending data from every agency. They built it using Agile methodology, with constant testing and revision.
Here is the code of the DATA Act Broker; download it if you want.
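The Broker's job – collecting files from agencies and validating them against the DAIMS before acceptance – rests on cross-file rules. The sketch below is not the Broker's actual code or its real rule set; it is a simplified, invented example of the kind of check such a tool applies, here that award-level obligations should not exceed what the account-level file reports:

```python
# Not the Broker's actual code: a simplified, hypothetical sketch of a
# cross-file validation rule. Award-level obligations should not exceed
# the obligations reported in the account-level file.

account_file = {"075-0512": 2_000_000.0}        # account -> total obligations
award_file = [("075-0512", 1_200_000.0),        # (account, award obligation)
              ("075-0512", 900_000.0)]

def validate(accounts, awards):
    """Return human-readable errors for any rule violations."""
    errors = []
    totals = {}
    for tas, amount in awards:
        totals[tas] = totals.get(tas, 0.0) + amount
    for tas, total in totals.items():
        ceiling = accounts.get(tas)
        if ceiling is None:
            errors.append(f"{tas}: award references unknown account")
        elif total > ceiling:
            errors.append(f"{tas}: award obligations ({total:,.0f}) "
                          f"exceed account obligations ({ceiling:,.0f})")
    return errors

print(validate(account_file, award_file))
```

Validation like this is what turns a pile of agency submissions into a data set that can actually be trusted and linked.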
Nothing like this had been done in government before either.
But coding wasn’t enough. The DATA Act’s supporters outside the government, and Christina’s team inside, had to do a great deal of cajoling.
Even with the DAIMS providing a standard structure for all government spending information, and a DATA Act Broker easing the process, the law didn’t really have teeth.
There were no penalties for agencies that didn’t report standardized spending data. And OMB made it clear that the Obama administration didn’t really care whether they did or didn’t.
OMB couldn’t, or wouldn’t, create a list of the agencies required to comply. OMB tried to claim that most of the DAIMS wasn’t really required by the law – in order to shut it down later. OMB insisted on a weaker DAIMS than Treasury wanted, in which financial information comes right from source systems, but grant and contract information doesn’t.
With a lack of leadership from the White House, we had to push agencies toward compliance in other ways.
First, a few agencies started to realize that standardizing their spending information would make their own work easier, and so we celebrated them at our events.
The Small Business Administration was the first, and best. Chief Financial Officer Tim Gribben used the DAIMS to visualize which SBA grants were being paid from which of its accounts, and plot them on a map. This would have required a bunch of data calls before the DATA Act. Now, it was automatic.
In 2015, over 600 people participated in our DATA Act Summit and saw demonstrations of what leaders like Tim were doing. Ditto in 2016.
Second, Congressional committees stayed involved, instead of moving on. The House Oversight Committee held four hearings focusing on the DATA Act. Behind the scenes, we stayed in touch with committee staff and Members, delivering intelligence and describing the law’s long-term vision.
Every year, we brought tech companies to Capitol Hill to remind Congress why the DATA Act was important.
Members of Congress publicly rebuked OMB for slow-walking the DATA Act, and told the agencies they’d celebrate compliance.
Rep. Mark Meadows even did his own DATA Act software demonstration – on our stage. Members of Congress don’t usually do demos.
Third, we worked to spread the word about the DATA Act’s benefits to the people who’d have to do the work – especially federal financial management professionals, who’d have to report the data, and inspectors general, who’d have to audit it.
In 2016 we founded the Data Foundation, a new nonprofit research organization. Its first piece of research, The DATA Act: Vision & Value, which we co-published with MorganFranklin Consulting, told federal agencies why the DATA Act mattered.
The cajoling worked. Not every agency is going to make today’s deadline, but almost all of them will – and even the worst ones are submitting partial reports.
And we’ll keep cajoling until all reports are in.
What comes next?
The data set is live. Now, it sure had better get some use! If the data set is used for antifraud analytics, internal management, and public transparency, especially by the federal agencies themselves, its quality will get better and better.
At next month’s fourth annual DATA Act Summit, we’ll highlight the agencies, tech companies, and coders who are doing the most amazing things with this new resource. We’ll celebrate the winners of last month’s DATA Act hackathon too.
We’re not out of the woods yet.
Last week, the Data Foundation’s new report with Deloitte, DATA Act 2022, described the six main challenges to the DATA Act’s success. We need to spend the next five years dealing with those.
What are the challenges? The most serious is that DATA Act reporting is running alongside old-fashioned, non-standardized reporting. Agencies still have to report the same information using documents and non-standardized legacy databases like the FPDS, even as they comply with the new DATA Act mandate.
As long as that happens, there’s a danger that agencies will see the legacy databases as the main system, and the DATA Act as an add-on.
Congress needs to kick the stool out from under this duplication, and direct the government to make the DATA Act the main, and eventually the only, way that spending is reported. DATA Act 2022 explains how.
The second-most-serious is that the government continues to use the DUNS Number to identify grantees and contractors. The DUNS Number is owned by Dun & Bradstreet. Dun & Bradstreet has a monopoly, protected and profitable, on spending data. Until that monopoly is broken, the private sector won’t be able to take full advantage of the data set.
Passing the DATA Act and getting agencies’ spending data took six years. Fully realizing its vision will take many years more.
But every moment has been worth it. Every moment will be worth it. A unified federal spending data set makes our democracy better, in so many ways.
Today, we thank the Data Coalition’s members and Data Foundation’s supporters, without whom none of our work would have been possible.
And today, we celebrate Darrell Issa, Mark Warner, Christina Ho, Tim Gribben, and all the other leaders who caught Jefferson’s dream of a single, unified federal spending data set, and didn’t let go.
In April, the Securities and Exchange Commission published a 341-page Concept Release exploring the future of corporate disclosure in the United States. Yesterday the Data Coalition responded.
Yesterday’s comment letter calls for the SEC to completely replace its current system of old-fashioned, plain-text disclosure documents with open data.
This is the Coalition’s third major appeal for the SEC to transform its disclosure system. We provided a detailed comment on the agency’s strategic plan in March 2014 and submitted a full road map for open data transformation to the agency’s Division of Corporation Finance in October 2015.
Last April’s Concept Release shows that we’re making progress! The Concept Release says (at page 249) that the SEC is thinking about transforming public companies’ subsidiaries disclosure – one of the documents that needs open data most desperately – into open data fields, just as we’ve recommended. It asks for suggestions (at page 255) about how to start using the global Legal Entity Identifier (LEI) to match public companies’ SEC filings with their filings to other agencies, just as we’ve recommended. And a whole section (section V.G) of the Concept Release asks how to fix the agency’s embattled open data reporting program for corporate financial statements, just as we’ve … you get the idea.
In fact, the Concept Release even cites our previous comment letter several times.
On the other hand, the Concept Release still assumes that the future of corporate disclosure will be based on documents, not on data.
The Concept Release invests a lot of energy (questions 286-306!) asking for suggestions about how to use cross-referencing, incorporation by reference, and hyperlinks in corporate disclosures, to save investors from having to read lengthy documents. Open data makes these techniques unnecessary! If all corporate disclosure information were published as open data, then companies like idaciti and Bloomberg could deliver that information to investors at whatever level of detail they want. With open data, the SEC won’t have to worry any more about snowing investors with too much detail.
The Concept Release also asks (starting at page 318) if the SEC should prescribe new graphic layouts for corporate disclosures. In an open data world, there’s no need for the government to prescribe how corporate information needs to look on a page.
To make progress on corporate disclosure, the SEC needs to question its assumption that document disclosure is the wave of the future. Our comment letter says so, with lots of footnotes.
Read the Coalition’s full comment here.
In real life, document redlining is normal. If you’re a student or a knowledge worker, you probably use redlines in Microsoft Word or other tools to track changes and compare drafts.
But in Congress, document redlining is not normal.
On their way to death or passage – usually death – pieces of legislation are amended many times. You can track a bill’s progress on Congress.gov. But you can’t see how each version changed from the last one.
To redline a bill from its last version, you have to copy-paste both versions into Microsoft Word and run a comparison yourself. And that’s tricky, because page numbers and preambles and formatting don’t line up.
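The manual comparison described above is essentially what generic diff tools automate. A minimal sketch with Python’s standard difflib, using invented bill text (not actual legislative language):

```python
import difflib

# Two hypothetical versions of the same bill section. The text is
# invented for illustration.
introduced = [
    "SEC. 2. REPORTING REQUIREMENT.",
    "Each agency shall publish spending data quarterly.",
    "Reports shall be made available to the public.",
]
amended = [
    "SEC. 2. REPORTING REQUIREMENT.",
    "Each agency shall publish spending data monthly.",
    "Reports shall be made available to the public.",
]

# A unified diff is the plain-text cousin of a Word redline: lines
# prefixed with "-" were removed, lines prefixed with "+" were added.
redline = list(difflib.unified_diff(
    introduced, amended,
    fromfile="introduced", tofile="amended", lineterm=""))
print("\n".join(redline))
```

The hard part isn’t the diff algorithm; it’s that today’s bill versions are formatted documents, so the inputs to a comparison like this never line up cleanly.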
But Rep. Elise Stefanik (R-NY) wants to change that. How? Open data is how.
Rep. Stefanik, joined by Rep. Luke Messer (R-IN), introduced the Establishing Digital Interactive Transparency Act (EDIT Act) (H.R. 5493) on June 14th, 2016. The bill is currently pending in the House Committee on House Administration. If signed into law, the Library of Congress would be charged with implementation and would have one year to comply.
What will it do?
The main body of the EDIT Act is one short, very sweet sentence:
“In the operation of the Congress.gov website, the Librarian of Congress shall ensure that each version of a bill or resolution which is made available for viewing on the website is presented in a manner which permits the viewer to follow and track online, within the same document, any changes made from previous versions of the bill or resolution.”
Translation: everybody gets a redline!
Imagine having the ability to track and monitor changes to a bill from its initial conception, through committee markup, to the House of Representatives and Senate floor for amendments and voting, all the way to the President’s desk. This would create a truly transparent legislative process. But we aren’t there yet.
How do we get to automatic redlining?
There’s only one way to do this: open data.
Today the House and Senate use documents, not data, to track their magnificent paper trails. In order to create the sort of redline Rep. Stefanik wants, the House and Senate would have to adopt a standardized, open data format for all bills. Bills would have to be drafted, handled, and amended in that format.
The technology already exists.
Two years ago, the Clerk of the House and the Office of the Law Revision Counsel set out to create and apply an open data format in the U.S. House Modernization project. Data Coalition member company Xcential is helping to run it. Xcential’s software can natively draft and amend bills using the XML-based U.S. Legislative Model.
But there was no push for the Senate to embrace this project as well. Not, that is, until Rep. Stefanik came along. By requiring all bills to be redlinable on Congress.gov, the EDIT Act is also pushing the modernization effort that is already underway.
The EDIT Act is one piece in the broader effort to move the U.S. federal government’s technological capabilities into the 21st Century. The Digital Accountability and Transparency Act of 2014 (DATA Act) (PL 113-101), the nation’s first open data law, mandates that all U.S. federal spending information be standardized and published in an open, machine-readable format. Just as the DATA Act creates a common format for spending, the EDIT Act requires a common format for legislative materials.
The Statutes at Large Modernization Act (SALMA – H.R. 4006) is another relevant bipartisan bill. Introduced by Reps. Dave Brat (R-VA) and Seth Moulton (D-MA), it is currently pending in the House of Representatives. SALMA would require a common data format for the Statutes at Large.
What does this mean down the track for open data?
Imagine if spending data (under the DATA Act) and legislative data (after the EDIT Act and SALMA and other reforms) came together.
If Congressional appropriations committees coded their appropriations bills in a consistent open data format, and connected the same to the spending data format that the DATA Act requires – you’d have an electronic connection between every appropriation and the spending it funds.
You could electronically track the consequences of every Congressional spending decision – from appropriation to Treasury allocation to agency obligation, all the way down to payments to contractors and grantees. This is called the “life cycle” of federal spending.
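A hypothetical sketch of what that linkage would look like in practice: if appropriations and award records shared a common account identifier, spending could be rolled up to the appropriation that funded it. All identifiers and amounts below are invented for illustration.

```python
# Hypothetical sketch of the spending "life cycle": a shared account
# identifier connects each award back to its appropriation. All data
# here is invented.
appropriations = [
    {"account": "000-0000", "label": "Hypothetical Appropriations Act",
     "amount": 100_000_000},
]
awards = [
    {"account": "000-0000", "recipient": "Acme Grants LLC", "amount": 250_000},
    {"account": "000-0000", "recipient": "Roadworks Inc.", "amount": 400_000},
]

# Roll award-level spending up to the appropriation that funded it.
totals = {}
for appro in appropriations:
    totals[appro["account"]] = sum(
        a["amount"] for a in awards if a["account"] == appro["account"]
    )
print(totals)  # {'000-0000': 650000}
```

Today that join is impossible at scale, because appropriations bills and spending reports don’t share a common, machine-readable key.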
Such a connection would simplify the appropriations process. It would cut back on the hours staffers spend researching. It would give citizens the ability to hold their member of Congress accountable for all the consequences of every spending decision.
Achieving this whole picture is years away.
The immediate challenge is mandating a common data format for legislative materials so redlining is possible. The EDIT Act is one step in the right direction.
Therefore, the Data Coalition strongly supports the passage of the EDIT Act.
UPDATE: The OPEN Government Data Act was formally introduced by Reps. Kilmer and Farenthold and Sens. Sasse and Schatz on April 26, 2016. In a joint statement, the legislators said the bill would improve services in the public sector and support new discoveries in the private sector.
The Data Coalition, which represents the growing open data industry, was pleased to welcome the introduction of the Open, Public, Electronic, and Necessary (OPEN) Government Data Act by Representatives Derek Kilmer (D-WA) and Blake Farenthold (R-TX) last Thursday at a press event co-hosted with the Center for Data Innovation.
Sens. Brian Schatz (D-HI) and Ben Sasse (R-NE) also announced they’ll soon introduce a companion bill in the Senate. The bill text is available on Congress.gov and Rep. Kilmer’s office has published a section-by-section summary.
Simply put, the OPEN Government Data Act makes the President’s 2013 open data policy into law (see Presidential memo M-13-13). It directs all federal agencies to publish their information as machine-readable data, using searchable, open formats. It requires every agency to maintain a centralized Enterprise Data Inventory that lists all data sets, and also mandates a centralized inventory for the whole government – codifying the platform currently known as data.gov.
The Data Coalition readily endorsed this bill. “When government information is expressed as open data, it can be republished for better transparency outside government, analyzed for better management within government, and automated for cheaper reporting to government,” said Hudson Hollister, executive director of the Data Coalition, in the Coalition’s media statement. “Our Coalition members’ technologies can do all those things, but only if the information is expressed as open data instead of disconnected documents.”
Here are seven things you should know about the OPEN Government Data Act.
1. If you read nothing else, read Section 5!
The core of the OPEN Government Data Act is Section 5, which says:
To the greatest extent practicable, Government data assets made available by an agency shall be published as machine-readable data … To the greatest extent practicable when not otherwise prohibited by law, Government data assets shall … be available in an open format, and … be available under open licenses.
Section 5 makes it the official policy of the U.S. government that all information should be published as open data, using nonproprietary formats. Section 5’s broad mandate is very good news for the Data Coalition’s policy agenda.
Section 5 would make every failure to use open data – the SEC’s clinging to document-based corporate disclosures, Treasury and OMB’s continued use of the proprietary DUNS Number, the IRS’ reluctance to move to e-filing – legally questionable.
If the OPEN Government Data Act becomes law, the federal government will certainly not change its information collection and publication practices overnight, or automatically, to conform. Section 5 is much too broad to accomplish that.
But the Coalition and other open data supporters will be able to use the law in comment letters, hearings, and public advocacy to encourage faster change.
2. Bipartisan support means the OPEN Government Data Act has an excellent chance of becoming law.
The need for open data is an issue that crosses party lines in both the House and the Senate.
According to Rep. Kilmer, the OPEN Government Data Act “will empower the government to be more effective, private sector to innovate, and public to participate.” The breadth of potential uses of valuable public-sector data excited Rep. Farenthold, a former computer and web design consultant. Farenthold told Thursday morning’s audience: “We need to get the full value of what the government has done!”
Sen. Schatz said, “[In Washington] we don’t always agree on everything but this is about accountability and efficiency…we have found our common ground for the good of the people and of the economy.”
For the OPEN Government Data Act to move through Congress, it must next earn consideration by the committees of jurisdiction: the House Oversight and Government Reform Committee and the Senate Homeland Security and Governmental Affairs Committee. But the bill’s bipartisan footing, and Rep. Farenthold’s and Sen. Sasse’s seats on those two committees, give it an excellent chance.
3. Industry experts and nonprofit advocates say the OPEN Government Data Act is necessary – but not sufficient – to transform government.
At the event, Hollister moderated a panel discussion featuring Kat Duffy, director of Sunlight Labs, Tim Day, vice president of the U.S. Chamber of Commerce’s Center for Advanced Technology and Innovation, Jed Sundwall, head of open data at Amazon, and Joshua New, policy associate at the Center for Data Innovation.
Day pointed out that the OPEN Government Data Act is a “rational policy to create jobs and promote innovation,” as traditional manufacturing companies increasingly see themselves as technology and data companies. Duffy described the bill’s central finding that government “information presumptively should be available to the general public” (Sec. 2(a)(2)) as “magic language” that “should be inherent to the fabric of our country.” Sundwall simply remarked that the “availability of open data is kind of like oxygen” for Amazon and other web infrastructure companies.
The OPEN Government Data Act – if it becomes law, and if agencies follow the law – will unlock information that can drive new business opportunities for all sorts of companies, and new advocacy platforms for all sorts of nonprofits.
But while echoing the bill’s opening statement that “[g]overnment data is a valuable national resource” (Sec. 2 (a)(1)), New underscored that there are many potential users who don’t yet understand the potential of open data.
4. While Section 5 is sweeping and broad, Section 7 provides a practical mandate: agencies have to create and maintain inventories of their data assets.
While Section 5 of the OPEN Government Data Act will provide a sweeping mandate that all federal information should be expressed as open data, Section 7 provides a more practical requirement:
[E]ach agency … shall develop and maintain an enterprise data inventory … that accounts for any data asset created, collected, under the control of, or maintained by the agency.
President Obama’s 2013 Open Data Policy already requires agencies to do this. Section 7 gives that requirement the force of law and extends it beyond the Obama administration.
Agencies’ data assets are as diverse and extensive as the work of the federal government itself. Sundwall, commenting on the exercise of opening up government data sets, observed that “it can seem kind of endless.” As he pointed out, agencies may not even fully understand what data they have.
But the process of publicly accounting for data assets creates a dialogue between different agencies, and between the government and the public. The Sunlight Foundation’s Duffy said, “[I]t is very useful to see the world of information in one agency over another.” Data inventories allow the public to zero in on areas of value, encourage agencies to clean up messy databases, and ultimately reduce inter-agency duplication.
By making the 2013 inventory process permanent, said the Center for Data Innovation’s New, the OPEN Government Data Act helps ensure that public data “will be available forever,” and provides a way to authenticate data.
5. The legislation fits with the DATA Act and Financial Transparency Act – and may reinforce them.
The Data Coalition generally avoids generalities (repetition intended!) in its policy campaign. We advocate domain-specific policy changes that deliver immediate value to government, society, and our members.
We also avoid buzzwords!
The Coalition successfully advocated for the passage of the Digital Accountability and Transparency Act (DATA Act) of 2014, the nation’s first open data law. The DATA Act requires the federal government to express spending information as open data.
The Coalition is currently supporting the Financial Transparency Act (H.R. 2477), which would transform financial regulatory information into open data.
The OPEN Government Data Act is different. Rather than focusing on a particular domain, like spending or financial regulation, this bill applies to all the information the federal government collects and generates.
So, for the Data Coalition, what’s the value of supporting such a broad mandate? Because it can reinforce the specific ones.
If the OPEN Government Data Act becomes law, we’ll be able to point to it to resolve questions about interpreting the DATA Act or the Financial Transparency Act.
For example, earlier this year, a GAO report revealed that the White House is trying to reinterpret the DATA Act to be much narrower than Congress intended. If the OPEN Government Data Act were in effect, it could help prevent such narrow reinterpretations from ever being offered in the first place.
The OPEN Government Data Act also will provide a pathway to advocate open data transformations in areas where there’s not yet a specific law – like the IRS’ nonprofit tax returns or the Justice Department’s foreign agent registration forms.
6. This is an anti-DUNS law – because it bans proprietary standards.
The federal government is starting to move away from its dependency on the proprietary Data Universal Numbering System (DUNS) number to identify grantees and contractors. Because the DUNS Number is proprietary – itself owned by a contractor, Dun & Bradstreet – nobody can use spending information without purchasing a license. That means, even after the DATA Act, spending data won’t yet be fully open.
Section 5 of the OPEN Government Data Act scores a direct hit on proprietary data standards like the DUNS Number. Section 5 requires all agencies to publish their information in “open formats,” which are defined in a way that excludes the DUNS Number, and under “open licenses,” which means the information has to be available at no cost.
Section 5 says agencies only have to do this “to the greatest extent practicable,” and there’s surely a good argument that switching away from the DUNS Number all at once wouldn’t be practicable.
But as the government continues to evaluate the best way to track hundreds of thousands of grantees and contractors, the OPEN Government Data Act will offer one more reason why it should reject the DUNS Number and adopt an identification code that is freely available.
As the Center for Data Innovation’s New pointed out on Thursday, the DUNS Number makes it harder than it needs to be for agencies and private-sector watchdogs to track spending. The OPEN Government Data Act will help break the dependency.
7. Coalition members can use government information to deliver better transparency, better management, and automated compliance – but only if the information is expressed as open data. The OPEN Government Data Act will be a powerful catalyst for business opportunities.
The Data Coalition sees consistently-structured, consistently-available government data as the key to a multitude of federal management, financial oversight, and regulatory challenges.
Our members’ technologies can republish government information for better transparency, analyze it to enable better management, and automate reports to reduce compliance burdens. But all these technologies only work on open data. They don’t work on documents.
For our members to pursue their business models, and deliver benefits like transparency, better management, and automation, the government has to decide to adopt standardized formats and consistently publish its information in the first place.
The OPEN Government Data Act, by offering a legal mandate for data standardization and publication, will hasten the transformation from disconnected documents to open data.
As Sunlight’s Duffy put it in her closing remarks, “The idea [is] that this is the public’s data… it would be so nice to no longer be debating the virtues of open data, but to instead be talking about ways to do that.”
In coming months, the Data Coalition will be working to encourage legislators to co-sponsor and the committees of jurisdiction to consider, and pass, this bill.
Last Thursday the Data Coalition was honored to meet with the Arkansas Open Data and Transparency Task Force. The Task Force is a one-of-a-kind body, established by law, that includes state legislators, representatives of the Governor and Attorney General, and the leadership of most of the state’s largest agencies. Its job: by the end of this year, recommend an open data law to the Arkansas legislature.
The Task Force invited the Data Coalition to present a “menu” of open data options. With such an open-ended mandate, where should the Task Force focus its reform recommendation? What are the benefits of standardizing and publishing public-sector spending information? What if the Task Force focuses on regulatory reporting instead? How does health care fit in? We recruited government leaders and Coalition members to explain the benefits other states, the federal government, and foreign countries have derived from open data in each substantive area.
We told the Task Force that open data isn’t just good for public accountability. It also powers analytics for internal management and – done right – reduces compliance costs by automating reporting tasks that used to be manual.
Here are the presentations we shared with the Task Force on Thursday – and what we learned ourselves.
Open Data in Spending: Deliver Transparency, Stop Fraud
The federal government, and many states and localities, are adopting consistent data standards for information about their finances, transactions, grants, and contracts – and publishing that information as open data. When spending information is expressed as open data instead of old-fashioned documents, citizens can use it to hold politicians accountable; managers and inspectors general can visualize and analyze spending more easily; and required reports by grantees and contractors can be automated.
The federal DATA Act, whose passage our Coalition celebrated in 2014, requires the executive branch to transform all federal spending information into open data. The U.S. Treasury Department and the White House Office of Management and Budget are hard at work creating a government-wide data structure to bring the federal government’s disparate types of spending information together as one searchable data set. Federal agencies must report their spending using these standards by May 2017. Meanwhile, the law requires OMB to run a pilot program that determines whether the same standards can help federal grantees and contractors automate their reports and comply more cheaply.
But many state and local governments are far ahead. Seth Unger, senior policy advisor to Ohio Treasurer Josh Mandel, showed the Arkansas task force that Ohio’s Online Checkbook allows citizens to navigate through all spending by the Ohio government and its agencies – from statewide categories all the way down to each individual transaction. By publishing spending information as standardized, open data, the Online Checkbook improved the state’s grade in the Public Interest Research Group's ranking of state spending transparency efforts from D minus to the first-ever A plus.
Ohio’s Online Checkbook allows Ohioans to navigate from broad trends to individual transactions – not just for state agencies, but now for many municipal governments too.
Mr. Unger told the task force that Ohio’s Online Checkbook has allowed users – both inside and outside the state government – to quickly notice “outliers” and strange patterns and take corrective action.
(Ohio had an easier challenge than some other governments because it had already standardized its spending information. Ohio adopted a single, consolidated statewide financial system in 2007 – a far cry from the thousands of incompatible financial systems in use by federal agencies. But even with a single financial system in place, Treasurer Mandel and his team had to improve data standardization to make sure the data set on Ohio’s Online Checkbook was meaningful and accurate.)
Treasurer Mandel has taken Ohio spending transparency to the next level by inviting Ohio's local governments to voluntarily publish their spending as open data on Ohio’s Online Checkbook. Nearly 100 cities and towns, 75 townships, 15 counties, 70 school districts, and 15 other entities have done so.
Even in advance of the DATA Act’s mandate, many federal agencies are standardizing internal data sets to empower analytics to suss out fraud. Dave Williams, former inspector general of the U.S. Postal Service, told the task force that he deployed multiple analytic tools against the Postal Service’s spending data. RADAR assigns a fraud risk score to each of the Postal Service’s hundreds of thousands of contracts, allowing investigators to focus on those most likely to involve fraud. TripWire sends an alert when specific situations – such as a proportionally large contract modification or a sudden increase in an employee’s health claims – occur.
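The logic behind a TripWire-style alert can be sketched in a few lines, assuming spending data is already standardized. The contracts, amounts, and threshold below are invented for illustration; they do not reflect the Postal Service’s actual tools or rules.

```python
# Illustrative sketch of TripWire-style alerting: flag contract
# modifications that are large relative to the original award.
# The data and the threshold are invented.
contracts = [
    {"id": "C-001", "original": 1_000_000, "modification": 50_000},
    {"id": "C-002", "original": 200_000,   "modification": 180_000},
    {"id": "C-003", "original": 5_000_000, "modification": 250_000},
]

# Alert when a modification exceeds 50% of the original award amount.
THRESHOLD = 0.5

alerts = [c["id"] for c in contracts
          if c["modification"] / c["original"] > THRESHOLD]
print(alerts)  # ['C-002']
```

The rule itself is trivial; the prerequisite is that every contract record uses the same fields in the same format, which is exactly what data standardization provides.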
“When information is standardized [as open data],” said Mr. Williams, analytics “will be immediate.” But without data standards, every analytics project requires expensive, purpose-built translations.
Open Data in Regulatory Reporting: Cheaper for Business, Better for Agencies
When regulatory agencies choose to collect reports using standardized, open formats, instead of unstructured documents, the information becomes easier for the agencies to analyze – and can be reported in an automated fashion.
Pramodh Vittal of DataTracks told the task force that the UK tax authority, HMRC, simplified reporting for millions of companies by adopting the inline XBRL (“iXBRL”) open data format. This format is both human-readable and machine-readable, which means it can fulfill document-based reporting requirements while also empowering analytics software. Mr. Vittal's presentation showed how software helps companies easily report their financial information in iXBRL.
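The dual nature of inline XBRL can be illustrated with a simplified fragment: the same sentence a human reads carries machine-readable tags. The element content below is hypothetical (the concept name, context, and figures are invented, not HMRC’s actual taxonomy), though the `ix` namespace follows the Inline XBRL convention.

```python
import xml.etree.ElementTree as ET

# A simplified, hypothetical inline XBRL fragment. The tagged number is
# part of the human-readable sentence, yet software can extract it
# directly. Concept names and values are invented for illustration.
doc = """<html xmlns:ix="http://www.xbrl.org/2013/inlineXBRL">
  <body>
    <p>Turnover for the year was
      <ix:nonFraction name="uk-gaap:Turnover" contextRef="FY2015"
                      unitRef="GBP" decimals="0">1500000</ix:nonFraction>
      pounds.</p>
  </body>
</html>"""

root = ET.fromstring(doc)

# Software reads the tagged facts directly; no text scraping needed.
IX = "{http://www.xbrl.org/2013/inlineXBRL}"
facts = {el.get("name"): float(el.text) for el in root.iter(IX + "nonFraction")}
print(facts)  # {'uk-gaap:Turnover': 1500000.0}
```

A browser renders the fragment as an ordinary sentence, which is why one filing can satisfy both document-based requirements and analytics software.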
(HMRC is working side-by-side with Companies House, which regulates all companies in much the same way as most U.S. states’ secretaries of state, to collect tax filings in the same format as Companies House’s corporate filings. This 2012 video from Companies House explains how it works.)
A similar story comes from the Washington state Utilities and Transportation Commission (UTC), which is running a pilot project to adopt an open data format for the financial information that utilities must submit. Although the Washington UTC could not join our presentation on Thursday, an eight-minute video produced by Coalition member RR Donnelley explains how the open data format improves accuracy and reduces costs.
Open data is coming to federal regulatory reporting on a grand scale. Last year, open data supporters in Congress introduced the Financial Transparency Act (H.R. 2477), which would require all eight major federal financial regulatory agencies to adopt consistent data standards for all the information they collect from public companies, banks, and financial firms. Open data in financial regulatory reporting will bring transparency to investors, enable analytics to anticipate Enron-style frauds, and allow regulated companies to automate compliance.
Last week the Data Coalition’s Financial Data Summit brought together over 300 supporters of this transformation. Financial Data Summit speeches, presentations, and media coverage are all available on the Coalition’s event site.
Health Care, Parks and Recreation, and Anywhere Else: Standardizing and Publishing Makes Life Easier
Mary Kay McDaniel of Cognosante told the task force that data standards for health care information can drive new functionality, like price comparisons and quality ratings. Her presentation introduced the main standards in use today, explored the reasons why data standardization is challenging in health care, and offered examples of standards-driven apps. She encouraged the task force to send a representative to national health care data gatherings like the Health Datapalooza.
Although campsite-reservation startup Hipcamp could not send a representative, we explained to the task force that if the Arkansas Department of Parks and Tourism were to publish campsite availability as standardized data, Hipcamp and other startups could use it to enable consumers to make electronic reservations. The national park system and California’s state parks currently offer standardized campsite availability data. But Hipcamp is still asking all other states to take that same step.
What do health care information and campsite availability have in common? Both become more useful to consumers, professionals, and public servants if standardized and published. In fact, as we repeated to the task force, standardization and publication solve problems in pretty much every area of government data.
Open Data Across Departments, Divisions, and Disciplines
The final phase of our presentation to the task force introduced four open data platforms that aren’t limited to a specific type of information. Tableau, Socrata, Esri, and OpenGov all allow users to organize and visualize information and make connections across multiple data sets – connections that would be impossible if those data sets weren’t standardized and published.
Open data reforms must be purpose-built for complex areas of government information, like spending, regulatory reporting, health care, and recreation. For instance, the federal DATA Act had to call out specific reports and systems and provide a strong mandate for data standards within spending.
But Tableau, Socrata, Esri, and OpenGov all showed the power of data sets sourced from different areas of government information, combined using common standards, and then deployed for the public good.
Anthony Young of Tableau used the Florida Department of Transportation’s auto crash dashboard to show how standardized data can deliver unprecedented insights for government leaders. Tableau’s platform allows Florida policymakers to categorize, visualize, and understand auto accidents statewide and by county. The data standards aren’t perfect: for example, when the department changed its reporting methodology for distracted driving in January 2011, trendlines jumped. But the standards are sufficient to reveal insights that could never come from old-fashioned document-based reports.
Chris Rodriguez of Socrata used the state of Iowa as a use case for connecting data sets from different disciplines. Iowa's original open data site required users to know what agency published the data they were looking for. But the new data.Iowa.gov portal allows users to run Google-style keyword searches or browse by topic area. The portal can lead users to interactive open data of all sorts, from unemployment insurance statistics to granular spending information. And it connects automatically to the federal open data portal, Data.gov – which means Iowa’s data sets can be combined with other states’.
Mr. Rodriguez shared a surprising observation from Iowa: most users of the state’s data portal are state employees working inside government – not citizens following their government. These state employees are finding data.Iowa.gov more versatile than the internal systems from which the data sets are sourced. Socrata’s observation matches one of the most popular goals of the open data movement: governments should "eat their own dog food," basing management decisions on the same information they publish as open data.
Matt Bullock of Esri introduced the state of Michigan's open data portal, which Esri maintains. Esri’s specialty is geospatial information. Since the “vast majority of [government] data has a geospatial component,” Michigan’s portal offers many examples of combining different data sets on one map. Michigan’s portal also supports embedding: every data set includes electronic codes that allow it to be featured on other websites, linking back to the original data set so that it updates automatically whenever the source changes.
Mike Dougherty of OpenGov explored the open data network effect: when multiple governments publish their information using the same standards, comparisons among them become possible. OpenGov’s 900 clients, including several Arkansas towns, have all embraced the same data structure for accounting, transactional, and performance data – which means their budgets and finances can all be compared against their peers’.
OpenGov’s client list also includes Treasurer Mandel’s success story in Ohio.
What’s Next for Arkansas?
The Arkansas Open Data and Transparency Task Force is going to deliver a report to Governor Asa Hutchinson and both chambers of the state legislature by December 31, 2016. The report will recommend legislative language for an open data law.
Open data policy reforms are a challenge because the benefits of open data are as broad as government information itself.
The Open Data and Transparency Task Force’s job is challenging – not because it’s difficult to derive benefits from open data but because it’s so easy! There are so many possibilities for better accountability, better management, and automated compliance throughout the state that nine more months won’t be enough time to explore them all. There is too much on the menu.
But we hope Thursday’s presentations provided a good start.
Thanks for your hospitality, Little Rock! See you next time!
You’ve probably never heard of the U.S. Statutes at Large, which yesterday became the centerpiece of Congress’ latest move toward open data in law and regulation.
This "permanent collection of laws and resolutions enacted during each session of Congress" isn’t as well known as the U.S. Code. While the Code organizes laws by subject matter (a process called “codification”), the Statutes at Large lists them sequentially, the way they were originally passed by Congress.
Two years ago, Congress started publishing the U.S. Code as open data, using a standardized XML structure called the U.S. Legislative Model (USLM). That means software can understand the structure of the Code’s codified laws, connect citations electronically, and (eventually) redline proposed changes automatically.
How does it work? The USLM uses standardized electronic data elements to specify titles, sections, and paragraphs; identify citations; pinpoint dates of enactment and effectiveness; and express other information that previously had to be manually understood by humans reading the text.
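Here’s a rough sense of what that enables. The XML below is a simplified, illustrative fragment (not an actual USLM excerpt), but it shows how machine-readable section numbers, headings, and citation targets let software do what once required a human reader:

```python
# Sketch: parsing a USLM-style fragment of codified law. The element
# names below are simplified from the kinds of structures USLM defines;
# this is not the real schema.
import xml.etree.ElementTree as ET

sample = """
<section identifier="/us/usc/t5/s552">
  <num>552.</num>
  <heading>Public information; agency rules, opinions, orders, and records</heading>
  <content>
    Each agency shall make available to the public information as described
    in <ref href="/us/usc/t5/s551">section 551 of this title</ref>.
  </content>
</section>
"""

root = ET.fromstring(sample)
print("Section:", root.attrib["identifier"])
print("Heading:", root.find("heading").text)
for ref in root.iter("ref"):
    # Citations carry machine-readable targets, so hyperlinks between
    # provisions can be generated automatically instead of by hand.
    print("Cites:", ref.attrib["href"], "->", ref.text)
```

Because every citation is tagged with an explicit target, software can connect provisions electronically; with a bit more tooling, it can also diff a proposed amendment against the section it would change.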
The open data transformation hasn’t made its way to the Statutes at Large yet. But yesterday, Reps. Dave Brat (R-VA) and Seth Moulton (D-MA) proposed a bill to change that.
The Statutes at Large Modernization Act (H.R. 4006) directs the National Archives and Records Administration to publish an official open-data version of the Statutes at Large. The bill doesn’t specify what format should be used, but does direct the Archives to consult with all the other entities in the federal government that are involved in the legislative process, including the Congressional office that is already handling the U.S. Code. As long as the Archives does its homework, it should select the USLM – because that will make the Statutes at Large and the U.S. Code interoperable with one another.
(The bill names seven different offices and agencies, in addition to the Archives, that draft, compile, and publish laws. That helps explain why the government has modernized our laws so slowly.)
You’ll find the text of the Statutes at Large Modernization Act here, ironically in PDF. An XML version of the bill will be ready in a few days. (Our Coalition hopes someday Congress will natively draft its legislation in XML to begin with, instead of creating plain-text documents that are later turned into data.)
The Data Transparency Coalition endorsed the Statutes at Large Modernization Act because it takes a giant step toward our ultimate goal: all U.S. legislative and regulatory materials should be open data, instead of documents.
We hope Congress acts quickly on Reps. Brat and Moulton’s proposal. But Congress shouldn’t stop there: all the materials upstream of the Code and the Statutes at Large need to be transformed into open data too, using the same common format, the USLM. The clear next step will be for Congressional bills, resolutions, and amendments to be drafted and published as open data.
One year ago yesterday, President Obama signed the Digital Accountability and Transparency Act of 2014 (DATA Act) into law. The nation’s first-ever open data law envisions the transformation of the federal government’s spending information from disconnected documents into searchable data. The Treasury Department and the White House Office of Management and Budget (OMB) met the law’s first, one-year deadline on Friday, May 8 (a day early!), by announcing the data standards that will make this documents-to-data transformation possible.
The DATA Act’s long journey began in 2010, when Rep. Darrell Issa (R-CA) decided to pursue legislation to make federal spending data fully searchable. Issa and Sen. Mark Warner (D-VA) first introduced the legislation in June 2011. The DATA Act’s arduous three-year journey to passage was memorably told last year by Vox.com’s Andrew Prokop. The DATA Act’s enactment a year ago was the Data Transparency Coalition’s first major legislative victory in our quest to transform all government information into open data – but it won’t be our last!
Back to the present. In this post, you’ll find a full rundown of Friday’s Treasury/OMB announcement; an unvarnished assessment of what’s good, what’s bad, and what’s vague about it; and three next steps for federal leaders, tech companies, and transparency supporters.
The next public opportunity to engage with the DATA Act leaders at Treasury and OMB is one month away. On June 10th, the Coalition will host our second annual DATA Act Summit in Washington, bringing together the implementers, supporters, and beneficiaries of searchable spending data. With the Association of Government Accountants and ACT-IAC co-hosting, the DATA Act Summit will be the largest-ever event of its kind.
Standards, Guidance, Playbook, and Pilot – What it All Means
On Friday, Treasury and OMB rolled out their announcement in a blog post jointly authored by Fiscal Assistant Secretary of the Treasury Dave Lebryk and OMB Controller Dave Mader. Under the DATA Act, the two Daves’ offices are jointly responsible for the whole DATA Act transformation. (No pressure.) The Daves’ blog post is your guide to the four big pieces of news: the data standards themselves, guidance from OMB on what agencies will have to do with them, a playbook published by Treasury explaining how agencies can embrace the DATA Act, and the Section 5 recipient reporting pilot program.
It’s all about standards. At the core of the DATA Act is a mandate for the federal government to adopt, and enforce, consistent data standards: data elements for common spending concepts and a common structure (expressed in the law as a “format”) that ties them all together. Data standards are so important to the DATA Act that Sen. Mark Warner defied the White House in January 2014 to make sure this mandate stayed in his bill.
Thanks to Sen. Warner’s boldness last year, the DATA Act requires Treasury and OMB to develop elements and a format, announce them, and then impose them, over two years, on all the spending information that federal agencies report – information that today is siloed in separate reporting systems and segregated into financial, budget, payment, grant, and contract databases.
On Friday, Treasury and OMB unveiled these standards. First, Treasury and OMB announced 57 data elements. In the future, agencies will have to express these elements the same way wherever they appear. Second, they announced a schema, or structure, that defines how the elements relate to each other. Agencies will be required to submit their information using this structure. There’s a new page on USASpending.gov, the federal government’s official spending data website, where the DATA Act standards are now published. (Bookmark it!)
The standards may have been unveiled, but for the most part they are still unfinalized. To their credit, Treasury and OMB have been seeking input on the elements and schema from agencies, advocates, and the public. Earlier this year, Treasury and OMB set up a portal on the GitHub collaboration platform that allows anyone to share an opinion on each element and on the schema. The Federal Spending Transparency DATA Act and FFATA Collaboration Space (rolls right off the tongue) gives everyone a voice on the standards before Treasury and OMB make choices on how each element is to be defined and how the relationships between them are to be expressed in the schema.
Nobody could claim the DATA Act is a case of implementation without representation. Through the GitHub platform, everyone interested in data standards for federal spending has been able to watch the standards develop – and even help direct that development.
Of the 57 data elements, 15 are final; 12 will be finalized soon; and 30 are still open for public comment. (Unsurprisingly, the 30 still open include the most controversial: a unique identifier code that identifies grantees and contractors receiving federal funds.)
The schema also needs a lot more work. Today, it covers financial information, but doesn’t yet encompass grant and contract reporting. But there’s a clear commitment to expand it so that it includes the whole structure of federal spending.
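To see why shared elements matter, consider a toy sketch. The field names and records below are hypothetical stand-ins, not the official 57 elements:

```python
# Illustrative sketch (not the official DATA Act schema): once every agency
# reports spending using the same element names and types, records from
# separate systems can be pooled and queried together. All values invented.
from decimal import Decimal

hhs_award = {
    "awarding_agency_name": "Department of Health and Human Services",
    "award_id": "HHS-2015-0042",
    "obligation_amount": Decimal("1250000.00"),
}
dot_award = {
    "awarding_agency_name": "Department of Transportation",
    "award_id": "DOT-2015-0007",
    "obligation_amount": Decimal("980000.00"),
}

# Because both records use identical elements, cross-agency totals are trivial.
total = sum(a["obligation_amount"] for a in [hhs_award, dot_award])
print(f"Total obligations: ${total:,}")
```

Without shared elements, the same question – “how much has the government obligated?” – requires a custom crosswalk between every pair of reporting systems.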
OMB guidance explains what’s coming. The DATA Act gives agencies two years, starting with the announcement of the data standards, to start using those standards to actually report their spending. What does that mean?
To start answering that question, OMB issued guidance (PDF) on Friday in the form of an official memo to all federal agencies. The guidance explains what agencies are going to have to do between now and May 8, 2017.
Agencies currently report their financial information to Treasury, their budget execution to OMB, and their grant and contract award details to various systems that feed the existing USASpending.gov website. OMB’s guidance makes clear that these activities will continue, but need to be updated so that, within two years, the information conforms to the DATA Act data standards – both the elements and the schema.
For the time being, existing financial, budget, and award reporting requirements will remain in place. But the guidance gives agencies notice that there may be “future changes in agency reporting to OMB and Treasury” (page 6).
Treasury Playbook recommends proactive steps. Alongside OMB’s guidance, Treasury has published a DATA Act “Playbook” recommending eight key steps for agency management. The steps are not mandatory, but Treasury suggests they can help agencies fulfill the DATA Act’s requirements while avoiding costly system changes. The Playbook itself is not public, but Treasury has published a one-page summary of its eight steps.
Pilot program kicks off at HHS. The DATA Act’s mandate isn’t limited to information that agencies are required to report. The law envisions that recipients – grantees, contractors, and entities receiving other forms of federal assistance, like loan guarantees – may someday begin reporting their receipt and use of federal funds using the new data standards, too.
But the law doesn’t directly require grantees and contractors to start using the standards, the way it imposes a two-year deadline on agencies. Instead, it requires OMB to commission a pilot program to test whether the DATA Act data standards will work properly for grantee and contractor reporting. If standardized reporting allows grantees and contractors to automate and consolidate their reporting processes – reducing their compliance costs – then OMB, under the law, is empowered to impose the DATA Act data standards across all the federal government’s recipients.
The Treasury/OMB blog post announces that the pilot program has started, with the Department of Health and Human Services (HHS) deputized to lead it. And HHS is busy already, having published another set of data elements – far more extensive than the 57 official ones – on USASpending.gov. This special set of data elements is meant to cover all the information that grantees must typically submit to the agency that awarded them their money. HHS will be using this set of data elements to test whether grantees’ reports can be made searchable, consolidated, and automated.
But Treasury and OMB have engaged in a bit of trickery here: the blog post announces that the pilot program will “test and explore ways to simplify the reporting process for recipients of federal grants.” There is no mention of contractors, even though the DATA Act requires the pilot program to test the DATA Act data standards for all types of awards.
The Good, the Bad, the Vague
Treasury and OMB’s announcement reflects a whole year of work, and it shows. To define the data elements, construct the schema, begin testing new reporting methods, and recruit HHS to lead a government-wide pilot program, Treasury and OMB have invested a great deal of effort. Treasury and OMB must provide strong leadership if the DATA Act’s promises of fully-searchable federal spending data and automatic reporting are to be realized. And so far, they are providing it.
However, there’s one important change Treasury and OMB haven’t chosen to make, and their failure to make it will impede the transparency and usefulness of federal spending information, even once fully standardized, for years to come.
The most controversial of the 57 DATA Act data elements are the ones that identify entities receiving federal awards – grantees, contractors, and other types of recipients. These elements are among the ones that aren’t finalized yet. But Treasury and OMB’s current proposal is to continue using the proprietary DUNS Number – an identification code that is owned by Dun & Bradstreet, itself a private contractor – as the official identification number for recipients and for their parent entities. Federal leaders, inspectors general, program managers, non-governmental transparency organizations, media, and watchdogs cannot use such information without purchasing a license from Dun & Bradstreet.
Unless Treasury and OMB make a drastic change in response to comments on these data elements and decide to “dump DUNS” and adopt some other identifier for recipients and their parents, Dun & Bradstreet will keep its government-supported monopoly on the identification of recipients. That’s bad news for anyone hoping to use award-related information for transparency or management. Our Coalition will keep up the pressure to replace the DUNS Number with a nonproprietary identifier.
Finally, Friday’s announcement leaves at least two important questions unanswered, and our Coalition is going to keep asking them – because agencies, Congress, transparency advocates, and the open data industry all need to know the answers.
What You Can Do
After absorbing the announcement, the standards, the guidance, the playbook, the pilot, and this blog post, anyone could be forgiven for feeling a bit overwhelmed.
The DATA Act, even though Treasury and OMB are striving to implement it without forcing agencies to make expensive system changes, represents a huge undertaking. The law is our best chance of enabling the world’s largest, most complicated organization – the U.S. government – to report its spending in a way that makes the whole operation visible and understandable. The entire corpus of federal spending information has to be transformed into open data. The transformation starts with data standards.
But assuming you agree with the DATA Act’s mission, you don’t need to commit to years of labor. Here are three things you can do today.