NOTE: Due to the government shutdown, OMB has extended the previous Nov 1, 2013 deadlines in this document to Nov 30, 2013.
The purpose of this guidance is to provide additional clarification and detailed requirements to assist agencies in carrying out the objectives of Executive Order 13642 of May 9, 2013, Making Open and Machine Readable the New Default for Government Information and OMB Memorandum M-13-13 Open Data Policy-Managing Information as an Asset. Specifically, this document focuses on near-term efforts agencies must take to meet the following five initial requirements of M-13-13, which are due November 30, 2013 (six months from publication of M-13-13):
Agencies will establish an open data infrastructure by implementing this guidance and Memorandum M-13-13 and taking advantage of the resources provided on Project Open Data. Once established, agencies will continue to evolve the infrastructure by identifying and adding new data assets1, enriching the description of those data assets through improved metadata, and increasing the amount of data shared with other agencies and the public.
At a minimum, a successful open data infrastructure must:
The “access level” categories described in this document are intended to be used for organizational purposes within agencies and to reflect decisions already made in agencies about whether data assets can be made public; simply marking data assets “public” cannot substitute for the analysis necessary to ensure the data can be made public. Agencies are reminded that the underlying data from the Inventory may be released to the public only after a full analysis of privacy, confidentiality, security, and other valid restrictions pertinent to law and policy.
This guidance seeks to balance the need to establish clear and meaningful expectations for agencies with sufficient flexibility in the approach each agency may take to address its own unique needs. This guidance also includes references to other OMB memoranda that relate to the management of information. Agencies should refer to the definitions included in the attachment in OMB Memorandum M-13-13 Open Data Policy-Managing Information as an Asset.
This guidance introduces an Enterprise Data Inventory framework to provide agencies with improved clarity on specific actions to be taken and minimum requirements to be met. It also provides OMB with a rubric by which to evaluate compliance and progress toward the objectives laid out in the Open Data Policy. Following the November 30, 2013 deadline, agencies shall report progress on a quarterly basis, and performance will be tracked through the Open Data Cross-Agency Priority (CAP) Goal. Meeting the requirements of this guidance will ensure agencies are putting in place a basic infrastructure for inventorying, managing, and opening up data, and thereby unlocking the value of their information resources.
To develop a clear and comprehensive understanding of what data assets they possess, Federal Agencies are required to create an Enterprise Data Inventory (Inventory) that accounts for all data assets created or collected by the agency. This includes, but is not limited to, data assets used in the agency’s information systems. The Inventory must be enterprise-wide, accounting for data assets across programs2 and bureaus3, and must use the required common core metadata available on Project Open Data. After creating the Inventory, agencies should continually improve the usefulness of the Inventory by expanding, enriching, and opening the Inventory (concepts described in the framework below).
The objectives of this activity are to:
Because agencies have varying levels of visibility into their data assets, the size and maturity of Enterprise Data Inventories will differ across agencies. OMB will assess agency progress toward overall maturity of the Enterprise Data Inventory through the maturity areas of “Expand,” “Enrich,” and “Open.”
Expand: Expanding the Inventory refers to adding data assets to the Inventory. Agencies should develop their own strategy to expand the Inventory and break down the work according to agency-defined classes of data4. Agencies should communicate their plans for expanding the Inventory in the Inventory Schedule (described in the minimum requirements). As agencies develop an Inventory Schedule, they may find it helpful to group their data assets into classes of data. The following list provides examples of classes agencies may use as they schedule the expansion of the Inventory:
Example ways to evaluate “Expand” maturity: How has the Inventory expanded over time to include additional data assets? What “classes” of data (for example, financial, performance, scientific, regulatory, etc.) have been added or are planned to be added? Are all bureaus and programs represented in the Inventory? If not, what percentage is?
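The question of what percentage of bureaus is represented can be estimated from the Inventory itself. The following is a minimal sketch in Python, not a prescribed method. It assumes each Inventory entry is a dictionary carrying a `bureauCode` list, and that the agency supplies its own list of expected bureau codes; the field name, the code format, and the sample records are illustrative assumptions rather than requirements stated in this guidance.

```python
def bureau_coverage(inventory, all_bureaus):
    """Return (percent_covered, missing_bureaus) for an inventory.

    inventory   -- list of dicts, one per data asset
    all_bureaus -- iterable of every bureau code the agency expects to see
    The 'bureauCode' key is an assumed field name used for illustration.
    """
    expected = set(all_bureaus)
    if not expected:
        raise ValueError("all_bureaus must not be empty")
    seen = set()
    for entry in inventory:
        # An entry without the field simply contributes no bureaus.
        seen.update(entry.get("bureauCode", []))
    covered = seen & expected
    percent = 100.0 * len(covered) / len(expected)
    return percent, sorted(expected - covered)


# Hypothetical sample data; the codes are placeholders.
sample_inventory = [
    {"title": "Grant awards", "bureauCode": ["000:01"]},
    {"title": "Inspection results", "bureauCode": ["000:02"]},
    {"title": "Budget tables", "bureauCode": ["000:01"]},
]
expected_bureaus = ["000:01", "000:02", "000:03", "000:04"]

percent, missing = bureau_coverage(sample_inventory, expected_bureaus)
print(f"{percent:.0f}% of bureaus represented; missing: {missing}")
```

The same approach could be applied to programs by swapping in a program identifier field.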
Enrich: To improve the discoverability, management, and re-usability of data assets, agencies should enrich the Inventory over time by improving the quality of metadata describing each data asset. For example, agencies may:
Project Open Data provides metadata requirements, additional optional metadata fields, and examples of metadata areas (see Appendix for examples). To improve the management of IT systems through the Inventory, agencies are encouraged to include the Primary Related IT Investment Unique Investment Identifier (UII) as a metadata field. As they work to enrich data assets, agencies should carefully weigh the potential value of efforts to improve data description or increase the number of metadata fields against the potential associated burden. Agencies should work to avoid the risk of duplicative metadata and work toward adopting a uniform schema. To that end, agencies should draw on the expertise of existing communities of practice8, review standard taxonomies9, and coordinate across the government to harmonize definitions when adopting additional metadata fields.
Example ways to evaluate “Enrich” maturity: How has the agency improved the quality of metadata for each record? Are effective keywords and clear language used in data descriptions? Are additional metadata fields applying best practices from Project Open Data? Has the agency developed policies and procedures for populating these fields consistently? Has the agency linked the Inventory to federal IT management by including the Primary Related IT Investment Unique Investment Identifier (UII)?
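One rough way to track “Enrich” progress is to measure how often each metadata field is actually populated. The sketch below is illustrative only: the field names `description`, `keyword`, and `primaryITInvestmentUII` are assumed names chosen for the example (the authoritative field list is on Project Open Data), and a populated field says nothing about whether its content is clear or useful.

```python
def field_completeness(inventory, fields):
    """Return {field: fraction of entries with a non-empty value}.

    A value counts as empty when it is missing, None, or an empty
    string, list, or dict.
    """
    if not inventory:
        return {field: 0.0 for field in fields}
    empty_values = (None, "", [], {})
    result = {}
    for field in fields:
        filled = sum(
            1 for entry in inventory if entry.get(field) not in empty_values
        )
        result[field] = filled / len(inventory)
    return result


# Hypothetical sample records.
sample_inventory = [
    {"description": "Awards by state", "keyword": ["grants"],
     "primaryITInvestmentUII": "000-000000000"},
    {"description": "Inspection results", "keyword": ["safety"]},
    {"description": "Budget tables", "keyword": []},
    {"description": "Survey microdata", "keyword": ["survey"]},
]

scores = field_completeness(
    sample_inventory, ["description", "keyword", "primaryITInvestmentUII"]
)
for field, score in scores.items():
    print(f"{field}: {score:.0%} populated")
```

Comparing these fractions from one quarterly snapshot to the next gives a simple trend line, which can then be supplemented by manual review of description quality.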
Open: Agencies should implement tools and processes that will accelerate the opening of additional valuable data assets by making them public and machine-readable, while ensuring adequate policy, process, and technical safeguards are in place to prevent the release of sensitive data. Agencies are required to increase the number of public data assets included in the Public Data Listing (described in the next section) over time. Agencies should work toward increasing the ratio of data that are public and machine-readable to data that can be made public as measured in the Inventory.
Example ways to evaluate “Open” maturity: How many releasable data assets have been released in the Public Data Listing? How have more data assets been released in accordance with the “open data” principles over time?
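The ratio described above can be approximated directly from Inventory metadata. The following Python sketch relies on the convention in the footnotes of this document: an `accessLevel` of `public` marks an asset that is or could be made public, and the presence of an `accessURL` indicates that it has been released. It is a simplification, since it does not check whether the released file is machine-readable; that would need a further test that is not shown here. The sample records are hypothetical.

```python
def open_ratio(inventory):
    """Fraction of releasable assets that have been released.

    Releasable: accessLevel == "public".
    Released:   releasable and has a non-empty accessURL.
    Returns None when no asset is releasable, to avoid dividing by zero.
    Machine-readability is NOT verified by this sketch.
    """
    releasable = [e for e in inventory if e.get("accessLevel") == "public"]
    if not releasable:
        return None
    released = [e for e in releasable if e.get("accessURL")]
    return len(released) / len(releasable)


# Hypothetical sample records.
sample_inventory = [
    {"title": "A", "accessLevel": "public", "accessURL": "https://example.gov/a.csv"},
    {"title": "B", "accessLevel": "public", "accessURL": "https://example.gov/b.json"},
    {"title": "C", "accessLevel": "public"},
    {"title": "D", "accessLevel": "public", "accessURL": ""},
    {"title": "E", "accessLevel": "non-public"},
]

ratio = open_ratio(sample_inventory)
print(f"Released {ratio:.0%} of releasable data assets")
```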
To improve the discoverability and usability of data assets, all federal agencies must develop a Public Data Listing, which contains a list of all data assets that are or could be made available to the public. This Public Data Listing, posted at www.[agency].gov/data.json, would typically be a subset of the agency’s Inventory. This will allow the public to view agencies’ open data assets and subsequent progress as additional data assets are published.
Agencies, at their discretion, may choose to include entries for non-public data assets in their Public Data Listings, taking into account guidance in section D. For example, an agency may choose to list data assets with an ‘accessLevel’ of ‘restricted public’ to make the public aware of their existence and the process by which these data may be obtained.
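Because the Public Data Listing is typically a subset of the Inventory, it can be generated by filtering on `accessLevel`. The Python sketch below illustrates that idea under stated assumptions: the listing is treated as a plain JSON array of entries, the function name and sample records are hypothetical, and the exact file structure should be taken from Project Open Data. Filtering is only a mechanical step; it does not replace the analysis required before an asset is marked public.

```python
import json


def build_public_listing(inventory, include_restricted=False):
    """Select the Inventory entries that belong in the Public Data Listing.

    Entries marked "public" are always included. Entries marked
    "restricted public" are included only when the agency opts in.
    """
    allowed = {"public"}
    if include_restricted:
        allowed.add("restricted public")
    return [entry for entry in inventory if entry.get("accessLevel") in allowed]


# Hypothetical sample records.
sample_inventory = [
    {"title": "Grant awards", "accessLevel": "public",
     "accessURL": "https://example.gov/grants.csv"},
    {"title": "Survey microdata", "accessLevel": "restricted public",
     "accessLevelComment": "Available to approved researchers on application."},
    {"title": "Personnel records", "accessLevel": "non-public",
     "accessLevelComment": "Contains personally identifiable information."},
]

listing = build_public_listing(sample_inventory, include_restricted=True)
data_json = json.dumps(listing, indent=2)
print(data_json)
```

The resulting text could then be written to the file served at www.[agency].gov/data.json.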
Agencies’ Public Data Listings will be used to dynamically populate the newly renovated Data.gov, the main website to find data assets generated and held by the U.S. Government. Data.gov allows anyone from the public to find, download, and use government data. The upcoming re-launch of Data.gov (currently in beta at next.data.gov) will automatically aggregate the agency-managed Public Data Listings into one centralized location, using the common core metadata standards and tagging to improve users’ ability to find and use government data.
The objectives of this activity are to:
Publish a Public Data Listing (by November 30, 2013)
Tools and Resources on Project Open Data
Identifying and engaging with key data customers to help determine the value of federal data assets can help agencies prioritize those of highest value for quickest release. Data customers include public as well as government stakeholders17. All Federal Agencies will be required to engage public input and reflect on how to incorporate customer feedback into their data management practices. Agencies may develop criteria at their discretion for prioritizing the opening of data assets, accounting for a range of factors, such as the quantity and quality of user demand, internal management priorities, and agency mission relevance. As customer feedback mechanisms and internal prioritization criteria will likely evolve over time and vary across agencies, agencies should share successful innovations in incorporating customer feedback through interagency working groups and Project Open Data to disseminate best practices. Agencies should regularly review the evolving customer feedback and public engagement strategy.
The objectives of this activity are to:
Establish Customer Feedback Mechanism (by November 30, 2013)
Tools and Resources on Project Open Data
The Open Data Policy requires agencies to strengthen and develop policies and processes to ensure that only the appropriate data are made available publicly. Agencies should work with their Senior Agency Official for Privacy and other relevant officials to ensure a complete analysis of issues that could preclude public disclosure of information collected or created. If the agency determines the data should not be made publicly available because of law, regulation, or policy or because the data are subject to privacy, confidentiality, security, trade secret, contractual, or other valid restrictions to release, agencies must document the determination in consultation with their Office of General Counsel or equivalent. The agency should designate one of three “access levels” for each data asset listed in the Inventory: public, restricted public, or non-public. The descriptions of these categories can be found below and on Project Open Data.
The objectives of this activity are to:
As part of an agency’s analysis to assign a general access level to each data asset19, agencies should consult section III.4 of the OMB Memorandum M-13-13, and Executive Order 13556. Specifically, agencies are required to incorporate the National Institute of Standards and Technology (NIST) Federal Information Processing Standard (FIPS) Publication 199 “Standards for Security Categorization of Federal Information and Information Systems,” which includes guidance and definitions for confidentiality, integrity, and availability. Agencies should also consult with the Controlled Unclassified Information (CUI) program to ensure compliance with CUI requirements, the National Strategy for Information Sharing and Safeguarding and the best practices found in Project Open Data. In addition to complying with the Privacy Act of 1974, the Paperwork Reduction Act, the E-Government Act of 2002, the Federal Information Security Management Act (FISMA), the Confidential Information Protection and Statistical Efficiency Act (CIPSEA), and other applicable laws, agencies should implement information policies based upon Fair Information Practice Principles, OMB guidance, and NIST guidance on Security and Privacy Controls for Federal Information Systems and Organizations.
Public: Data asset is or could be made publicly available to all without restrictions. The accessLevelComment field may be used to provide information on technical or resource barriers to increasing access to that data asset.
Restricted Public: Data asset is available under certain use restrictions. One example, among many, is a data asset that can only be made available to select researchers under certain conditions, because the data asset contains sufficient granularity or linkages that make it possible to re-identify individuals, even though the data asset is stripped of Personally Identifiable Information (PII). Another example would be a data asset that contains PII and is made available to select researchers under strong legal protections. This category includes some but not all data assets designated as Controlled Unclassified Information (CUI), consistent with Executive Order 13556. The accessLevelComment field must be filled in with details on how one can obtain access.
Non-Public: Data asset is not available to members of the public. This category includes data assets that are only available for internal use by the Federal Government, such as by a single program, single agency, or across multiple agencies. This category might include some but not all data assets designated as Controlled Unclassified Information (CUI), consistent with Executive Order 13556. Some non-public data assets may still potentially be available to other intra-agency operating units and/or other government agencies, as discussed in OMB Memorandum M-11-02: Sharing Data While Protecting Privacy. The accessLevelComment field for non-public data assets must explain why these data cannot be made public.
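The rules attached to the three access levels lend themselves to an automated check before an Inventory snapshot is submitted. The Python sketch below encodes only what the descriptions above state: `accessLevel` must be one of the three categories, and `accessLevelComment` must be filled in for restricted public and non-public assets. The function name and the exact string spellings of the levels are assumptions made for illustration.

```python
# Assumed string spellings of the three access levels.
VALID_LEVELS = {"public", "restricted public", "non-public"}
# Levels for which an accessLevelComment is mandatory.
COMMENT_REQUIRED = {"restricted public", "non-public"}


def check_access_level(entry):
    """Return a list of problems found in one Inventory entry.

    An empty list means the entry passes these two checks. This does not
    judge whether the chosen level is the right one for the data asset.
    """
    problems = []
    level = entry.get("accessLevel")
    if level not in VALID_LEVELS:
        problems.append(
            f"accessLevel must be one of {sorted(VALID_LEVELS)}, got {level!r}"
        )
    elif level in COMMENT_REQUIRED:
        comment = str(entry.get("accessLevelComment") or "").strip()
        if not comment:
            problems.append(
                f"accessLevelComment is required when accessLevel is {level!r}"
            )
    return problems


# Hypothetical entries.
ok_entry = {"title": "Survey microdata", "accessLevel": "restricted public",
            "accessLevelComment": "Apply through the research data center."}
bad_entry = {"title": "Personnel records", "accessLevel": "non-public"}

print(check_access_level(ok_entry))
print(check_access_level(bad_entry))
```

A check like this catches missing comments, but the determination of the access level itself still rests on the legal and policy analysis described in this section.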
Describe Data Publication Process (by November 30, 2013)
Agencies should identify points of contact for the following roles and responsibilities related to managing information as an asset:
Report the point of contact for each of these roles and responsibilities via the E-Gov IDC20 by November 30, 2013
This section includes a high-level summary of agency actions and reporting requirements which are described in detail in the Policy Requirements section. Some requirements are one-time requirements, and others shall be updated quarterly as a part of the E-Gov IDC. This guidance uses three reporting channels:
|Agency Actions and Reporting Requirements|By Nov 30, 2013|After Nov 30, 2013|
|---|---|---|
|A. Create and maintain an Enterprise Data Inventory (Inventory)| | |
|Develop an Inventory Schedule|•| |
|Publish Inventory Schedule on the www.[agency].gov/digitalstrategy page24|•| |
|Create an Enterprise Data Inventory|•| |
|Submit Inventory Snapshot in a JSON format to the MAX Community25|•| |
|Maintain the Enterprise Data Inventory: Expand, Enrich, Open| |•|
|Update Inventory Snapshot quarterly in a JSON format in MAX Community26| |•|
|Update the Inventory Schedule on the www.[agency].gov/digitalstrategy27 page, revise plans and describe actual results as each quarter completes| |•|
|B. Create and maintain a Public Data Listing| | |
|Create and publish Public Data Listing in JSON format at www.[agency].gov/data.json|•| |
|Maintain the Public Data Listing| |•|
|C. Create a process to engage with customers to help facilitate and prioritize data release| | |
|Establish Customer Feedback Mechanism|•| |
|Describe Customer Feedback Processes on www.[agency].gov/digitalstrategy28|•| |
|Follow and update process as necessary| |•|
|D. Document if data cannot be released| | |
|Develop Data Publication Process|•| |
|Publish an overview of Data Publication Process on the www.[agency].gov/digitalstrategy29 page|•| |
|Update process as necessary| |•|
|E. Clarify roles and responsibilities for promoting efficient and effective data release| | |
|Report Point Of Contact of roles and responsibilities, including contact information for each listed responsibility in MAX Collect30|•| |
|Update the Point Of Contact and contact information for each listed responsibility in MAX Collect as part of the quarterly E-Gov IDC31| |•|
|Tagging: Reference Models and Controlled Vocabulary||These fields describe each data asset in terms which have been standardized government-wide. See Project Open Data for additional examples and best practices. Some examples include: FEAv2 Data Reference Model, FEAv2 Business Reference Model, OMB Budget Function Codes, Related Data.gov Community, Schema.org|
|Cross-Inventory Identifier Mapping||These fields describe related entries in other “Inventory” lists. Some examples include: Program (from OPPM’s Program Inventory), Related IT investment from FY2015 Exhibit 53 (UII), Related OIRA Information Collection Request, Related Performance.gov Agency Strategic Objective, Related Federal Data Center Consolidation Initiative data center ID|
|Information Quality||These fields describe any aspects of data quality evaluated by the agency, consistent with OMB’s Government-Wide Information Quality Guidelines (for example, the type of pre-dissemination review, use of existing standards, documents characterizing missing data in time, or spatial series).|
|Data Value||These fields describe internal and external use, reuse, and demand by customers and users.|
|Openness||These fields describe to what extent each data asset achieves the criteria for “open data” in M-13-13.|
1 Data Asset: A collection of data elements or datasets that make sense to group together. Each community of interest identifies the Data Assets specific to supporting the needs of their respective mission or business functions. Notably, a Data Asset is a deliberately abstract concept. A given Data Asset may represent an entire database consisting of multiple distinct entity classes, or may represent a single entity class.
2 Programs from the Federal Program Inventory: http://goals.performance.gov/federalprograminventory
3 Bureaus from OMB Circular A-11 Appendix C: http://www.whitehouse.gov/sites/default/files/omb/assets/a11_current_year/app_c.pdf
4 For example, by applying the categorizations of “bureau” and “business,” an agency might create classes of “bureau and business,” and choose to tackle “Bureau A, B, & C’s education grants-related data” first.
5 Information collection requests (ICR): http://www.reginfo.gov/public/jsp/PRA/praDashboard.jsp
6 For example:
7 IT Dashboard for Exhibit 53 and 300 reporting on IT investments: https://www.itdashboard.gov/
8 For example, the statistical and geospatial communities have mature metadata standards.
9 For example discipline specific
10 By following the instructions at: https://max.omb.gov/community/x/kIamK
11 By following the instructions at: https://max.omb.gov/community/x/kIamK
12 By following the instructions at: https://max.omb.gov/community/x/8YamK
13 By following the instructions at: https://max.omb.gov/community/x/kIamK
14 The value of “public” in the AccessLevel metadata field should be used to refer to a data asset that is or could be made publicly available to all without restrictions. This includes 1) data assets that have already been openly published online, and 2) data assets that have not yet been made publicly available but could be.
15 The presence of an accessURL value in a data asset’s metadata will indicate whether or not the data asset has been published or released. This avoids human error in manually updating the accessLevel field when there is an automatic, reliable means of determining the same thing.
16 Publicly available refers to data assets whose contents are downloadable from the Public Data Listing by the public via an accessURL.
17 Working with government stakeholders is encouraged through existing initiatives such as: * OMB Memorandum M-13-17 Next Steps in the Evidence and Innovation Agenda: http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-13-17.pdf * OMB Memorandum M-12-14, Use of Evidence and Evaluation in the 2014 Budget: http://www.whitehouse.gov/sites/default/files/omb/memoranda/2012/m-12-14.pdf * OMB Memorandum M-11-02 Sharing Data While Protecting Privacy: http://www.whitehouse.gov/sites/default/files/omb/memoranda/2011/m11-02.pdf
18 Agency Digital Government Strategy page by following the instructions at: https://max.omb.gov/community/x/kIamK
19 The inventory’s “access levels” should be implemented consistent with Executive Order 13556, which sets out the framework for designating the Controlled Unclassified Information (CUI) categories and subcategories that will serve as exclusive designations for identifying unclassified information throughout the Executive branch that requires safeguarding or dissemination controls, pursuant to and consistent with applicable law, regulations, and Government-wide policies.
20 By following the instructions at: https://max.omb.gov/community/x/uIemK
21 By following the instructions at: https://max.omb.gov/community/x/uIemK
22 By following the instructions at: https://max.omb.gov/community/x/8YamK
23 By following the instructions at: https://max.omb.gov/community/x/kIamK
24 By following the instructions at: https://max.omb.gov/community/x/kIamK
25 By following the instructions at: https://max.omb.gov/community/x/8YamK
26 By following the instructions at: https://max.omb.gov/community/x/8YamK
27 By following the instructions at: https://max.omb.gov/community/x/kIamK
28 By following the instructions at: https://max.omb.gov/community/x/kIamK
29 By following the instructions at: https://max.omb.gov/community/x/kIamK
30 By following the instructions at: https://max.omb.gov/community/x/uIemK
31 By following the instructions at: https://max.omb.gov/community/x/uIemK