Contingency planning guide for federal information systems



National Archives Catalog (NAC)
Security Categorization: Moderate
Information System Contingency Plan (ISCP)
Date: April 15, 2020
Version #: 7.0

Prepared by
National Archives and Records Administration
8601 Adelphi Road
College Park, MD 20740

Table of Contents
Plan Approval
1.0 Introduction
1.1 Background
1.2 Scope
1.3 Assumptions
2.0 Concept of Operations
2.1 System Description
2.2 Overview of Three Phases
2.3 Roles and Responsibilities
3.0 Activation and Notification
3.1 Activation Criteria and Procedure
3.2 Notification
3.3 Outage Assessment
4.0 Recovery
4.1 Sequence of Recovery Activities
4.2 Recovery Procedures
4.3 Recovery Escalation Notices/Awareness
5.0 Reconstitution
5.1 Validation Data Testing
5.2 Validation Functionality Testing
5.3 Recovery Declaration
5.4 Notifications (users)
5.5 Cleanup
5.6 Offsite Data Storage – Moderate Availability Rating
5.7 Data Backup
5.8 Event Documentation
5.9 Deactivation
APPENDIX A: PERSONNEL CONTACT LIST
APPENDIX B: VENDOR CONTACT LIST
APPENDIX C: DETAILED RECOVERY PROCEDURES
APPENDIX D: ALTERNATE PROCESSING PROCEDURES
APPENDIX E: SYSTEM VALIDATION TEST PLAN
APPENDIX F: ALTERNATE STORAGE, SITE AND TELECOMMUNICATIONS
APPENDIX G: DIAGRAMS (SYSTEM AND INPUT/OUTPUT)
APPENDIX H: HARDWARE AND SOFTWARE INVENTORY
APPENDIX I: INTERCONNECTIONS
APPENDIX J: TEST AND MAINTENANCE SCHEDULE
APPENDIX K: ASSOCIATED PLANS AND PROCEDURES
APPENDIX L: BUSINESS IMPACT ANALYSIS
APPENDIX M: DOCUMENT CHANGE PAGE

Plan Approval
In accordance with the National Archives and Records Administration’s (NARA) contingency planning policy, I hereby affirm that the contingency plan is complete and has been tested sufficiently. The designated authority is responsible for continued maintenance and testing of the ISCP.

As the designated authority for the National Archives Catalog (NAC) system, I hereby certify that the information system contingency plan (ISCP) is complete, and that the information contained in this ISCP provides an accurate representation of the application, its hardware, software, and telecommunication components. I further certify that this document identifies the criticality of the system as it relates to the mission of NARA, and that the recovery strategies identified will provide the ability to recover the system functionality in the most expedient and cost-beneficial method in keeping with its level of criticality.

I further attest that this ISCP for NAC will be tested at least annually. This plan was last tested on March 25, 2020; the test, training, and exercise (TT&E) material associated with this test can be found in Xacta.
This document will be modified as changes occur and will remain under version control, in accordance with NARA’s contingency planning policy.

_________________________________________________
Jason Clingerman                              Date
System Owner

1.0 Introduction
Information systems are vital to NARA mission/business processes; therefore, it is critical that services provided by the NAC system are able to operate effectively without excessive interruption. This Information System Contingency Plan (ISCP) establishes comprehensive procedures to recover NAC quickly and effectively following a service disruption.

1.1 Background
This NAC ISCP establishes procedures to recover NAC following a disruption. The following recovery plan objectives have been established:
- Maximize the effectiveness of contingency operations through an established plan that consists of the following phases:
  - Activation and Notification phase to activate the plan and determine the extent of damage;
  - Recovery phase to restore NAC operations; and
  - Reconstitution phase to ensure that NAC is validated through testing and that normal operations are resumed.
- Identify the activities, resources, and procedures to carry out NAC processing requirements during prolonged interruptions to normal operations.
- Assign responsibilities to designated NARA personnel and provide guidance for recovering NAC during prolonged periods of interruption to normal operations.
- Ensure coordination with other personnel responsible for NARA contingency planning strategies.
- Ensure coordination with external points of contact and vendors associated with NAC and execution of this plan.

1.2 Scope
This ISCP has been developed for NAC, which is classified as a moderate impact system, in accordance with Federal Information Processing Standards (FIPS) 199 – Standards for Security Categorization of Federal Information and Information Systems. Procedures in this ISCP are for moderate impact systems and designed to recover NAC within 48 hours (2 business days).
This plan does not address replacement or purchase of new equipment; short-term disruptions lasting less than 48 hours; or loss of data at the onsite facility or at the user-desktop levels.

1.3 Assumptions
The following assumptions were used when developing this ISCP:
- NAC has been established as a moderate-impact system, in accordance with FIPS 199.
- NAC includes services provided by the Amazon Web Services (AWS) cloud service provider.
- If the Availability of NAC is Moderate, then an alternate processing site and offsite storage are required and have been established for NAC as discussed below:
  - Current backups of the system software and data are intact and available at the offsite storage facility provided by Amazon Web Services (AWS).
  - Alternate facilities have been established at AWS and are available if needed for relocation of NAC.
  - The NAC is inoperable at AWS and cannot be recovered within 48 hours (2 business days).
- The Recovery Time Objective (RTO) for NAC is 48 hours.
- Key NAC personnel have been identified and trained in their emergency response and recovery roles; they are available to activate the NAC ISCP.
- Additional assumptions as appropriate.

The NAC ISCP does not apply to the following situations:
- Overall recovery and continuity of mission/business operations. The Business Continuity Plan (BCP) and Continuity of Operations Plan (COOP) address continuity of mission/business operations.
- Emergency evacuation of personnel. The Occupant Emergency Plan (OEP) addresses employee evacuation.
- Any additional constraints and associated plans should be added to this list.

2.0 Concept of Operations
The Concept of Operations section provides details about NAC, an overview of the three phases of the ISCP (Activation and Notification, Recovery, and Reconstitution), and a description of roles and responsibilities of NARA personnel during a contingency activation.

2.1 System Description
The NAC system was created by NARA to serve as the primary method for search, access, and distribution of publicly available, digital NARA content. At a high level, NAC will:
- Hold a copy of publicly available NARA digital content.
- Provide methods for downloading this content by the public.
- Maintain and make available renditions (for example, images at different resolutions) of the content designed to make the content more useful for the public.
- Provide methods to search this content and associated metadata so that users can easily find the content they wish to access. This will include:
  - Maintaining a search engine index over the content;
  - Providing a search user interface to the content;
  - Providing APIs and other methods for end-users to access the content programmatically.

NAC is installed and running in the AWS US East/West region. AWS US East/West is a FedRAMP-approved cloud service provider. AWS leverages the Infrastructure-as-a-Service (IaaS) cloud computing model, which enables on-demand Internet access to a shared pool of configurable computing resources such as servers, storage, network infrastructure, and various other web services. Customers can provision or release computing resources on demand.

2.2 Overview of Three Phases
This ISCP has been developed to recover and reconstitute the NAC using a three-phased approach.
This approach ensures that system recovery and reconstitution efforts are performed in a methodical sequence to maximize the effectiveness of the recovery and reconstitution efforts and minimize system outage time due to errors and omissions.

The three system recovery phases are:

Activation and Notification Phase – Activation of the ISCP occurs after a disruption or outage that may reasonably extend beyond the RTO established for a system. The outage event may result in severe damage to the facility that houses the system, severe damage or loss of equipment, or other damage that typically results in long-term loss.

Once the ISCP is activated, system owners and users are notified of a possible long-term outage, and a thorough outage assessment is performed for the system. Information from the outage assessment is presented to system owners and may be used to modify recovery procedures specific to the cause of the outage.

Recovery Phase – The Recovery phase details the activities and procedures for recovery of the affected system. Activities and procedures are written at a level that an appropriately skilled technician can recover the system without intimate system knowledge. This phase includes notification and awareness escalation procedures for communication of recovery status to system owners and users.

Reconstitution Phase – The Reconstitution phase defines the actions taken to test and validate system capability and functionality at the original or new permanent location. This phase consists of two major activities: validating successful reconstitution and deactivation of the plan.

During validation, the system is tested and validated as operational prior to returning operation to its normal state. Validation procedures may include functionality or regression testing, concurrent processing, and/or data validation.
The system is declared recovered and operational by system owners upon successful completion of validation testing.

Deactivation includes activities to notify users of system operational status. This phase also addresses recovery effort documentation, activity log finalization, incorporation of lessons learned into plan updates, and readying resources for any future events.

2.3 Roles and Responsibilities
The ISCP establishes several roles for NAC recovery and reconstitution support. Persons or teams assigned ISCP roles have been trained to respond to a contingency event affecting NAC.

Contingency Plan Role: ISCP Director / ISCP Director (Alternate)
NAC Role: NAC System Owner
Responsibilities:
- Overall management of the ISCP
- Confirming severity of a system disruption with the ISCP Coordinator
- Formal activation of the ISCP
- Notifying the ISCP Coordinator to begin formal assessment of the system disruption and develop recovery strategies
- Notifying the ISCP Coordinator to assemble the ISCP Recovery Teams and begin system recovery
- Overseeing annual testing, maintenance, and distribution of the plan
- Contacting vendors, contractors or other external organizations to assist in the system recovery as necessary
- Making initial assessment of system disruption (i.e., is it a minor system failure or a catastrophic event/major system failure)
- For a minor system failure:
  - Ensure that the incident is reported to NARA IT Operations and is logged in the trouble ticket system.
  - Assess the system disruption
  - Estimate system recovery time and communicate this information to the ISCP Director
  - Contact and instruct all necessary ISCP Recovery Team members to recover the failing system component(s)
- For a catastrophic event/major system failure:
  - Initiate full activation of the ISCP
  - Assess the system disruption and develop recovery recommendations, providing thorough assessment of catastrophic events/major system failures
  - Develop the damage assessment report and recommend recovery and resumption strategies to the ISCP Director for review and consideration
  - Contact all necessary ISCP Recovery Team Members and instruct them to assemble their teams to recover the failing system component(s)
  - Coordinate communications between the ISCP Recovery Teams and ISCP Director in recovering the system
- Complete an after-action report upon resumption of normal operations
- Ensure the annual testing, maintenance, and distribution of the plan

Contingency Plan Role: ISCP Coordinator / ISCP Team Member
NAC Role: All necessary technicians, administrators, and programmers from the major divisions
Responsibilities:
- Assisting in all recovery and resumption activities for minor system failures, as necessary
- Assisting in all recovery and resumption activities for catastrophic events/major system failures, as necessary

3.0 Activation and Notification
The Activation and Notification Phase defines initial actions taken once a NAC disruption has been detected or appears to be imminent. This phase includes activities to notify recovery personnel, conduct an outage assessment, and activate the ISCP. At the completion of the Activation and Notification Phase, NAC ISCP staff will be prepared to perform recovery measures.

3.1 Activation Criteria and Procedure
The NAC ISCP may be activated if one or more of the following criteria are met:
- The cloud service provider indicates an outage that will exceed the RTO of 48 hours;
- The type of outage indicates NAC will be down for more than 48 hours;
- The facility housing NAC is damaged and may not be available within 48 hours; and
- Other criteria, as appropriate.

The following persons or roles may activate the ISCP if one or more of these criteria are met:
- ISCP Director
- ISCP Director – Alternate

3.2 Notification
The first step upon activation of the NAC ISCP is notification of the NARA Helpdesk and appropriate mission/business and system support personnel. Contact information for appropriate POCs is included in Appendix A, Personnel Contact List and Appendix B, Vendor Contact List.
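The activation criteria in Section 3.1 and the first notification step in Section 3.2 can be sketched in code as a simple screening aid. This is a minimal illustration only, not part of the approved procedure: the `OutageReport` fields, the function names, and the treatment of a damaged facility with no restore estimate are assumptions introduced here, and the "other criteria, as appropriate" item is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional

RTO_HOURS = 48  # Recovery Time Objective stated in Sections 1.3 and 3.1

# Parties that Section 3.2 names as the first to be notified on activation.
FIRST_NOTIFICATIONS = (
    "NARA Helpdesk",
    "mission/business personnel",
    "system support personnel",
)


@dataclass
class OutageReport:
    """Hypothetical summary of what is known about a disruption."""
    provider_estimate_hours: Optional[float] = None  # outage length reported by the cloud service provider
    expected_downtime_hours: Optional[float] = None  # downtime implied by the type of outage
    facility_damaged: bool = False
    facility_restore_hours: Optional[float] = None   # None means availability is unknown


def criteria_met(report: OutageReport, rto_hours: float = RTO_HOURS) -> List[str]:
    """Return the Section 3.1 criteria that the report satisfies."""
    met = []
    if report.provider_estimate_hours is not None and report.provider_estimate_hours > rto_hours:
        met.append("provider outage exceeds RTO")
    if report.expected_downtime_hours is not None and report.expected_downtime_hours > rto_hours:
        met.append("outage type implies downtime beyond RTO")
    # Assumption: a damaged facility with no restore estimate "may not be available" in time.
    if report.facility_damaged and (
        report.facility_restore_hours is None or report.facility_restore_hours > rto_hours
    ):
        met.append("facility damaged and may not be available within RTO")
    return met


def parties_to_notify(report: OutageReport) -> List[str]:
    """List who to notify first if any criterion is met; empty otherwise."""
    return list(FIRST_NOTIFICATIONS) if criteria_met(report) else []
```

The sketch only flags which criteria appear to be met. The plan states that the ISCP "may be activated" when criteria are met, so the decision itself remains with the ISCP Director or the ISCP Director – Alternate.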
Notification of outage incidents for NAC that require activation of the ISCP will be performed through phone or email. The following information should be relayed to individuals during the notification phase:
- Nature of the emergency that has occurred or is impending;
- Loss of life or injuries;
- Any known damage estimates;
- Response and recovery details;
- Where and when to convene for briefing or further response instructions;
- Instructions to prepare for relocation for estimated time period (if applicable);
- Instructions to complete notifications (if applicable).

Notification will also be provided to the system stakeholders when the issue is resolved.

3.3 Outage Assessment
The cloud service provider is responsible for the outage assessment if it is within the scope of their services. That assessment will include the extent of the disruption and expected recovery time.

If the outage is outside the scope of the cloud service provider, a thorough outage assessment is necessary to determine the extent of the disruption, any damage, and expected recovery time. This outage assessment is conducted by the technical team. Assessment results are provided to the ISCP Coordinator to assist in the coordination of the recovery of NAC.

Outage Assessment includes activities to determine:
- The cause of the disruption;
- Potential for additional disruption or damage.

4.0 Recovery
The Recovery Phase provides formal recovery operations that begin after the ISCP has been activated, outage assessments have been completed (if possible), personnel have been notified, and appropriate teams have been mobilized. Recovery Phase activities focus on implementing recovery strategies to restore system capabilities, repair damage, and resume operational capabilities at the original or an alternate location.
At the completion of the Recovery Phase, NAC will be functional and capable of performing the functions identified in Section 2.1 of this plan.

4.1 Sequence of Recovery Activities
The cloud service provider is responsible for the sequence of recovery activities and will report status as appropriate to the ISCP Director.

4.2 Recovery Procedures
Because NAC uses a cloud-hosted architecture, the recovery of business operations depends on the nature of the contingency. In the event that the cloud provider incurs a partial loss, the cloud provider will default to their backup systems with no impact to the NARA user.

In the event that the NARA workspace becomes unavailable, the NARA user is able to access the cloud-hosted system from any other physical environment so long as there is Internet access.

NARA does not maintain any alternate processing capabilities or procedures in the event that the cloud service provider is unable to provide service. In the event of an outage, NARA will work with the provider to restore service. In the event that the cloud service provider is no longer able to provide service, the NAC business functions are suspended until a new vendor is identified and funded.

4.3 Recovery Escalation Notices/Awareness
While the recovery effort is underway, hourly status notification will be provided to the ISCP Director, ISCP Coordinator, and all appropriate stakeholders.

5.0 Reconstitution
Reconstitution is the process by which recovery activities are completed and normal system operations are resumed. If the original facility is unrecoverable, the activities in this phase can also be applied to preparing a new permanent location to support system processing requirements. A determination must be made on whether the system has undergone significant change and will require reassessment and reauthorization.
The phase consists of two major activities: validating successful reconstitution and deactivation of the plan.

5.1 Validation Data Testing
Validation data testing is the process of testing and validating data to ensure that data files or databases have been recovered completely at the permanent location. Detailed validation test procedures are provided in Appendix E, System Validation Test Plan.

5.2 Validation Functionality Testing
Validation functionality testing is the process of verifying that NAC functionality has been tested, and the system is ready to return to normal operations. Detailed functionality test procedures are provided in Appendix E, System Validation Test Plan.

5.3 Recovery Declaration
Upon successfully completing testing and validation, the system owner will formally declare that recovery efforts are complete and that NAC is in normal operations. NAC business and technical POCs will be notified of the declaration by the ISCP Coordinator.

5.4 Notifications (users)
Upon return to normal system operations, NAC users will be notified by the ISCP Director or ISCP Director – Alternate using the most appropriate predetermined notification procedures (e.g., email, broadcast message, or phone calls).

5.5 Cleanup
Cleanup is the process of cleaning up or dismantling any temporary recovery locations, restocking supplies used, returning manuals or other documentation to their original locations, and readying the system for a possible future contingency event.

Materials, plans, and equipment used during the recovery and testing must be returned to storage or their proper location. All sensitive materials must be destroyed or properly returned to safe storage, as appropriate.
Any personnel temporarily assisting other office locations during the disruption should be instructed by their respective team leaders to conclude their assistance and report to their primary sites and duties.

5.6 Offsite Data Storage – Moderate Availability Rating
It is important that all backup and installation media used during recovery be returned to the offsite data storage location.

5.7 Data Backup
As soon as reasonable following recovery, the cloud service provider will complete a new data backup.

5.8 Event Documentation
It is important that all recovery events be well-documented, including actions taken and problems encountered during the recovery and reconstitution effort, and lessons learned for inclusion and update to this ISCP. It is the responsibility of each ISCP team or person to document their actions during the recovery and reconstitution effort, and to provide that documentation to the ISCP Coordinator.

Types of documentation that should be generated and collected after a contingency activation include:
- Activity logs (including recovery steps performed and by whom, the time the steps were initiated and completed, and any problems or concerns encountered while executing activities);
- Functionality and data testing results;
- Lessons learned documentation; and
- After Action Report (including identification of any new components, including all information applicable to the Configuration Management (CM) documentation).

5.9 Deactivation
Once all activities have been completed and documentation has been updated, the ISCP Director/ISCP Coordinator will formally deactivate the ISCP recovery and reconstitution effort.
Notification of this declaration will be provided to the NARA Helpdesk and all business and technical POCs.

The following procedures will be followed to deactivate the ISCP for NAC:
- Verify that the application is functioning properly.
- Inform vested parties that the application has been restored and is functioning properly.
- Log details of the event and problems encountered with the ISCP.
- Incorporate problem solutions into later versions of the ISCP.
- Any personnel temporarily assisting other office locations during the disruption should be instructed by their respective team leaders to conclude their assistance and report to their primary sites and duties.

APPENDIX A: PERSONNEL CONTACT LIST

NAC ISCP Key Personnel

| Role | Name and title | Work | Home | Cellular | Email |
| ISCP Director | Jason Clingerman, System Owner, 8601 Adelphi Rd., College Park, MD 20740 | 301-837-3022 | | | jason.clingerman@ |
| ISCP Director – Alternate | Richard Steinbacher, Contracting Officer Representative | 301-837-3024 | | | richard.steinbacher@ |
| ISCP Coordinator | Jason Clingerman, System Owner | 301-837-3022 | | | jason.clingerman@ |
| ISCP Coordinator – Alternate | Richard Steinbacher | 301-837-3024 | | | richard.steinbacher@ |
| ISCP Team – Team Members | Adil Latiwala – Technical Point of Contact | 301-837-3161 | | 240-593-5831 | adil.latiwals@ |
| ISCP Team – Team Members | Gang Chen | | | | gang.chen@ |
| Information System Security Officer | Anton Davis | 301-837-0430 | | 301-755-7026 | william.davis@ |
| NARA Helpdesk | | 703-872-7755 | | | ITsupport@ |

APPENDIX B: VENDOR CONTACT LIST

Vendor Contact List

| Vendor | Component/Service | Contact Information |
| Amazon Web Services | Cloud | 866 216-1072 |
| Aspire | Content Processing | 703 953-2791 |
| Splunk | Search, monitor and analyze data | 855 775-8657 |
| InfoReliance | Cloud | 844 458-5433 |
| DSA | Service | 703 748-7601 |

APPENDIX C: DETAILED RECOVERY PROCEDURES
The cloud service provider is responsible for recovery procedures.

APPENDIX D: ALTERNATE PROCESSING PROCEDURES
NARA does not maintain any alternate processing capabilities or procedures in the event that the Cloud Service Provider (CSP) is unable to provide service. In the event of an outage, NARA will work with the CSP to restore service. In the event that the CSP is no longer able to provide service, the NAC business functions are suspended until a new vendor is identified and funded.

APPENDIX E: SYSTEM VALIDATION TEST PLAN
Once the system has been recovered, the following steps will be performed to validate system data and functionality:

Step 1: The NAC system administrator tests for basic NAC functionality during regular operation.

| Item | Description |
| Restoration | The NAC system administrator restores operations to the production server. |
| Test Functionality | Test functionality of the NARA NAC application: login, search, and retrieval capabilities. |
| Assessment procedures | Have the NAC system administrator use a NARA workstation to access the NAC application and perform the outlined tasks. |
| Personnel & tools | NAC system administrator, NARA workstation. |
| Successful outcome | Login, search, and retrieval are all functioning correctly. |
| Project Documentation | Standard Operating Procedures |

Step 2: The ISCP Director or alternate informs all users of the restoration of the system.

| Item | Description |
| Test Outcome | The production application is online and functional, which is determined by connecting to and testing the NAC application's functionality. |
| Users Notification | Contact all users and inform them of the restoration of the system. |
| Notification Process | Ask the system administrator to email all NAC users, notifying them of the resolution to the aforementioned issues. |
| Personnel & tools | ISCP Director |
| Successful outcome | All active NAC users are notified by e-mail that the NAC system is back online. |
| Project Documentation | NAC User Group E-mail List, or NARA personnel mailing list. |

APPENDIX F: ALTERNATE STORAGE, SITE AND TELECOMMUNICATIONS
The cloud service provider maintains alternate storage and failover capabilities for the service they provide. This is a requisite for FedRAMP certification.
There is no additional NARA alternate storage capability.

APPENDIX G: DIAGRAMS (SYSTEM AND INPUT/OUTPUT)
The NAC system diagram is provided below.

APPENDIX H: HARDWARE AND SOFTWARE INVENTORY
The inventory is maintained in Xacta and copied below.

APPENDIX I: INTERCONNECTIONS
NAC receives a weekly data export from DAS covering the records created, modified, or deleted during the previous week, for ingestion into NAC. The data is in XML format.

APPENDIX J: TEST AND MAINTENANCE SCHEDULE

| Step | Date Due by | Responsible Party | Date Scheduled | Date Held |
| Identify tabletop facilitator. | March 2020 | ISCP Coordinator | March 2020 | March 2020 |
| Develop tabletop test plan. | March 2020 | Tabletop Facilitator | March 2020 | March 2020 |
| Invite participants. | March 2020 | Tabletop Facilitator | March 2020 | March 2020 |
| Conduct tabletop test. | March 2020 | Facilitator, ISCP Coordinator, POCs | March 25, 2020 | March 25, 2020 |
| Finalize after action report and lessons learned. | April 2020 | ISCP Coordinator | April 2020 | April 2020 |
| Update ISCP based on lessons learned. | April 2020 | ISCP Coordinator | April 2020 | April 2020 |
| Approve and distribute updated version of ISCP. | May 2020 | ISCP Director, ISCP Coordinator | May 2020 | May 2020 |

APPENDIX K: ASSOCIATED PLANS AND PROCEDURES
Artifacts related to plans and procedures, such as the FIPS 199 categorization, BIA, and System Security Plan (SSP), are maintained as separate documents and can be found in Xacta.

APPENDIX L: BUSINESS IMPACT ANALYSIS
The Business Impact Analysis is maintained in Xacta and copied below.

APPENDIX M: DOCUMENT CHANGE PAGE
Modifications made to this plan since the last printing are as follows:

| Document Version | Description of contents / revision | Editor | Change Date |
| 0.1 | Initial draft | | 07/2009 |
| 1.0 | Final | | 08/14/2009 |
| 2.0 | Revision – modifications to include OPA and CERA Systems | | 05/27/2011 |
| 3.0 | Updated contact names and numbers | | 08/09/2011 |
| 3.1 | Annual review and update | | 09/30/2013 |
| 4.0 | Annual review and update | | 05/14/2014 |
| 4.1 | Annual review and update | | 03/27/2015 |
| 5.0 | Review and update | | 09/23/2016 |
| 6.0 | Review and update | ISSO - John M. Nelson | 05/12/2019 |
| 7.0 | FY20 Annual update. RTO in sections 1.2, 1.3, and 3.1 updated to match the FY20 BIA. Notifications updated to include NARA Helpdesk. Appendix A Contact List updated. Inventory and BIA updated with FY20 information. Test and maintenance dates updated to reflect the FY20 tabletop exercise. | ISSO – Anton Davis | 04/15/2020 |