Backup and Data Recovery
Introduction
Backups at SPU
What is Backed Up?
Backups target virtual machines (hereafter VMs) and Distributed File System (DFS) stores that are centrally managed by CIS.
- Examples of VMs include Banner, Talisma, PacketFence, and AD Domain Controllers. Some of these VMs are deemed "critical," meaning that they comprise core institutional data systems or integral infrastructure systems; other VMs are not critical but are convenient for some level of business operations. Critical and non-critical VMs are backed up differently, as detailed below.
- Examples of DFS resources include departmental file shares (aka Matthew) and documents stored by faculty and staff on SPU-managed end-user systems, specifically the "My Documents" and "Desktop" folders.
- Banner is backed up on the same schedule as other systems (see VMs above); however, additional backups of the Banner database are taken on the 10th Day of Fall, Winter, and Spring quarters. These backups are retained for a minimum of 7 years to preserve academic and financial records and history.
- Certain elements of the institutional record are not stored on campus but are hosted off-site in cloud-based service facilities. Examples include O365 (email, SharePoint, etc.) and Canvas (online learning system). These systems are not backed up by CIS beyond what the hosting service provides.
WHERE are data backed up?
Our backup strategy first calls for geographically separate locations on redundant hardware platforms housed on campus. These local backups ensure fast and economical data recoverability in the event of limited incidents that affect either the building/room or the hardware platform on which the primary data source resides. Local backups of VMs and copies of DFS stores are maintained on campus in a separate building from their production server location. In this manner, if a problem occurs on the primary data host, recovery of affected systems can be done most economically and expediently in-house.
In addition to local copies, critical data are also backed up to "Cloud" or Internet-based storage facilities, locations that are far from the SPU campus. Cloud facilities feature high levels of redundancy, scalability, and uptime - oftentimes well beyond what we can provide locally. Typically, data "pushed to the cloud" are available for worst-case recovery scenarios - those instances when all local data (both primary and secondary backups) are destroyed and/or unrecoverable. Retrieval of cloud data is considerably more expensive and involves longer time periods for host/data restoration; this cloud storage is therefore seen as "last resort recovery."
HOW are data backed up?
Several methods are currently deployed in our data backup process: Veeam Enterprise Manager (VEM), Automated Scripts (DFS), and CloudBerry (DFS).
- Veeam "Enterprise Manager" provides a robust, managed process to schedule and coordinate backup jobs both locally (Marston & Demaray) and to cloud service locations. Backups via Veeam run as nightly, automated processes with advanced reporting to alert system administrators when errors occur. Veeam encrypts target data (both at rest and in motion) to ensure confidentiality and integrity.
- A Veeam server (VM) is used to back up VMs (and their application data) to either the non-critical or the critical backup server; backup copy jobs from each of these hosts are replicated to yet another server. The critical server is also used to transfer critical VM backups to cloud storage - currently Azure.
- Automated Scripts: A second method of backup involves the copying of DFS shares and permissions from their hosted location in the CIS server room to dedicated hardware in Demaray Hall via automated scripts that are written and maintained by CIS staff.
- CloudBerry: A final backup method is called "CloudBerry." CloudBerry is used to transfer DFS files from Dell servers to cloud storage (currently hosted by BackBlaze). BackBlaze is simple file-level storage - a desirable method to contain costs associated with retrieval/recovery of individual files from the cloud.
- Banner 10th Day snapshots are backed up via a manual process and then transmitted to the cloud for long-term off-site preservation. As the title implies, snapshots are taken on the 10th day of Fall, Winter and Spring quarters. Presently, the University utilizes the BackBlaze Cloud Service for storage of long-term 10th day statistics.
The architecture for our on-prem and cloud backups is graphically illustrated here: Backup Diagrams (restricted access)
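As a rough companion to the diagram, the methods described above can be summarized as a small lookup table. This is only an illustrative sketch: the class name, dictionary keys, field names, and the cloud_destination helper are hypothetical labels chosen for this example, not names used by Veeam, CloudBerry, or CIS scripts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BackupRoute:
    tool: str                    # method used to take the backup
    local_copy: Optional[str]    # on-campus destination, if one is described
    cloud_target: Optional[str]  # off-site destination, if any


# Hypothetical summary of the methods listed in this section.
ROUTES = {
    "critical_vm": BackupRoute("Veeam", "critical backup server", "Azure"),
    "non_critical_vm": BackupRoute("Veeam", "non-critical backup server", None),
    "dfs_share": BackupRoute("automated scripts + CloudBerry",
                             "dedicated hardware in Demaray Hall", "BackBlaze"),
    # No local destination is described for 10th Day snapshots.
    "banner_10th_day": BackupRoute("manual process", None, "BackBlaze"),
}


def cloud_destination(data_kind: str) -> Optional[str]:
    """Return the cloud provider for a kind of data, or None if it stays local."""
    return ROUTES[data_kind].cloud_target


if __name__ == "__main__":
    for kind, route in ROUTES.items():
        print(f"{kind}: {route.tool} -> cloud: {route.cloud_target}")
```

Keeping the mapping in one table makes it easy to see at a glance that only critical VMs, DFS shares, and Banner 10th Day snapshots leave campus.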
WHEN are data backed up?
- All VMs and DFS shares are backed up nightly. Banner 10th Day snapshots are backed up separately on or shortly after the 10th day of the quarter.
- Local backups for DFS shares occur nightly via scripted processes between hosts in Marston and Demaray. Lower-cost storage (Dell) is currently used for DFS local backups.
- Off-site transfers: When initiating cloud services, an initial "seeding" of the cloud storage takes place. This process may involve high bandwidth and resource commitments as files must be transferred to the cloud. Similarly, initial seeding of new local hosts may be time and resource intensive depending on the application and data at play.
- Once the initial seeding is complete, nightly backups occur incrementally; i.e., only changed data is transmitted, thereby decreasing the transfer and processing costs associated with the backup process.
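The idea behind incremental backups can be illustrated with a minimal sketch that compares file hashes between two runs and reports only the files that are new or changed. This is an assumption-laden illustration of the general technique, not a description of how Veeam, CloudBerry, or the CIS scripts detect changes; those tools use their own mechanisms. All function names here are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its content digest."""
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(previous: dict, current: dict) -> list:
    """List files that are new or whose content differs from the last run."""
    return sorted(
        name for name, digest in current.items()
        if previous.get(name) != digest
    )
```

On the first run the previous manifest is empty, so every file is reported; this corresponds to the initial "seeding." On later runs only new or modified files appear in the list, which is why nightly transfers are much smaller.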
How Long? Backup Retention Schedule
- Backups of all VMs are retained for up to 60 days locally. This means that at any given time we can restore backed-up VMs at multiple points within the 60-day window. There are no locally stored restore points beyond 60 days.
- "Critical" VMs are also backed up to the cloud; these backups are retained for a period of 60 days.
- DFS shares backed up to the cloud include incremental backups for 30 days. At any given time we can restore backed-up DFS files at multiple points within the 30-day window. Currently we utilize the BackBlaze Cloud Storage facility for DFS off-site backups.
- Banner 10th Day Backup: Banner is backed up on the same schedule as other systems; however, an additional snapshot of the Banner database is taken on the 10th Day of Fall, Winter, and Spring quarters. These backups are retained for a minimum of 7 years to preserve academic and financial records and history.
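The retention windows above can be expressed as a short sketch for checking whether a restore point should still exist. The dictionary keys and function names are hypothetical, and treating the last day of the window as still restorable is an assumption; the schedule above only says "up to" 60 or 30 days.

```python
from datetime import date

# Retention windows in days, taken from the schedule above.
RETENTION_DAYS = {
    "vm_local": 60,
    "critical_vm_cloud": 60,
    "dfs_cloud": 30,
}
BANNER_10TH_DAY_MIN_YEARS = 7


def is_restorable(kind: str, backup_date: date, today: date) -> bool:
    """True if a backup of this kind falls inside its retention window.

    Assumption: the final day of the window counts as restorable.
    """
    age_days = (today - backup_date).days
    return 0 <= age_days <= RETENTION_DAYS[kind]


def banner_snapshot_earliest_expiry(snapshot_date: date) -> date:
    """Earliest date a Banner 10th Day snapshot could be removed."""
    target_year = snapshot_date.year + BANNER_10TH_DAY_MIN_YEARS
    try:
        return snapshot_date.replace(year=target_year)
    except ValueError:
        # Snapshot taken on Feb 29 and the target year is not a leap year.
        return snapshot_date.replace(year=target_year, day=28)
```

For example, a VM backup taken 45 days ago is still inside the 60-day local window, while a DFS cloud backup of the same age has already aged out of its 30-day window.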
What Constitutes "Critical" VM Designation?
As noted above, only VMs that are classified as "critical" are backed up to our cloud provider. While the determination of "critical" is made on a case-by-case basis, the following criteria are to be considered in the designation process:
- Legal: Servers/resources that contain legal data involving the University's formal business, academic, or financial records for which the University is obligated to follow statutory requirements for retention and auditing.
- Operational: Servers/resources that hold data needed as part of ongoing operations and that would be either impossible or excessively expensive to recreate.
- Infrastructure Services: Servers/resources supporting critical infrastructure services with complex system configurations, event audit logs, and/or system dependencies.
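The three criteria can be recorded in a simple checklist structure, sketched below. Because the designation is made case by case, this sketch only reports which criteria apply as input to that decision; it does not make the decision. The class, field, and function names, and the example VM name, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VmAssessment:
    name: str
    holds_legal_records: bool         # Legal criterion
    hard_to_recreate_data: bool       # Operational criterion
    supports_infrastructure: bool     # Infrastructure Services criterion


def criteria_met(vm: VmAssessment) -> list:
    """Return the names of the designation criteria that apply to a VM."""
    checks = [
        ("Legal", vm.holds_legal_records),
        ("Operational", vm.hard_to_recreate_data),
        ("Infrastructure Services", vm.supports_infrastructure),
    ]
    return [label for label, applies in checks if applies]


def is_candidate_for_critical(vm: VmAssessment) -> bool:
    """True if at least one criterion applies; the final call remains manual."""
    return bool(criteria_met(vm))


if __name__ == "__main__":
    example = VmAssessment("example-vm", True, False, True)
    print(example.name, criteria_met(example))
```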
Data security
With the exception of the CIS Systems Administrators who administer the backup system, only individuals themselves have access to their institutional data stored on University resources. CIS Systems Administrators must have access to the data in order to operate the service; however, they are bound by strict confidentiality agreements (see Privileged System Access Policy) and by University policy to protect the security of the data. All backups are kept in a secure facility on campus and later off-site at cloud-hosted service providers who are contractually obligated to ensure the confidentiality, availability, and integrity of University data.