PSO behavior

If a disaster occurs, an entire site can become unrecoverable; this is referred to in ECS as a permanent site outage (PSO). ECS initially treats the unrecoverable site as a temporary site failure, but only if the entire site is down or unreachable over the WAN. If the failure is permanent, the System Administrator must permanently fail over the site from the federation to initiate failover processing. This initiates resynchronization and reprotection of the objects that are stored on the failed site. The recovery tasks run as a background process. For more information about how to perform the failover procedure in the ECS Portal, see Fail a VDC (PSO).

  • Before triggering PSO (planned or unplanned), ensure that the site is off (all nodes are shut down). Ensure you fail the site from the federation and remove the site from all the replication groups.
  • If you want to reuse the racks from the PSOed site, physically disconnect the racks and reimage the nodes before you bring them online.
  • Do not reuse IP addresses or FQDN hostnames after they have been PSOed from a system.
  • ECS supports multi-site PSO. Multi-site PSO is limited to full replication replication groups (RGs), where all sites have all user data. To perform a multi-site PSO, contact ECS Remote Support.
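
The checks in the list above can be written down as a small pre-flight checklist. The following is only a sketch in Python: `SiteState`, `pso_preflight`, and `reuse_preflight` are hypothetical names invented for illustration, the values are assumed to be gathered by hand, and nothing here calls ECS or represents an ECS API.

```python
from dataclasses import dataclass


@dataclass
class SiteState:
    """Hypothetical, manually gathered snapshot of the site to be failed."""
    name: str
    nodes_powered_on: int                   # nodes still running at the site
    multi_site_pso: bool = False            # more than one site is being failed
    all_rgs_full_replication: bool = True   # all sites hold all user data


def pso_preflight(site):
    """Return unmet preconditions for triggering a PSO (empty list if none)."""
    problems = []
    if site.nodes_powered_on > 0:
        problems.append(
            f"{site.name}: {site.nodes_powered_on} node(s) still powered on")
    if site.multi_site_pso and not site.all_rgs_full_replication:
        problems.append(
            f"{site.name}: multi-site PSO requires full replication RGs")
    return problems


def reuse_preflight(racks_disconnected, nodes_reimaged,
                    new_addresses, psoed_addresses):
    """Return unmet preconditions for bringing reused racks back online."""
    problems = []
    if not racks_disconnected:
        problems.append("racks were not physically disconnected")
    if not nodes_reimaged:
        problems.append("nodes were not reimaged")
    # IP addresses and FQDN hostnames from the failed site must not come back.
    reused = sorted(set(new_addresses) & set(psoed_addresses))
    if reused:
        problems.append("reused addresses: " + ", ".join(reused))
    return problems


if __name__ == "__main__":
    site = SiteState(name="vdc2", nodes_powered_on=2)
    for problem in pso_preflight(site):
        print(problem)
```

An empty result from either function means only that none of the listed conditions were violated; it does not replace the cluster health validation performed by technical support.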

Before you initiate a PSO in the ECS Portal, it is recommended that you contact your technical support representative so that the representative can validate the cluster health. Data is not accessible until the failover processing is completed. You can monitor the progress of the failover processing on the Monitor > Geo Replication > Failover Processing tab in the ECS Portal. After failover processing has completed, but while the recovery background tasks are still running, some data from the removed site might not be readable until the recovery tasks fully complete.
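
The data access behavior described above depends on which phase the PSO is in. As a rough illustration only, the three phases can be modeled as follows; `Phase` and `read_expectation` are hypothetical names, and the phase must be read by the operator from the ECS Portal because this sketch does not query ECS.

```python
from enum import Enum


class Phase(Enum):
    """Hypothetical labels for the phases described in the text."""
    FAILOVER_PROCESSING = "failover processing in progress"
    RECOVERY_RUNNING = "failover processing complete, recovery tasks running"
    RECOVERY_COMPLETE = "recovery tasks complete"


def read_expectation(phase):
    """Map a PSO phase to the read behavior stated in the text."""
    if phase is Phase.FAILOVER_PROCESSING:
        return "data is not accessible"
    if phase is Phase.RECOVERY_RUNNING:
        return "some data from the removed site might not be readable yet"
    # The text describes no read limitation once recovery tasks finish.
    return "no read limitation is described for this phase"


if __name__ == "__main__":
    for phase in Phase:
        print(f"{phase.value}: {read_expectation(phase)}")
```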