Can’t find what you’re looking for?
Want to know more about the Nectar Cloud, how it works, who can use it, how to join, or who to contact for help? See the Frequently Asked Questions below for more information.
Introduction to cloud
What is Cloud Computing?
Cloud computing is a metaphor for doing computing tasks on a computer infrastructure run by someone else “on the internet”. (The origins of the term are uncertain, and there is no single precise definition.) The difference between cloud computing and the classic IT service model is that the infrastructure you use is typically owned and run by service providers that are external to your organization.
What is a Virtual Machine?
Virtual machines (VMs) allow a physical computer to be shared among a number of users, with each user appearing to have exclusive access to the machine. Virtual machines are typically implemented using software known as a “hypervisor”, which mediates each virtual computer’s access to the physical computer hardware and stops the VMs from interfering with each other.
Is cloud computing like HPC?
Not really. Typical cloud computing systems are built using standard computing hardware that is optimized for economical performance rather than for speed. By contrast, High Performance Computing (HPC) systems tend to provide high-end processors, providing some combination of large numbers of cores, lots of memory, high-performance inter-processor communication and high performance disk I/O.
Despite this, a lot of computational tasks that run on HPC systems will run just fine on a cloud computing facility. If you want advice on this, please contact QRIScloud support, and we can arrange for an eResearch Analyst to look at your computational problem, and help you figure out the best way to address it.
What is the Nectar Cloud?
The Nectar Cloud is a federation of cloud computing facilities located in each of the Australian State capital cities, and Canberra. The infrastructure is implemented and managed using the OpenStack cloud computing framework.
What is OpenStack?
“OpenStack is a set of software tools for building and managing cloud computing platforms for public and private clouds. Backed by some of the biggest companies in software development and hosting, as well as thousands of individual community members, many think that OpenStack is the future of cloud computing. OpenStack is managed by the OpenStack Foundation, a non-profit which oversees both development and community-building around the project” – source.
Nectar Cloud project trials
What is a project trial?
A Project Trial (PT) is a Nectar project with limited resources and time-span that is intended to let you try out the cloud before you commit to using it. A PT has the resources for running up to 2 instances using up to 2 VCPUs, and a time limit of 3 months.
How do I get a project trial?
Simply visit the Nectar dashboard. You will first be directed to your home institution’s AAF login page. Then you will be asked to read the Nectar terms and conditions. Finally, a project trial will be created automatically for you.
How do I apply for Nectar Cloud resources?
Visit the Nectar dashboard, and fill in and submit an application using the Request an Allocation page. You will need to set out your resource requirements and your project duration, and provide a research description and a technical justification for your resource request. Nectar Cloud resources are allocated based on the research and technical merit of your application, the resources you are applying for, and resource availability.
We encourage you to contact your local Cloud contact for support if you need help in making the application. Then an eResearch analyst can advise and assist you.
What is a Nectar Allocation?
A Nectar allocation is effectively permission for you and your team to use up to a certain level of Nectar cloud resources over a particular period of time. The allocation provides the resource quotas for a Nectar project.
What resources should I apply for?
The basic computational resources that you need to apply for are Instances, VCPUs and VCPU-hours. The basic computational resources come with a modest amount of disk storage (see Flavor) that will be associated with your virtual machines. In addition, you can apply for VM independent Nectar storage in the form of Object Storage and/or Volume Storage.
Does a Nectar allocation guarantee me access?
Unfortunately, no. A Nectar allocation gives you quotas for a given number of Instances and VCPUs. However, when you attempt to launch an Instance, it can fail with this message:
Failed to launch instance [Error: No valid host was found.]
This can be caused by a variety of things, but a common cause is that OpenStack could not find the required number of free cores or the required amount of memory in the specified Availability Zone. If this happens, you could try launching a smaller Instance, or launching in a different (less full) Availability Zone.
What is an Instance?
An “Instance” is Nectar terminology for a virtual machine running on the Nectar Cloud OpenStack infrastructure. An “Instance” runs on a “compute node”; i.e. a physical computer populated with processor chips, memory chips and so on. OpenStack does not support instances that span multiple compute nodes, so the theoretical maximum dimensions of an instance are determined by the compute node hardware.
What is a VCPU?
The term VCPU is short for “virtual CPU”; i.e. a virtual processor for a virtual machine. A virtual machine can have a number of VCPUs. At first glance, a VCPU is like a “core” on a typical modern desktop or laptop computer. The difference is that a VCPU can actually represent a fractional share of a core on the physical machine; see below.
What are VCPU hours?
VCPU hours is an OpenStack measure of the resources that your instances are using. The VCPU hours measure for an instance is calculated as:
“number VCPUs” / “over-commit ratio” x “lifetime of instance”
where the lifetime of an instance starts when the instance is created, and ends when it is terminated.
Note that the VCPU hours measure includes time when an instance is paused or shutdown. That is because when an instance exists on a compute node in a paused or shutdown state, it stops other instances being launched on that node. Nectar OpenStack is designed on the assumption that you should be able to unsuspend or power up an instance at any time, and have the “level of service” that you expect. (This is equivalent to Amazon EC2 Reserved Instances.)
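As a worked illustration of the formula above, the following Python sketch computes the VCPU hours for a single instance. The function name and the example figures (4 VCPUs, an over-commit ratio of 2, a 30-day lifetime) are assumptions chosen for illustration; they are not actual Nectar values.

```python
def vcpu_hours(num_vcpus, overcommit_ratio, lifetime_hours):
    """VCPU hours = number of VCPUs / over-commit ratio x lifetime of instance."""
    if overcommit_ratio <= 0:
        raise ValueError("over-commit ratio must be positive")
    return num_vcpus / overcommit_ratio * lifetime_hours


# Hypothetical instance: 4 VCPUs, over-commit ratio of 2, alive for 30 days.
# The lifetime runs from creation to termination, so time spent paused or
# shut down is included.
print(vcpu_hours(num_vcpus=4, overcommit_ratio=2.0, lifetime_hours=30 * 24))  # 1440.0
```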
What is a Flavor?
A flavor is OpenStack terminology for a “virtual hardware template”. It specifies the dimensions of an Instance; e.g. the number of VCPUs, amount of memory and the size of the local file systems. The standard Nectar flavors are as follows:
|Name|VCPUs|Memory|Primary disk|Ephemeral disk|Notes|
|m2.tiny|1|768MB|5GB|0GB|1, 2, 3|
1. The m2 flavors may be subject to node-specific overcommit.
2. The m2.xsmall and m2.tiny flavors are for small footprint webservers, and will typically have 2x and 4x CPU scaling relative to other flavors.
3. The primary disk on the m2.tiny flavor is too small for some of the standard Nectar images.
In addition, there are various private and/or node-specific flavors that are made available to selected tenants.
What is an overcommit?
The economics of cloud computing are based on the observation that most servers that run applications are underutilised. This means that you can often “commit” more resources to virtual machines than are available as physical resources on the computer hardware. This is known as “overcommit”.
In the Nectar OpenStack world, there are 3 primary resources that are committed to virtual machines:
- Processor cores can be shared (as VCPUs). Overcommit is typically implemented by time slicing, so that each virtual machine gets a fair share according to its CPU scaling relative to other active VMs.
- Memory can be shared using the system’s virtual memory hardware. Memory is divided into pages. Resident virtual memory pages are mapped to physical memory pages, and non-resident pages are stored on disk. If an application on a virtual machine tries to access a virtual memory page that is not currently resident, the hypervisor fetches the page from disk.
- Disk overcommit only makes sense if virtual machines don’t write to some of the disk space that has been allocated to them. Effectively the VMs are sharing empty disk blocks.
What is an Availability Zone?
An Availability Zone (or AZ) is an OpenStack term for a collection of physical compute and storage resources that is managed as a “cell”. For example:
The “QRIScloud” availability zone contains the Nectar resources in the Polaris Data Centre in Brisbane, managed by QCIF.
The eResearch SA availability zone contains the Nectar resources in the University of South Australia data centre.
What is an Image?
An Image is an OpenStack term for a bootable disk image that can be used to create a virtual machine. An image will typically include a Linux operating system kernel, libraries, utilities and configurations. It may also include other software and/or data.
Images are managed using the OpenStack Glance service. They are typically created by taking a snapshot of an existing virtual machine.
What Images are available?
You can see what public images are available by opening the Nectar Dashboard in a web browser and looking at the “Images” panel. You will see tabs for 4 categories of image:
- Nectar Official images
- Project images
- Images that have been shared with your project.
- Public images.
The Nectar Official images are (typically) installs of various Linux distributions. These images are refreshed regularly by the Nectar Core Services team to incorporate the latest patches. They typically consist of a “headless” server install, with a small amount of Nectar specific tailoring. (For example, the SSH daemon is configured in a particular way, and fail2ban is installed to fend off hackers.)
What Operating Systems are available?
Nectar Cloud provides images for various releases of the following common Linux distributions: Ubuntu, Debian, CentOS, Scientific Linux, Fedora and OpenSUSE.
Only versions that are still maintained by the supplier are made available as Nectar images. (When an OS version goes off-maintenance, it will no longer receive timely security patches. You would be advised to upgrade.)
What Linux images are recommended?
The best choice depends on what you are trying to do. For instance:
- If you want an instance that you can use for a long time, or that you want to use to run a server, you might choose a distribution with a long support lifetime.
- If you want to use the latest versions of standard Linux applications, you might pick a distribution with a short release cycle.
- Your choice may be constrained by the domain applications you need to run.
- It may be down to individual preference; e.g. based on what you are most familiar with.
- Ubuntu’s primary focus is to provide Linux for end users.
- Ubuntu LTS (long term support) has major releases roughly every 2 years, and minor releases every 6 months to consolidate the accumulated patches.
- Debian is an “open source purist” distro. No proprietary binary drivers, etc.
- CentOS and Scientific Linux are derived from Red Hat Enterprise Linux (RHEL), and their primary focus is on providing a stable platform for running services. Release cycles are tied to RHEL release cycles: major releases every 3 to 4 years, minor releases every 6 to 12 months. End-of-life is 10 years from a major release.
- Fedora has a reputation for being a “bleeding edge” distribution. Fedora is supported by Red Hat and serves as a proving ground for new developments before they go into RHEL.
- OpenSUSE is based on SUSE Linux by SUSE / Novell / Attachmate.
For Ubuntu and OpenSUSE, check the distro name to see if it is a “long term support” release.
All of the above Linux families push security patches in a timely fashion.
Introduction to cloud storage
What disk storage is available on an Instance?
Three kinds of disk storage are available to a Nectar instance:
- Local Storage: typically disk drives or solid state drives on the compute nodes.
- Volume Storage: typically implemented using Ceph-based file servers
- Object Storage: objects accessible via Swift or Amazon S3 APIs.
In addition, some collections can be NFS mounted on a Nectar instance.
What are the Primary and Ephemeral file systems?
The primary file system for a Nectar instance contains the instance’s operating system, application software and (typically) the user home directories. The ephemeral file system is mounted as “/mnt” by default, and provides instance-local disk space for general use.
When you launch a Nectar instance (from an image), primary and ephemeral disk space is allocated for the exclusive use of the instance. The sizes of these spaces are determined by the flavor that you launch, and file systems are built on them by the launch process. (The primary file system is initialised from the launch image, and the ephemeral file system is created as empty.)
The primary and ephemeral file systems are tied to the instance. Both will persist if an instance is shut down and rebooted, but both will “go away” when the instance is terminated. The key difference between primary and ephemeral space is that the former can be preserved in an instance snapshot, but the latter cannot. That is the rationale for calling the latter “ephemeral”.
(Beware that ephemeral file systems for instances in the NCI AZ are currently implemented with a non-journalling file system, and are therefore vulnerable to significant file system corruption in the event of a power failure. Some users have lost data because of this.)
What is Volume Storage?
Volume storage in the OpenStack world is provided by the Cinder service. It consists of network accessible Volumes (virtual disks) that can be attached to an instance in the same availability zone, and used to host a file system.
- Volumes cannot be shared. A volume can only be attached to one instance at a time.
- Volumes can be snapshotted.
- Volumes have a lifetime that is independent of an instance.
- Your volume storage usage is subject to a separate quota. You need to request quota in a specific availability zone as part of your NeCTAR allocation request.
What is Object Storage?
Object Storage works differently to local storage, volume storage and NFS storage. Unlike those forms, Object Storage cannot be mounted as a regular file system. Instead the application uses HTTP / HTTPS to “get” and “put” the objects in the store.
Nectar Object Store is provided using the OpenStack Swift service. Nectar Swift is configured so that each object is replicated at 3 locations in the Nectar federation’s Swift cluster.
Nectar Object Storage is also accessible via Amazon S3 compatibility APIs.
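To make the “get” and “put” model more concrete, here is a minimal Python sketch that builds the HTTP requests an application might send to a Swift-style object store. It assumes the usual Swift layout of `<storage URL>/<container>/<object>` with an `X-Auth-Token` header; the storage URL, token, container and object names shown are placeholders, not real Nectar values, and the helper function names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request


def object_url(storage_url, container, name):
    """Build the URL of one object: <storage_url>/<container>/<object>."""
    return f"{storage_url.rstrip('/')}/{quote(container)}/{quote(name)}"


def put_request(storage_url, token, container, name, data):
    """HTTP PUT request that would store `data` (bytes) as an object."""
    return Request(object_url(storage_url, container, name), data=data,
                   method="PUT", headers={"X-Auth-Token": token})


def get_request(storage_url, token, container, name):
    """HTTP GET request that would fetch the object back."""
    return Request(object_url(storage_url, container, name),
                   method="GET", headers={"X-Auth-Token": token})


# Placeholder values: substitute the storage URL and token for your project.
req = put_request("https://swift.example.org/v1/AUTH_myproject",
                  "my-token", "results", "run1/output.csv", b"a,b\n1,2\n")
print(req.get_method(), req.full_url)
# Passing `req` to urllib.request.urlopen() would actually send it.
```

Note that nothing is mounted: each object is addressed by URL and transferred whole, which is why Object Storage cannot be used like a regular file system.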
Is Nectar storage backed up?
None of the forms of Nectar storage are backed up. Backups are the responsibility of the user.
Is Nectar Cloud storage safe?
Object Storage is relatively safe due to the fact that the data is (eventually) replicated to 3 (or more) locations. However, this replication does not occur immediately, and there have been situations where all complete replicas of an object are (temporarily) offline. In addition, there is no protection in scenarios where you mistakenly delete or overwrite objects.
In all other cases, it depends on how the respective Nectar nodes have implemented their data centres in general, and their storage platforms more specifically.
- All forms of storage associated with an instance should persist safely over a normal shutdown / restart of the instance, provided that the shutdown is performed cleanly. (However, as noted above, the instance’s primary and ephemeral file systems do not survive when the instance is terminated.)
- If the underlying storage platform has built-in redundancy (e.g. RAID, or Ceph replication) then there is a degree of protection against loss due to media failure; e.g. hard disk errors.
- If the file systems that run on the disk storage are “journalled”, then there is a degree of protection against file system damage due to an unclean shutdown; e.g. power failure.
However, there have been cases where instances have been terminated by accident, or file systems have been deleted or destroyed due to “operational issues”. It is therefore extremely unwise to assume that your data will always be safe. Certainly, neither Nectar nor any of the Node operators can guarantee this.
Using the Nectar Cloud
How do I access the Nectar Cloud Dashboard?
The following steps should get you to your Nectar Cloud dashboard.
- Open https://dashboard.rc.nectar.org.au/ in your web browser.
- When you see the Australian Access Federation (AAF) login page:
  - select your home institution (if required), and
  - click “Login”
- At your home institution’s AAF login portal, enter your institutional login details, and click the login or sign on button. (Details are institution specific.)
- If you see an AAF page asking for authorisation to release some details, authorise it.
You should now see the Nectar Cloud dashboard. If you have not been there before, we recommend you take time to explore the menus.
Why can't I see the quotas for my allocation?
The most likely reason is that you have the wrong Nectar Project selected. In the top banner of the Dashboard at the left end, there is a pull-down project selector. Make sure that you have your allocated project selected rather than your “PT” project.
If that doesn’t help, please contact our Support team. It is possible that there has been a problem with project provisioning.
What do I need to do before I start an Instance?
Suppose you have been granted an allocation, or you are using your “PT” project. If you go straight to the Dashboard and launch an instance from one of the standard Nectar OS images, the launch will succeed, but you won’t be able to log in to the new instance.
To avoid this, you need to do two things:
- Create or upload an SSH keypair to your Nectar account.
- Configure a Security Group to open the SSH port to allow you to connect from the outside.
What is an SSH keypair?
SSH stands for “Secure Shell”. It is the protocol that you use to get a regular terminal session to your instance.
The SSH software supports so-called “public key authentication” as one of the ways that the computer can “authenticate” you when you login. (In this context, the term “authenticate” means that the computer is checking that you are who you claim to be.) Public key authentication works using key pairs, consisting of a public key and a private key. The public key is something that you provide to other people or systems that are likely to want to authenticate you. The private key is a secret that you keep to yourself.
The basis of public key authentication is that the public and private key are mathematically related, and it is possible for the SSH software to prove that you hold the private key that corresponds to your public key. This knowledge is deemed to be sufficient to authenticate you, just like your knowledge of your password is deemed to be sufficient to authenticate you on a conventional password-based system.
The SSH configurations on the standard Nectar OS images are such that you must use public key authentication to log in using SSH over a network connection. (Authentication using passwords would make your system too vulnerable to hacking by repeatedly trying to guess your password.)
What SSH client tool do I need?
You need an SSH client installed on your work computer in order to connect to a newly launched instance:
- On Windows, the recommended SSH client is PuTTY.
- On Mac and Linux, the recommended SSH client is the “ssh” command.
How do I create an SSH keypair?
The easy way to do it (for Nectar) is to generate a keypair using the Nectar Dashboard. Go to the “Access & Security > Key Pairs” tab, and then click “Create Key Pair”. A keypair will be generated and provided as a “pem” file that is suitable for use with the Windows “putty” command. You can also create the SSH keypair using putty, and upload the public key to the Dashboard.
On Linux and Mac, you can generate an SSH key using the “ssh-keygen” command.
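As a rough sketch of the Linux and Mac route, the Python snippet below assembles an “ssh-keygen” command line and, if asked, runs it. The key type and size (4096-bit RSA), the file name and the comment are assumptions chosen for illustration, and the helper function names are hypothetical.

```python
import subprocess
from pathlib import Path


def keygen_command(key_path, comment):
    """Return the ssh-keygen argument list for a new 4096-bit RSA keypair."""
    return ["ssh-keygen", "-t", "rsa", "-b", "4096",
            "-f", str(key_path), "-C", comment]


def generate_keypair(key_path, comment):
    """Run ssh-keygen; it prompts for a passphrase and writes
    <key_path> (private key) and <key_path>.pub (public key)."""
    subprocess.run(keygen_command(key_path, comment), check=True)
    return Path(f"{key_path}.pub")


# Hypothetical file name and comment; nothing is generated until
# generate_keypair() is called.
print(" ".join(keygen_command(Path.home() / ".ssh" / "nectar_key", "nectar")))
```

The contents of the resulting “.pub” file are what you would upload to the Dashboard; the private key file stays on your own computer.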
What is a Security Group?
Network access from the outside to an OpenStack instance is controlled by a network firewall on the host that runs the instance. A Security Group is a container for a group of access rules that let specific kinds of network traffic from specific places through the firewall. Each rule specifies:
- the direction; e.g. ingress or egress
- the network type; e.g. ethernet
- the protocol family; e.g. TCP, UDP or ICMP
- the port number
- the external IP address range (in CIDR notation).
CIDR notation consists of a 4-part IP address and a netmask size; e.g. “n.n.n.n/n”.
- “192.0.2.0/24” means IP addresses from “192.0.2.0” to “192.0.2.255”.
- “192.0.2.10/32” means a single IP address (“192.0.2.10”)
- “0.0.0.0/0” means all IP addresses.
See Wikipedia for more details on CIDR notation.
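If you want to check what a particular CIDR covers, Python’s standard “ipaddress” module can expand it for you. This is a small sketch; the addresses are documentation-reserved examples chosen for illustration, and the helper name is hypothetical.

```python
import ipaddress


def cidr_range(cidr):
    """Return (first address, last address, number of addresses) for a CIDR."""
    net = ipaddress.ip_network(cidr)
    return str(net[0]), str(net[-1]), net.num_addresses


for cidr in ("192.0.2.0/24", "192.0.2.10/32", "0.0.0.0/0"):
    first, last, count = cidr_range(cidr)
    print(f"{cidr}: {first} to {last} ({count} addresses)")
```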
How do I open the SSH port?
There is more than one way, but the simple way to create an SSH access rule using the Nectar Dashboard is as follows:
- Select the “Access & Security” panel.
- Select the “Security Groups” tab.
- Click “Create Security Group”.
- Fill in a security group name (e.g. “SSH”) and a description and click “Create”.
- Click “Manage Rules” for the newly created group.
- Click “Add Rule”.
- Select “SSH” in the Rule selector.
- The default CIDR is “0.0.0.0/0” … which allows access from all IPv4 addresses. If you want to restrict SSH access to specific places (a good idea!) then change the CIDR.
- Click “Add”.
Remember to associate the security group with the instance when you launch it.
Note that you can change the rules in a security group after the fact.
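The effect of an SSH rule can be pictured with a simplified model. The Python sketch below is not how OpenStack implements its firewall; the “Rule” class and “allows” function are hypothetical names, and the addresses are documentation-reserved examples. It only shows how the direction, protocol, port and CIDR in a rule combine to decide whether a connection is let through.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    direction: str  # "ingress" or "egress"
    protocol: str   # "tcp", "udp" or "icmp"
    port: int       # e.g. 22 for SSH
    cidr: str       # external IP address range in CIDR notation


def allows(rule, direction, protocol, port, remote_ip):
    """True if the connection matches every field of the rule."""
    return (rule.direction == direction
            and rule.protocol == protocol
            and rule.port == port
            and ipaddress.ip_address(remote_ip) in ipaddress.ip_network(rule.cidr))


# SSH restricted to one (example) address range rather than 0.0.0.0/0.
ssh_rule = Rule("ingress", "tcp", 22, "192.0.2.0/24")
print(allows(ssh_rule, "ingress", "tcp", 22, "192.0.2.77"))    # inside the range
print(allows(ssh_rule, "ingress", "tcp", 22, "198.51.100.5"))  # outside the range
```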
How do I start a NeCTAR Instance?
Using the Nectar Cloud dashboard:
- Select the “Instances” panel.
- Click on “Launch Instance” to start the launch wizard.
- In the launch wizard’s “Details” tab:
  - type in an instance name
  - select a flavor
  - set the instance count to 1
  - set the instance boot source to “boot from image”
  - select an image
- In the launch wizard’s “Access & Security” tab:
  - make sure that your keypair is selected
  - select (at least) a security group that allows SSH access from your computer
- In the launch wizard’s “Availability Zone” tab:
  - use the selector to choose the availability zone you want your instance to run in.
- Click “Launch”.
The instance should launch in a couple of minutes, and the Dashboard should update as the launch procedure progresses.
If you don’t select an availability zone, Nectar OpenStack will try to launch in the zone with the most free compute resources.
How do I login to the instance?
When the instance has launched, the Dashboard will display its IP address. To login to the instance, use the SSH client tool on your computer to connect to: “<login>@<ip-address>”, where “<login>” is:
- “ubuntu” for an Ubuntu instance
- “debian” for a Debian instance, or
- “ec2-user” for a Fedora, CentOS or Scientific Linux instance.
Note that your SSH client will need the private key corresponding to the keypair you selected when launching the instance.
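For the “ssh” command on Mac and Linux, the pieces fit together roughly as in the Python sketch below. The login names come from the list above; the IP address and private key path are placeholders, and the function name is hypothetical.

```python
# Default login names for the standard images, as listed above.
DEFAULT_LOGIN = {
    "ubuntu": "ubuntu",
    "debian": "debian",
    "fedora": "ec2-user",
    "centos": "ec2-user",
    "scientific linux": "ec2-user",
}


def ssh_command(distro, ip_address, private_key_path):
    """Build the ssh argument list for logging in to a new instance."""
    login = DEFAULT_LOGIN[distro.lower()]
    return ["ssh", "-i", private_key_path, f"{login}@{ip_address}"]


# Placeholder IP address and key path.
print(" ".join(ssh_command("Ubuntu", "203.0.113.25", "~/.ssh/nectar_key")))
```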
How do I set up my new instance?
You can now start installing application software so that you can use your instance to do useful work. However, before you start, we recommend that you do the following house-keeping:
- Use “yum” or “apt-get” to apply the latest security updates.
- Use “passwd” to set a password on your instance’s root account. This will allow you to log in via the instance’s virtual console if its SSH or network configurations are damaged.
Is boot from volume a good idea?
Booting from a volume allows you to get around the problem that the primary file system size is limited. (Prior to the introduction of the M2 flavours, this was a problem for applications with a large installation footprint.)
However, there are some down-sides to booting from a volume.
- Instances booted from a volume can be problematic when you launch or terminate (due to OpenStack bugs).
- You cannot “nova rescue” an instance that has been booted from a volume. The rescue mechanism requires an image.
We recommend that “boot from volume” be avoided.
How long can I keep an Instance running?
In order for other users to take advantage of the resources, we encourage prompt termination of any instance not actively in use. However, there is no enforced time limit to running an Instance.
What should I do when I am finished?
When you are finished with an Instance, you should terminate it. Leaving an instance running, or in “paused”, “suspended” or “shutdown” states is tying down resources that other people could be using.
Note that terminating an instance destroys its primary and ephemeral file systems. If you want to save the primary file system (so that you can launch a new instance), it is advisable to take a snapshot of the instance before you terminate it.
How is Nectar Cloud usage accounted?
Nectar Instance usage accounting records the following:
- The number of VCPU-hours used. This is the number of VCPUs used multiplied by the time that the Instance is live, integrated over all Instances launched in a project.
- The number of GB-hours used. This is the number of GB of memory used multiplied by the time that the Instance is live, integrated over all Instances launched in a project.
In both cases, the lifetime starts when the Instance is launched and ends when it is terminated. An Instance that is in shutdown, paused or suspended states still counts as “live”.
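The two measures can be illustrated with a short Python sketch that sums them over the instances in a project. The class and function names, and the example figures, are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class InstanceUsage:
    vcpus: int
    memory_gb: float
    live_hours: float  # launch to termination, including shutdown/paused time


def project_usage(instances):
    """Return (VCPU-hours, GB-hours) summed over all instances in a project."""
    vcpu_hours = sum(i.vcpus * i.live_hours for i in instances)
    gb_hours = sum(i.memory_gb * i.live_hours for i in instances)
    return vcpu_hours, gb_hours


# Hypothetical project with two instances.
usage = project_usage([
    InstanceUsage(vcpus=2, memory_gb=8, live_hours=100),
    InstanceUsage(vcpus=1, memory_gb=4, live_hours=50),
])
print(usage)  # (250, 1000)
```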
How do I use Volume Storage?
There are 5 steps involved in using a Volume in a Project to provide file storage for an Instance in the project:
- Create the Volume
- Attach the Volume to an Instance
- Format the Volume
- Mount the Volume
- Set up the Volume to mount on reboot.
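The last three steps are carried out inside the instance. As a rough guide, the Python sketch below prints the kind of shell commands involved. The device name (“/dev/vdb”), mount point (“/mnt/data”) and file system type (“ext4”) are assumptions; check which device your Volume was actually attached as before formatting, because formatting destroys any data already on the device.

```python
def volume_setup_commands(device="/dev/vdb", mount_point="/mnt/data", fs_type="ext4"):
    """Return shell commands for the format, mount and mount-on-reboot steps."""
    fstab_line = f"{device} {mount_point} {fs_type} defaults,nofail 0 2"
    return [
        f"sudo mkfs -t {fs_type} {device}",               # format the Volume
        f"sudo mkdir -p {mount_point}",                   # create the mount point
        f"sudo mount {device} {mount_point}",             # mount the Volume
        f"echo '{fstab_line}' | sudo tee -a /etc/fstab",  # mount again on reboot
    ]


for command in volume_setup_commands():
    print(command)
```

The “nofail” option in the fstab entry is a design choice in this sketch: it lets the instance finish booting even if the Volume is not attached.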
Note that Nectar Volume Storage has the following technical limitations:
- Volume storage quotas are (now) allocated within individual AZs, not federation-wide.
- A Volume cannot be attached to an Instance in a different AZ.
- A Volume cannot be shared with another Project.
- A Volume cannot be attached to two (or more) running Instances simultaneously.
How do I create a Volume?
Using the Nectar Cloud dashboard:
- Select the “Volumes” tab in the “Volumes” panel.
- Click “Create Volume”
- Fill in the following fields:
  - A volume name
  - A volume description
  - Select “no source, empty volume” as the source
  - Set the requested volume size in Gigabytes.
  - Select the Availability Zone*
- Click “Create Volume”.
* Effective March 1st 2015.
Do not create new Volumes in QRIScloud until further notice. QRIScloud volume storage is overcommitted, and creation / use of new volumes increases the risk of SERIOUS operational problems.
What is a Snapshot?
A snapshot is (roughly speaking) a disk-block level copy of a file system.
An Instance snapshot is a copy of the primary file system of an Instance. Instance snapshots are Images (see above) and are held in the OpenStack image service (Glance). You can see them initially in the “Project Images” tab in the “Images” panel.
A Volume snapshot is a copy of a Volume. Volume snapshots are held in the OpenStack volume service (Cinder). You can see them in the “Volume Snapshots” tab in the “Volumes” panel.
Snapshots can be created via the Nectar Cloud dashboard:
- Creating an Instance snapshot is an “action” you can perform via the Instances panel.
- Creating a Volume snapshot is an “action” you can perform via the Volumes panel.
Under normal circumstances, creating an Instance snapshot should only take a few minutes. If a snapshot gets “stuck”:
- Delete it and try again.
- Try shutting down your Instance first.
- If neither of those workarounds works, raise a support request.
There is a Virtual Wranglers article on Troubleshooting Instance Snapshots.
Do I need to shut down before creating a Snapshot?
We strongly recommend that you shut down your instances before snapshotting, as:
- Instance snapshots on Nectar don’t include ephemeral file systems, or attached volumes.
- You cannot “resume” a Nectar Instance snapshot. Instead you launch a new Instance, and that new Instance won’t be able to use any saved memory state.
You will get the most reliable snapshots if you take them while the system is “quiesced”, and shutting down the Instance is the only reliable way to do that on Nectar OpenStack. If you take a snapshot of an active system, you can run into the following problems:
- If you catch the system at the wrong instant, file system updates may be in the journal, but not the main file system. This will require a manual file system check (“fsck”) to recover.
- If an application (or database) is active, it could have been in the middle of updating an important file. Depending on how resilient the application is to unplanned interruptions, this can lead to data loss.
Taking snapshots of running systems is also more likely to run into operational issues.
What about Security?
Security of Nectar Cloud Instances is the responsibility of the user. However, if we see evidence that an Instance’s security has been compromised, we will take steps to shut down and isolate it so that it doesn’t do any further damage to the infrastructure or to other users’ assets.
When should I apply system patches?
If you are applying patches by hand, we recommend that you do it at least once a week.
Alternatively, it is easy to configure a recent Linux system to apply patches automatically:
The “yum” instructions also discuss the pros and cons of applying patches automatically, and some alternatives.
Finally, we recommend that you apply all patches, not just security patches.
If system patches are discontinued, you should update to a more recent version of the operating system. You should treat this as a matter of urgency.
How do I configure the firewall?
The easy way to configure your Instance’s firewalling is to use Security Groups (see above). You can also configure additional firewalling inside your virtual machine. (You might do this if you were worried about the external firewalling not working.)
What happens if my instance gets hacked?
The standard Nectar-wide procedure for dealing with a hacked Instance is:
- Immediate suspension of the Instance, locking it to prevent restarting.
- User notification that the instance has been compromised, and suspended.
- Provision of a link for downloading the primary image and (if requested) the ephemeral disk.
- Seven days later, termination of the Instance.
Under no circumstances will we remove the lock to allow you to restart the Instance. Once an Instance has been compromised, there is no way to guarantee that the hackers have not installed hidden methods to get back into the system. The only safe option is to build a new Instance from a clean image, and make sure that the security holes that allowed the compromise have been closed.
We generally don’t have the resources to help you definitively determine how your Instance was hacked.
What about backup?
Backup of your Nectar instances and data stored on Nectar storage is entirely your responsibility. We advise that you implement regular automated backups, monitor that they are working, store backups in a safe place, and implement and test your own disaster recovery procedures.
We regularly hear about users who have neglected to implement backups, or implemented a scheme that is manifestly inadequate. Typically, the first we hear is when users are already in trouble. Sometimes we can help. Often we can’t, and the user’s data is lost for ever.
How do I implement backups?
There is a Virtual Wranglers page on the topic of implementing backups for Nectar virtual machines. There is no “one size fits all” recommendation.
The best way to check that your backup is adequate is to test it:
- Pretend that you have lost some files and try to recover them from the backups.
- Pretend that you have lost an entire instance, and try to reconstruct a fresh Instance from the backups.
- Pretend that you have lost your on-site backup as well, and try to restore from your off-site copy (if you have one).
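One way to make such a test less subjective is to compare checksums of the restored files against the originals. The Python sketch below does that for two directory trees; the function names are hypothetical, and it assumes the originals are still available for comparison and small enough to read into memory.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digests[str(path.relative_to(root))] = hashlib.sha256(
                path.read_bytes()).hexdigest()
    return digests


def restore_problems(original, restored):
    """List files that are missing from, or differ in, the restored tree."""
    want, got = file_digests(original), file_digests(restored)
    missing = [f"missing: {name}" for name in want if name not in got]
    changed = [f"differs: {name}" for name in want
               if name in got and got[name] != want[name]]
    return missing + changed
```

For example, `restore_problems(Path("/home/ubuntu/data"), Path("/mnt/restore-test/data"))` (both paths are placeholders) returns an empty list when every original file was restored intact.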
Are Snapshots a good way to implement backups?
Snapshots are a simple way to record a previous state of a system. However, they are a cumbersome and inefficient way to implement backups.
- Each snapshot is a copy of everything, and that takes a lot of storage. By contrast, a typical backup system records a complete copy of the file system state, and a sequence of incremental snapshots that represent only the files that have changed.
- Instance and Volume snapshots are kept on disc. In the case of Volume snapshots, they will be on the same storage cluster as the Volume that you are backing up.
- Finding and selectively retrieving old files from snapshots is labour intensive, especially in the case of Instance snapshots.