VERSION: v2.0
The GCLI is a Python library for interacting with the Lancium compute backend to create and control virtual machine and container instances.
A Lancium account on the Lancium portal is required to use the GCLI. The same credentials that you set on the portal are used to authenticate with the GCLI.
Download the installer script. This script fetches and installs the latest version of the GCLI and its dependencies.
After the download completes, run:
bash gcli_express_installer.sh
The installer will present action items about adjusting your search path to include the directories it creates.
You can verify that you have installed everything correctly by running the following commands:
gcli version && grid version
Which will return the following:
GCLI Version: 2.0
Genesis II version 2.7.647 Build 9922
To uninstall the GCLI, navigate to the install directory and run:
./uninstall.sh
This will delete the source directory.
To uninstall GenesisII as well, navigate to $HOME/GenesisII/ and run the following:
./uninstall.sh -c
Note: This will still leave some user data around. To fully destroy all GenesisII files, run the following:
rm -r GenesisII .GenesisII
Depending on which grid commands you used, you may also need to run:
rm -r .genii_ui_persistence
rm -r .genesisII-2.0
Before using the GCLI, the grid clientserver must be running on port 8888. To start the clientserver, run the following command:
gcli clientserver start
When the GCLI sends a request to the clientserver, the clientserver checks authentication and communicates over TLS with Lancium services to carry out the request (subject to access control). As long as the clientserver is up, the GCLI will be able to communicate with the Lancium Compute Infrastructure. If it goes down, no communication is possible; the clientserver must be restarted and you must re-authenticate. clientserver takes a port to use as an argument; by default the GCLI uses port 8888.
Before using the GCLI, you must authenticate.
If you haven’t already, you must create a Lancium account on the Lancium portal before you can proceed.
After you have created your account, you can authenticate with the GCLI using the same credentials as the portal by running:
gcli authenticate <PORTAL USERNAME/EMAIL> <PORTAL PASSWORD>
You can verify that you have logged in by running:
gcli whoami
If the clientserver closes, you must re-authenticate.
The GCLI package comes with a few important files. Besides changelog.txt, you must keep all of these files together in the same directory.
CPUs refers to the number of virtual cores (vCPUs), not physical cores. You can use the --cpus flag to change the number of virtual cores available for your job.
Memory refers to the amount of system memory that will be made available to you inside your container or VM. The value can be passed in as either megabytes or gigabytes by using the MB or GB suffixes (for example, --memory 4GB or --memory 4096MB). You can use the --memory flag to change this value.
GPU refers to a full graphics card, not a single GPU die on the board. You can use the --gpu flag to change this value. Currently, we only offer Nvidia K40s and Nvidia K80s; to request those cards, indicate --gpu k40 or --gpu k80 respectively.
GPU Count refers to the number of graphics cards that will be made available to you inside your container or VM. You can use the --gpuCount flag to change the number of graphics cards for your job. We currently allow a maximum of 4 GPUs per job.
GPU Memory refers to the amount (in GB) of GPU memory that will be made available to you inside your container or VM. You can use the --gpuMemory flag to change this value.
Using the --scratch flag, you can request that additional space be added onto the base image. For example, the provided Lancium base image has a virtual disk size of 5GB. To get an additional 5GB of usable scratch space on the image, you can provide the --scratch 5 flag when creating the virtual machine.
A shape is a predefined job resource configuration saved in resources/shapes.yaml. You are encouraged to modify the properties of the shapes.yaml file to best suit your needs; you can both modify existing shapes and create your own. Shapes do not need to have all fields present: any missing field will assume a default value, shown in the table below.
Field | Default |
---|---|
CPUs | 1 core |
Memory | 4096MB |
GPU | None |
GPU Count | 1 GPU |
GPU Memory | 12 GB |
Scratch space | 0 GB |
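As a sketch, a shape entry might look like the following. The key names here are illustrative assumptions, not verified; consult the resources/shapes.yaml file shipped with your install for the actual field names.

```yaml
# Hypothetical shapes.yaml entry -- key names are assumptions, check shapes.yaml.
my_gpu_shape:
  cpus: 4
  memory: 16GB
  gpu: k80
  gpuCount: 2
  # gpuMemory omitted: falls back to the 12 GB default from the table above
  # scratch omitted:   falls back to the 0 GB default
```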
clientserver
will start, stop, or restart a GenesisII clientserver on a port given via the -p flag. The clientserver must be running in order for the GCLI to work.
Note: If you decide to use a port number other than 8888, you must set the environment variable LANCIUM_CLIENT_SERVER to localhost:<PORT_NUMBER>.
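For example, to run the clientserver on a non-default port (9000 here is an arbitrary choice), pair the -p flag with the environment variable:

```shell
PORT=9000
export LANCIUM_CLIENT_SERVER="localhost:${PORT}"
echo "$LANCIUM_CLIENT_SERVER"
# With the variable set, start the clientserver on the same port
# (requires the GCLI to be installed):
#   gcli clientserver start -p "$PORT"
```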
usage: gcli.py clientserver [-h] [-p PORT] action
positional arguments:
action 'start', 'restart', 'stop', or 'status'.
optional arguments:
-h, --help show this help message and exit
-p PORT, --port PORT Port to run your clientserver. (Default: 8888)
startVMs
configures and submits VM jobs to the Lancium Compute backend.
usage: gcli.py startVMs [-h] [-s SHAPE] [-c CPUs] [-m MEMORY] [--gpu GPU] [--gpuCount GPUCOUNT]
[--gpuMemory GPUMEMORY] [--time TIME] [--startIndex STARTINDEX]
[--scratch SCRATCH | --tmp TMP]
imageName jobName count
positional arguments:
imageName Image of VM(s) to start
jobName Name of job(s)
count number of VM(s) to start
optional arguments:
-h, --help show this help message and exit
-s SHAPE, --shape SHAPE
xsmall, small, medium, or large VM configuration presets
-c CPUs, --cpus CPUs
number of vcpus for VM(s). (Default: 1) Will override shape preset
-m MEMORY, --memory MEMORY
memory for VM(s) in either GB or MB. You must provide a suffix. (Default: 8 GB) Will override shape preset
--gpu GPU GPU type: K40, K80, or GTX 1070. (Default: None) Will override shape preset
--gpuCount GPUCOUNT Number of GPUs to include. (Default: 0) Will override shape preset
--gpuMemory GPUMEMORY
Amount of memory per GPU in GB. (Default: 12) Will override shape preset
--time TIME Amount of wallclock time in minutes to run your job. (Default: Run until completion
or until terminated by queuing system.)
--startIndex STARTINDEX
Starting suffix when count > 1. (Default: 1)
--scratch SCRATCH Amount of additional scratch space added to base image in GB. (Default: 0 GB)
--tmp TMP DEPRECATED: Use --scratch. Amount of additional scratch space added to base image
in GB. (Default: 0 GB)
startContainers
configures and submits Singularity container jobs to the Lancium Compute backend.
usage: gcli.py startContainers [-h] [-s SHAPE] [-c CPUs] [-m MEMORY] [--gpu GPU] [--gpuCount GPUCOUNT]
[--gpuMemory GPUMEMORY] [--startIndex STARTINDEX] [--scratch SCRATCH | --tmp TMP]
[--time TIME] [-i INPUT INPUT] [-I GRIDINPUT GRIDINPUT] [--storage STORAGE STORAGE]
[-a ARCHIVE ARCHIVE] [-o OUTPUT OUTPUT]
imageName jobName count command
positional arguments:
imageName Image of VM(s) to start
jobName Name of job(s)
count number of VM(s) to start
command Quoted string indicating what to run inside the singularity container.
optional arguments:
-h, --help show this help message and exit
-s SHAPE, --shape SHAPE
xsmall, small, medium, or large VM configuration presets
-c CPUs, --cpus CPUs
number of vcpus for VM(s). (Default: 1) Will override shape preset
-m MEMORY, --memory MEMORY
memory for VM(s) in either GB or MB. You must provide a suffix. (Default: 8 GB) Will override shape preset
--gpu GPU GPU type: K40, K80, or GTX 1070. (Default: None) Will override shape preset
--gpuCount GPUCOUNT Number of GPUs to include. (Default: 0) Will override shape preset
--gpuMemory GPUMEMORY
Amount of memory per GPU in GB. (Default: 12) Will override shape preset
--startIndex STARTINDEX
Starting suffix when count > 1. (Default: 1)
--scratch SCRATCH Amount of additional scratch space added to base image in GB. (Default: 0 GB)
--tmp TMP DEPRECATED: Use --scratch. Amount of additional scratch space added to base image
in GB. (Default: 0 GB)
--time TIME Amount of wallclock time in minutes to run your job. (Default: Run until completion
or until terminated by queuing system.)
-i INPUT INPUT, --input INPUT INPUT
Copy a file or directory from your local file system into the job's working
directory. Takes 2 arguments, the first is local path (relative or full) to
file/directory, the second is the name you want in the job working directory.
-I GRIDINPUT GRIDINPUT, --gridinput GRIDINPUT GRIDINPUT
Copy a file or directory from the grid into the job's working directory. Takes 2
arguments, the first is full grid path to file/directory, the second is the name
you want in the job working directory.
--storage STORAGE STORAGE
Copy a file or directory from the storage space into the job's working directory.
Takes 2 arguments, the first is local path (relative or full) to file/directory,
the second is the name you want in the job working directory.
-a ARCHIVE ARCHIVE, --archive ARCHIVE ARCHIVE
Copy an archive file (.tar, .zip, .gz) and extract into job working directory.
Takes two arguments, the first is the local path (relative or full) to archive
file, and the second is the name you want in the job working directory.
-o OUTPUT OUTPUT, --output OUTPUT OUTPUT
Output a file or directory back to your local file system. Takes two arguments, the
first is the path in the local file system to writeback, the second is the filename
of the output file or directory in the job working directory.
complete
stages back data that was indicated in startContainers (only applies to Container jobs, not VMs) and deletes the job record.
usage: gcli.py complete [-h] tickets
positional arguments:
tickets Comma-separated list of tickets, or 'all' if you want to complete all jobs
optional arguments:
-h, --help show this help message and exit
listVMs
outputs information about your Lancium job records.
usage: gcli.py listVMs [-h] [-a] [-d]
optional arguments:
-h, --help show this help message and exit
-a, --all Prints out all VMs, including those in FINISHED state
-d, --detail Prints out full information about the VMs listed
vmStatus
outputs state and other information about your Lancium jobs indicated by job name.
usage: gcli.py vmStatus [-h] jobList
positional arguments:
jobList Comma-separated list of job names
optional arguments:
-h, --help show this help message and exit
vmTerminate
immediately halts all job execution. No data will be staged out after termination via vmTerminate.
usage: gcli.py vmTerminate [-h] ticketList
positional arguments:
ticketList Comma-separated list of tickets
optional arguments:
-h, --help show this help message and exit
uploadImage
is the tool used for uploading your images to the Lancium Compute backend. Note: only .simg, .sif, and .qcow2 files are accepted.
usage: gcli.py uploadImage [-h] imageName localPath
positional arguments:
imageName What to name the image in the grid
localPath Local path to .qcow2 or .simg/.sif file
optional arguments:
-h, --help show this help message and exit
listImages
provides information about images hosted on the Lancium Compute backend, including your images and Lancium’s images.
usage: gcli.py listImages [-h] [--vm] [--singularity] [--size] [--lancium]
optional arguments:
-h, --help show this help message and exit
--vm Only list VM images
--singularity Only list singularity images
--size Include size (in bytes) of each image listed
--lancium List Lancium's images
rmImage
is the tool used for deleting your images from Lancium Compute backend.
usage: gcli.py rmImage [-h] imageName
positional arguments:
imageName Name of image to be removed
optional arguments:
-h, --help show this help message and exit
uploadToStorage
is the tool used for uploading your data into persistent Lancium storage.
usage: gcli.py uploadToStorage [-h] [-f] localPath gridPath
positional arguments:
localPath Local path to file or directory to upload
gridPath Path in grid relative to STORAGE directory
optional arguments:
-h, --help show this help message and exit
-f, --force Upload a file even if it overwrites an existing file.
listStorage
provides information about your data uploaded to Lancium’s persistent storage backend.
usage: gcli.py listStorage [-h] [-p PATH]
optional arguments:
-h, --help show this help message and exit
-p PATH, --path PATH Path in grid relative to STORAGE directory
rmStorage
is the tool used for removing your data from persistent Lancium storage.
usage: gcli.py rmStorage [-h] path
positional arguments:
path Path in grid relative to STORAGE directory
optional arguments:
-h, --help show this help message and exit
authenticate
is the tool used for authenticating with the GCLI.
usage: gcli.py authenticate [-h] username password
positional arguments:
username Lancium account username
password Lancium account password
optional arguments:
-h, --help show this help message and exit
whoami
provides information about the credentials with which you’re logged in. It is useful for verifying successful authentication.
usage: gcli.py whoami
logout
will log you out of the GCLI. You will be unable to use the GCLI until you have reauthenticated.
usage: gcli.py logout
version
will provide information regarding the version of the GCLI that is installed.
usage: gcli.py version
To create a virtual machine, you must use or modify a provided base image. Currently we provide an Ubuntu Server 18.04 LTS base image. This image is modified to have a user account lancium with password fcz83#FZ%rsQcVNe and is configured to interact with our backend. You can modify this base image as required with local tools such as virt-install or Virtual Machine Manager. Once you are done modifying the base image, upload it to be used through the CLI described below.
WARNING: Using another base image will lead to issues such as failure to report an IP address, partitions not being resized, and other issues.
Before using the GCLI, you must have a clientserver running as described above. Then, you must authenticate with the Lancium backend using the authenticate command before using most of the other commands.
gcli authenticate tester@lancium.com testPass
You will remain authenticated as long as the clientserver remains running. To check who you're currently authenticated as, run gcli whoami, which will output something similar to:
Client Tool Identity:
(CONNECTION) "Client Cert F9535C75-2D19-751A-F28E-720B8FB4CF87"
Additional Credentials:
(USER) "tester@lancium.com" -> (CONNECTION) "Client Cert F9535C75-2D19-751A-F28E-720B8FB4CF87"
(GROUP) "Lancium-users" -> (CONNECTION) "Client Cert F9535C75-2D19-751A-F28E-720B8FB4CF87"
Now, we want to upload the custom image that you’ve built.
Before running a VM or Singularity job, an image is required, and it must exist in the grid. The image you provide acts primarily as a virtual environment for your running job. You are encouraged to build your own image(s) made specifically for the kinds of jobs you would like to run. In addition, we have our own collection of prebuilt images that any Lancium user is free to use.
In the following sections, we discuss how to manage your images in the grid.
To upload an image to the grid, use the uploadImage command. uploadImage takes two arguments: the path to your image in your local filesystem, and the name you would like it to have in the grid. All uploaded images must use one of the extensions .simg, .sif, or .qcow2, and the extension of the local file must match the extension of the grid name. If the extensions do not match, or you attempt to upload a file with a disallowed extension, the upload will fail.
Example:
gcli uploadImage amber18-cuda9.2.simg my_amber.simg
Will output:
Exit Code: 0
Note: uploadImage will initiate the image upload. A successful exit does not mean that image has finished uploading, only that it has started.
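The extension rule above can be checked locally before uploading. The following is a small hypothetical helper, not part of the GCLI, that mirrors the rule:

```shell
# Hypothetical pre-flight check mirroring the uploadImage extension rules:
# both names must end in .simg, .sif, or .qcow2, and the extensions must match.
check_ext() {
  local_ext="${1##*.}"
  grid_ext="${2##*.}"
  case "$local_ext" in
    simg|sif|qcow2) ;;
    *) echo "disallowed extension: .$local_ext"; return 1 ;;
  esac
  if [ "$local_ext" = "$grid_ext" ]; then
    echo "ok"
  else
    echo "extensions must match"
    return 1
  fi
}

check_ext amber18-cuda9.2.simg my_amber.simg   # prints: ok
check_ext test.qcow2 test.simg || true         # prints: extensions must match
```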
To see what images already exist in the grid, you can use gcli listImages.
Example:
gcli listImages
Will output:
L_amber18-cuda9.2.simg
amber18-cuda9.2.simg
test.qcow2
test.simg
You can choose to show only VM images or only Singularity images by using the --vm and --singularity flags respectively.
Example:
gcli listImages --singularity
Will output:
L_amber18-cuda9.2.simg
amber18-cuda9.2.simg
test.simg
Additionally, you can see the size of each image in bytes with the --size flag.
gcli listImages --size
Will output:
L_amber18-cuda9.2.simg 5073121280
amber18-cuda9.2.simg 5105672223
test.qcow2 1902641152
test.simg 3282227200
Note: You should verify that these file sizes match the local copy before attempting to boot the image. You can check the size of your image in your local filesystem by running:
du -b your_image.qcow2
You can list Lancium's images using the --lancium flag.
gcli listImages --lancium
Will output:
L_amber18-cuda9.2.simg
QuantumEspresso.simg
gromacs.simg
lancium-gpu-18.04.qcow2
lancium-ubuntu-18.04.qcow2
py2_caffe.simg
py2_pytorch.simg
py2_tensorflow.simg
py2_theano.simg
py3_caffe.simg
py3_pytorch.simg
py3_tensorflow.simg
ubuntu.simg
ubuntu18.04.simg
ubuntu18.04_cuda9.2.simg
To delete an image you can run gcli rmImage. rmImage takes only one argument, the name of the image in the grid you wish to delete.
WARNING: Image deletion is irreversible.
Example:
gcli rmImage your_image.qcow2
Will output:
Exit Code: 0
Once the image you want to use has been fully uploaded, you can start it.
gcli startVMs image_name.qcow2 VM_NAME 1 -c 2 -m 8092MB
This will start the image with hostname VM_NAME, 2 vCPUs, and 8092MB of memory. Note that we get the VM ticket as output of the startVMs command. This ticket is used to control the VM instance. To check on the status of this image, we can run gcli listVMs -d (-d asks for detailed information on the VMs). This may output:
{
"VM_NAME": {
"ticket": "3A89D01C-29CB-53C7-4EF9-701E3CD82DFE",
"time": "15:31 EDT 12 May 2020",
"tries": "0",
"state": "Booting",
"ipaddr": "Booting"
}
}
If we wait for the VM to fully boot, we will get something like:
{
"VM_NAME": {
"ticket": "3A89D01C-29CB-53C7-4EF9-701E3CD82DFE",
"time": "15:31 EDT 12 May 2020",
"tries": "0",
"state": "Running",
"ipaddr": "10.3.250.125"
}
}
From this output, we get an IP address that we can SSH to from other VMs. Currently, this IP address is non-routable and can only be used for intra-Lancium communication. Once you are done using the VM, you can shut it down from inside and the instance will be cleaned up. Otherwise, you can terminate the VM with its ticket:
gcli vmTerminate 3A89D01C-29CB-53C7-4EF9-701E3CD82DFE
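Separately, if you want the IP address programmatically (for example, to feed an SSH command on another VM), you can parse the listVMs -d output. A minimal sketch, parsing a saved copy of the JSON shown above rather than calling gcli directly:

```shell
# Saved output from `gcli listVMs -d` (trimmed to the fields we need).
status='{
  "VM_NAME": {
    "state": "Running",
    "ipaddr": "10.3.250.125"
  }
}'
# Pull out the ipaddr value with sed; assumes the JSON layout shown above.
ip=$(printf '%s\n' "$status" | sed -n 's/.*"ipaddr": *"\([^"]*\)".*/\1/p')
echo "$ip"
```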
Some other helpful examples:
gcli vmStatus VM_NAME
This will print the same output as listVMs -d, except only for the named VM.
gcli rmImage image_name.qcow2
This will delete the named VM image on the Lancium backend.
gcli logout
This will log out all authenticated users from the clientserver.
gcli version
This will print out the current version of the GCLI package.
To start a Singularity job, the --singularity flag must be set. This flag takes a quoted command-line string that will be run inside the container. Example:
gcli startVMs image.simg new_job 1 -i /home/user/input.txt input.txt --singularity "cat input.txt"
Certain characters are not allowed to appear in the --singularity argument: the double quote ("), the single quote ('), and the semicolon (;).
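Since those characters are not allowed, you can screen the command string locally before submitting. A hypothetical helper, not part of the GCLI:

```shell
# Reject command strings containing the characters the --singularity
# argument disallows: double quote, single quote, and semicolon.
safe_cmd() {
  case "$1" in
    *[\"\'\;]*) echo "disallowed character in command"; return 1 ;;
    *) echo "ok" ;;
  esac
}

safe_cmd 'cat input.txt'                   # prints: ok
safe_cmd 'echo hi; cat input.txt' || true  # prints: disallowed character in command
```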
WARNING: Uploading large files as input to your job or into storage can take a long time, often giving the impression that the GCLI is hanging. Unless you get a message indicating an error has occurred, your upload is proceeding normally and should not be interrupted.
There are multiple ways to get files and directories into your job working directory: input from your local file system, archives, and from storage.
To input files from your local file system, use the -i or --input flag when running startVMs.
The -i flag takes two inputs: the relative or absolute path to the file or directory, and the name in the job working directory.
Example:
gcli startVMs image.qcow2 new_job 1 -i /home/user/input.txt input.txt
gcli startVMs ... -i ../relative/data_sets data
To input archive files from your local file system, use the -a or --archive flag when running startVMs.
The -a flag takes two inputs: the relative or absolute path to the archive file, and the name in the job working directory. Files uploaded with this flag will be extracted in the job working directory automatically.
Example:
gcli startVMs image.simg new_job 1 -a ../path/to/archive.tar archive.tar --singularity "ls"
To input files from your grid storage, use the --storage flag when running startVMs.
The --storage flag takes two inputs: the path to the file or directory in the grid, relative to the base directory /home/CCC/Lancium/<USERNAME>/STORAGE/, and the name in the job working directory.
Example:
gcli startVMs ... --storage path/in/grid/input.txt input.txt
But how do I get my large data sets into my storage directory? Please see the “Storage” section.
Output is specified during submission, not completion. You specify output with the -o or --output flag. The -o flag takes two arguments: the local path where data will be staged out, and the file or directory name in the job working directory (JWD) that will be staged out.
NOTE: If the local path that you’re trying to stage data out to does not exist during submission, you will only get a warning and the job will still be submitted (assuming there are no other faults).
Standard output will always be called stdout.txt; standard error will always be called stderr.txt.
If you want to stageout standard output and standard error:
gcli ... -o ../path/to/save/stdout.txt stdout.txt
gcli ... -o ../path/to/save/different_std_err_name.txt stderr.txt
Additional file and directory examples:
gcli ... -o ../path/to/save/new_file.txt file_in_JWD.txt
gcli ... -o ../path/to/save/directory directory_in_JWD
Now that output locations have been specified, you must wait for the job to complete before staging out any data. You can terminate your job early using:
gcli vmTerminate <ticket>
When your job has completed, you can stageout the data and clean up the job using complete:
gcli complete <JOB_ID>
complete will download data from the JWD according to what was specified at submission.
If the directory structure required to stage out files does not exist in your local file system, you will get an error and job completion will terminate.
For example, if you're trying to stage out stdout.txt to /path/to/stdout.txt and /path/to does not exist, stdout.txt will not be staged out.
If files you specified at submission time do not exist in the JWD when running complete, they will be skipped.
If all files are staged out or skipped, the job will be cleaned up.
Files can be uploaded to a persistent storage location in the grid to be used during job execution. To upload to storage, use the uploadToStorage command, which takes two arguments: the path to the file or directory in your local file system, and the relative path in the grid where you'd like to keep the file. If the relative path does not exist, we will create it.
For example:
gcli uploadToStorage ../local_path_to/data_set_directory data_sets/data_set_1
To use files uploaded to storage in your jobs, please see section “Input to Container Jobs”.
To run a simple ‘hello world’ Singularity job, with output staged back:
$ gcli startContainers Lancium/ubuntu18.04.simg MyJob 1 "echo hello world" -o out stdout.txt
Will output:
{
"MyJob": {
"ticket": "F52E470E-DE19-3F4A-778D-D88156EE136D"
}
}
To check the status of your job:
$ gcli vmStatus MyJob
Will output:
{
"MyJob": {
"ticket": "F52E470E-DE19-3F4A-778D-D88156EE136D",
"time": "14:00 EDT 15 Jun 2020",
"tries": "0",
"state": "Running",
"ipaddr": "None, IPs only assigned for VM jobs"
}
}
After it has terminated, to get your output back:
$ gcli complete F52E470E-DE19-3F4A-778D-D88156EE136D
Will output:
INFO: Copied output file/directory: stdout.txt to location: /home/charlie/amber_test/out.
Exit Code: 0
To verify your output file contains what you expect:
$ cat out
hello world
This likely occurred due to a bad state that can arise during installation of the GCLI, especially if a version of GenesisII existed before the GCLI was installed.
To fix this issue, you will need to log out of the GCLI, delete all the user data files that GenesisII stores, and then reinstall the GCLI.
Log out of the GCLI:
$ gcli logout
Delete all user data stored by GenesisII:
$ bash ~/GenesisII/uninstall && rm -r ~/GenesisII ~/.GenesisII/ ~/.genii-ui-persistence
Note: .genii-ui-persistence may not exist if you never used the GenesisII UI.
Download the GCLI installer if you don't have it already.
Run the installer:
$ bash gcli_express_installer.sh
Please email support@lancium.com detailing the problem and we will work with you to resolve it.