Community Shell
From TeraGrid Wiki
Background
The purpose of this document is to show how to install and configure the Community Shell for Science Gateways using Community User Accounts and Community Software Areas. The intended audience consists of both system administrators and community software developers. Instructions specific to each type of user are given in separate sections below. But first, a little explanation as to the need for the Community Shell.
A growing trend in supercomputing is the use of community gateways that provide supercomputing resources to a wide audience. These gateways maintain a user database so that each user can log in to an individual account at the gateway. However, the community of users may share a single community account credential which is utilized when performing computations on the supercomputing resources. Thus, while the gateway/portal may be able to track individuals, the supercomputing resources can only track the single community account user. Since all gateway users (theoretically) have access to the community account credential, a gateway user could conceivably execute unauthorized code on a supercomputing resource and do so with a high degree of anonymity, which may result in all the users of that gateway losing access to the resource. The goal of the Community Shell project is to mitigate this potential for abuse by placing restrictions on the applications which may be executed by a community account.
The proposed solution requires a collaborative effort between the system administrators of the supercomputing resources and the community developers of the gateway. In this scenario, a community developer is responsible for requesting resources used by the gateway, in particular a Community Software Area and a Community Account. The community developer also decides the binary applications that will be executed on the supercomputing resources. These binaries are placed in a Community Software Area directory so as to be freely accessible by the community developers (e.g. to install a newer software version). The Community Account user's shell is set to commsh and configured to limit what can be executed, specifically only the binaries in the Community Software Area directory. This simple configuration scheme minimizes what can be done with the Community Account user's credential and also eases configuration for both system administrators and community developers.
In a more advanced/secure configuration, the binary files in the Community Software Area directory are not called directly by the gateway. Rather, a set of static (i.e. unchanging) scripts is written by the community developer. These scripts are placed in a protected location on the supercomputing resources with the restriction that they are run with security provided by the commsh Community Shell (i.e. commsh restricts the community account to run only the approved scripts). The scripts then call the binaries which are placed in a directory accessible by community developers. This allows for the binaries to be updated as needed (i.e., to upgrade to new software package versions) without involvement of the system administrator, while keeping the security restrictions placed on the scripts (so that only a small set of pre-approved binaries can be executed). The system administrator is responsible for approving the scripts and installing them in an appropriate directory.
Below you will find configuration details for both the simple and advanced configuration schemes.
Abbreviations
In the discussion that follows, we will use the following abbreviations for the sake of brevity when specifying directory/file paths.
- SA = System Administrator - This user is responsible for the creation of the Community Software Area and the Community Account, as well as the installation and configuration of the Community Shell executable and associated configuration files.
- CSA = Community Software Area - This is an allocation of disk space available at any TeraGrid site for the installation of executables and libraries that will be utilized by a community of users. A request must be made for the creation of a CSA.
- CD = Community Account Developer - This user is responsible for requesting the creation of a CSA and associated Community Account for their community of gateway users. Additionally, the CD may design scripts which call binary executables that do the actual computation on the supercomputing resources. There may be more than one CD. Each CD has his or her own individual login to TeraGrid systems.
- CU = Community Account User - When a CD requests the creation of a Community Account, a new user and associated credential is created. The CU's credential is utilized by the gateway to run programs on supercomputing resources on behalf of the gateway users. So while there may be many user accounts at a science gateway, there is a single account on the supercomputing resources for the CU.
- CG = Community Account Group - When a CD requests the creation of a CSA, a Community Group is created, in the sense of a "Un*x group". The CG is named the same as the CSA. CG members consist of users who were listed as CDs when the creation of the CSA was requested. The CG allows CDs to update binary applications stored in the CSA directory.
Proposed Configuration
Directories / Files
This is the proposed layout for the various directories and files utilized by commsh, a CSA, and a CU. Details will be given in the sections that follow. Here we assume that the CSA request was made by "jzsmith", who is the primary CD. The CSA's name is "ntroport". The name of the CU is "ntrouser". While the CSA name can be different from the CU name (as shown here), we suggest that CDs try to make them the same. We have chosen them to be different in this example to illustrate the ownership of files and directories (i.e. uid and gid).
Ownership          Perm  Directory / File        Usage
-----------------  ----  ----------------------  -----
root:root          0755  /usr/local/bin/commsh   Location of commsh binary [1]
root:root          0644  /etc/commsh.conf        Configuration file for commsh [2]
root:root          0755  /etc/commsh.d           Directory for per-community configurations [3]
root:root          0755  /etc/commsh.d/ntroport  Location of config file and optional scripts for CSA [4]
ntrouser:ntroport  2770  ~ntrouser               CU's home directory [5]
jzsmith:ntroport   2775  $TG_COMMUNITY/ntroport  Location of CSA binaries and associated files [6]
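As a rough illustration of the layout above, the following sketch stages the same directories and permission modes under a scratch prefix so that it can be tried without root. The STAGE variable and the stand-in paths home/ntrouser and community/ntroport (for ~ntrouser and $TG_COMMUNITY/ntroport) are assumptions made for this example. Ownership is shown only in comments because chown requires root.

```shell
#!/bin/sh
# Stage the proposed commsh layout under a scratch prefix (no root needed).
# Assumption: STAGE, home/ntrouser and community/ntroport are stand-ins for
# /, ~ntrouser and $TG_COMMUNITY/ntroport on a real system.
STAGE=${STAGE:-$(mktemp -d)}

mkdir -p "$STAGE/usr/local/bin" "$STAGE/etc/commsh.d/ntroport" \
         "$STAGE/home/ntrouser" "$STAGE/community/ntroport"

: > "$STAGE/etc/commsh.conf"                 # [2] main configuration file
chmod 644  "$STAGE/etc/commsh.conf"
chmod 755  "$STAGE/etc/commsh.d"             # [3] per-community configurations
chmod 755  "$STAGE/etc/commsh.d/ntroport"    # [4] config and optional scripts
chmod 2770 "$STAGE/home/ntrouser"            # [5] setgid, no access for others
chmod 2775 "$STAGE/community/ntroport"       # [6] setgid, group-writable

# On a real system (as root) ownership would follow the table, e.g.:
#   chown root:root /etc/commsh.conf /etc/commsh.d /etc/commsh.d/ntroport
#   chown ntrouser:ntroport ~ntrouser
#   chown jzsmith:ntroport "$TG_COMMUNITY/ntroport"

# Show the staged modes for comparison with the table.
ls -ld "$STAGE/etc/commsh.conf" "$STAGE/etc/commsh.d" \
       "$STAGE/etc/commsh.d/ntroport"
```

Comparing the ls output with the Perm column is a quick way to catch a mistyped mode before touching the real /etc.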
Explanatory Notes
- The default configuration of commsh installs the binary into /usr/local/bin. While this is what we will use in the proposed configuration, you can change this with the "--prefix=..." option when running configure.
- When commsh is first installed, a sample configuration file is installed in /etc/commsh.conf.sample. The SA can look at this file for some example uses of commsh. Ultimately, the SA must create /etc/commsh.conf either by copying the sample configuration file and editing it, or by creating /etc/commsh.conf from scratch. If you want commsh to read the configuration file from some place other than /etc/commsh.conf, you can set the "--sysconfdir=..." option when running configure.
- This directory is the storage location for all CSA configuration directories (as shown in [4] below). In other words, this directory should contain only directories, no files.
- The configuration files for commsh specific to the ntroport CSA/CU reside here. The /etc/commsh.conf file contains references to the various CSAs' configuration files. For example, the /etc/commsh.conf file will have entries like this, one for each CSA:

      # /etc/commsh.conf
      CheckUser ntrouser
      ReadUserConfig ntrouser /etc/commsh.d/ntroport/commsh.conf

  This new configuration file /etc/commsh.d/ntroport/commsh.conf has the entries for binaries (for a simple setup) in the CSA directory or scripts (for a complex setup) in the /etc/commsh.d/ntroport directory which will be called by commsh. Here are two basic configuration files showing a simple configuration and an advanced configuration.

  The first configuration file shows a very simple setup where a CU is allowed to run any executable in the CSA bin directory ($TG_COMMUNITY/ntroport/bin), using any number of command line parameters.

      # /etc/commsh.d/ntroport/commsh.conf
      # Simple configuration - direct access to CSA binaries
      # Allow the CU to run any command in the CSA bin directory with any parameters
      DirectAccess $TG_COMMUNITY/ntroport/bin/* **

  The second example is more complex and involves a bit of indirection to deter unauthorized modification of allowable executables. In this setup, a CD has created two scripts with the intention that they not be modified. They are placed in the read-only /etc/commsh.d/ntroport directory by the SA. Within these two scripts are calls to executables stored in the CSA bin directory. (Note that the scripts are highly specific to a particular CSA and are not shown here.) The configuration file below allows a CU to call only these two scripts with two command line parameters (for input and output). The scripts call executables in the CSA bin directory with the two command line parameters. The idea here is to give a CU a limited number of commands that can be called, but still allow a CD to update the underlying executables stored in the CSA bin directory.

      # /etc/commsh.d/ntroport/commsh.conf
      # Advanced configuration - indirect access to CSA binaries via protected scripts
      # Allow the CU to run two scripts in the protected directory with two parameters
      # script1 and script2 in turn call executables in $TG_COMMUNITY/ntroport/bin
      DirectAccess /etc/commsh.d/ntroport/script1 -input * -output *
      DirectAccess /etc/commsh.d/ntroport/script2 -input * -output *
- When the request for a CU has been approved, the CU's home directory is automatically created. In this case, we deliberately chose the CSA name to be different from the CU name. You will probably want to choose them to be the same. The CU home directory will be utilized by the various binary files for input and output since the binaries will be executed with the CU's credential via GRAM. In other words, all input files should be transferred to the CU's home directory, for example via GridFTP. The CSA binaries, which are stored in $TG_COMMUNITY/ntroport, read the input files from the CU's home directory. Any output generated by the binaries will be written to the CU's home directory. This output can then be fetched using GridFTP. Note that the group permission (gid) on the CU's home directory allows for easy access by the CDs for debugging purposes.
- When the request for a CSA has been approved, the CD's CSA directory is automatically created. Note again that we deliberately chose the CSA name to be different from the CU name, but you should choose them to be the same. The directory $TG_COMMUNITY/ntroport is where the binaries (and other associated files) for the CSA will be stored. These binaries are called either directly or by the scripts located in /etc/commsh.d/ntroport/. This allows for the binaries to be updated frequently while keeping the scripts secure (since the scripts should not require frequent updating).
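The protected scripts themselves are specific to each CSA and are not shown in this document. Purely as a hypothetical sketch of what a wrapper such as script1 might look like, the following accepts only the "-input FILE -output FILE" form allowed by the advanced configuration and then calls one fixed binary in the CSA. The tool name mytool, its calling convention, and the assumption that $TG_COMMUNITY is set in the job environment are all invented for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a protected wrapper such as script1.
# Assumptions: the tool name "mytool" and its calling convention are
# invented for illustration; real scripts are specific to each CSA.

script1_main() {
    # Accept exactly: -input FILE -output FILE (the form allowed by commsh.conf)
    if [ "$#" -ne 4 ] || [ "$1" != "-input" ] || [ "$3" != "-output" ]; then
        echo "usage: script1 -input FILE -output FILE" >&2
        return 2
    fi
    input=$2
    output=$4

    # Refuse anything that is not a readable regular file.
    if [ ! -f "$input" ] || [ ! -r "$input" ]; then
        echo "script1: cannot read input file: $input" >&2
        return 1
    fi

    # Call one specific binary in the CSA. A CD can replace that binary
    # without this script (or the commsh configuration) having to change.
    "$TG_COMMUNITY/ntroport/bin/mytool" "$input" > "$output"
}

# In an installed script the last line would be:
#   script1_main "$@"
```

Because the wrapper names a single binary by its full path, extra files dropped into the CSA directory are never reached through it, which is the property the advanced configuration relies on.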
Instructions For Community Developers
- Request the creation of a Community Software Area. See the image at the right for an example of a filled-in request form.
  Note: When the CSA is created, both the $TG_COMMUNITY/CSA directory and the CG group are automatically created.
- Request the addition of a Community Account. See the image at the right for an example of a filled-in request form.
  Note: When the Community Account is created, the ~CU directory is automatically created. Also, you can get an X.509 credential for CU to be used by your gateway.
- Decide on the application binaries (and optional scripts) to be used by your gateway. This step can be tricky and requires a CD to consider not only the software needed by gateway users, but also the security restrictions desired by the SA. In a simple configuration, the commsh configuration file allows for the execution of any binaries placed in the CSA directory. In an advanced configuration, the commsh configuration file allows for the execution of only a few scripts located in a protected directory. These scripts then call specific binary applications located in the CSA directory. In either case, the binary files can be updated by a CD. The advanced configuration is more secure since the protected scripts reference only particular files in the CSA directory. So, if a malicious user placed extra files in the CSA directory, they would not be of concern since they are not referenced by the protected scripts. Of course, this advanced configuration requires advance planning by a CD and approval by a SA since the protected scripts would not be updated very often.
  - Simple Configuration: In a simple configuration setup, the CD places all binaries needed by the gateway in the CSA directory. The commsh configuration file is then written to allow any binaries in that directory to be executed with any number of command line parameters. The SA would install a commsh configuration file, like the one given here for a CSA named ntroport.

        # /etc/commsh.d/ntroport/commsh.conf - Simple Configuration
        # Allow ntrouser to run any binary in the CSA ntroport bin directory
        DirectAccess $TG_COMMUNITY/ntroport/bin/* **

    A single asterisk (*) will match any character in a single argument. In general, this means it will not match a space unless the space is enclosed in quotation marks or escaped with a backslash. Additionally, an asterisk in the command itself will not match a forward slash (/). Name your binaries accordingly. In contrast, a double asterisk (**) should only appear at the end of a command restriction specification, and indicates that any additional parameters will be accepted. Remember that binaries placed in the $TG_COMMUNITY/CSA directory can be updated by any CD.
  - Advanced Configuration: In an advanced configuration, a limited number of scripts can be executed by the CU via commsh. These scripts should be considered to be static (i.e. seldom require modification) since they will be put in a secure location accessible only by SAs. The scripts should reference binary executables which will be placed in the $TG_COMMUNITY/CSA directory by a CD. The binaries may be updated frequently. Thus it is the job of the CD to write the scripts appropriately. Since you have created the scripts, you should know the command line parameters. This is important since it is also the responsibility of the CD to write the configuration file referenced by commsh. For the syntax of the directives for the commsh.conf file, see the commsh.conf (5) man page. Your configuration file will be audited by a SA, but ultimately the onus is on you. Below is an example commsh configuration file for a specific CSA named ntroport.

        # /etc/commsh.d/ntroport/commsh.conf - Advanced configuration
        # Allow ntrouser to execute only two protected scripts, which in turn
        # call executables in ntroport's $TG_COMMUNITY/ntroport/bin directory
        DirectAccess /etc/commsh.d/ntroport/script1 -input * -output *
        DirectAccess /etc/commsh.d/ntroport/script2 -input * -output *

    Here, script1 and script2 are written by a CD and installed into a secure location by a SA. The scripts take two command line parameters, one for input and one for output. These scripts call the executables located in $TG_COMMUNITY/ntroport.
- Submit your commsh configuration file and (optional) scripts to a SA for the appropriate TG system. The SA will review your configuration file and scripts. If acceptable, they will be installed to a secure location such as /etc/commsh.d/CSA/.
- Install the binary executables (optionally referenced by your scripts) into the CSA directory $TG_COMMUNITY/CSA. Since this directory has group access permissions for CDs, you can easily update the files there. However, keep in mind that the scripts will be run with the CU's credential and thus all input/output files will be in the ~CU directory.
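The wildcard rules described for DirectAccess in the steps above can be approximated in a few lines of shell, which may help when reasoning about what a rule will or will not admit. This is only an illustration of the matching behaviour as described in this document: a single * stays within one argument and, in the command itself, within one path component, and a trailing ** accepts any further parameters (assumed here to mean zero or more). It is not the commsh implementation, and the function names rule_allows and path_matches are invented for this sketch. Paths are written already expanded, using the /usr/projects/ntroport example location.

```shell
#!/bin/sh
# Illustrative approximation of the DirectAccess matching rules.
# Not the commsh implementation; function names are hypothetical.

# path_matches WORD PATTERN: glob match in which * never crosses a slash,
# done by comparing the two paths one component at a time.
path_matches() {
    pm_word=$1
    pm_pat=$2
    while :; do
        pm_w=${pm_word%%/*}          # first component of the word
        pm_p=${pm_pat%%/*}           # first component of the pattern
        case $pm_w in
            $pm_p) ;;
            *) return 1 ;;
        esac
        case $pm_word in */*) pm_wmore=1 ;; *) pm_wmore=0 ;; esac
        case $pm_pat  in */*) pm_pmore=1 ;; *) pm_pmore=0 ;; esac
        [ "$pm_wmore" = "$pm_pmore" ] || return 1   # different depth
        [ "$pm_wmore" = 1 ] || return 0             # last component matched
        pm_word=${pm_word#*/}
        pm_pat=${pm_pat#*/}
    done
}

# rule_allows "RULE" COMMAND [ARG...]: succeed if the command line fits RULE.
rule_allows() {
    ra_rule=$1
    shift
    set -f                           # keep * in the rule from expanding to file names
    ra_first=1
    ra_status=0
    for ra_pat in $ra_rule; do
        if [ "$ra_pat" = "**" ]; then
            set +f
            return 0                 # accept any remaining parameters
        fi
        if [ "$#" -eq 0 ]; then ra_status=1; break; fi
        if [ "$ra_first" = 1 ]; then
            path_matches "$1" "$ra_pat" || { ra_status=1; break; }
            ra_first=0
        else
            case $1 in
                $ra_pat) ;;
                *) ra_status=1; break ;;
            esac
        fi
        shift
    done
    set +f
    # Leftover arguments mean the command had more parameters than the rule allows.
    [ "$ra_status" -eq 0 ] && [ "$#" -eq 0 ]
}

SIMPLE='/usr/projects/ntroport/bin/* **'
rule_allows "$SIMPLE" /usr/projects/ntroport/bin/sort in.txt && echo "sort: allowed"
rule_allows "$SIMPLE" /bin/sh -c id || echo "/bin/sh: denied"
```

Under this reading, a binary placed in a subdirectory of the CSA bin directory would not be matched by the simple rule, which is why the text advises naming binaries accordingly.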
Testing and Debugging
The restrictions of the community account require CDs to adjust their testing and debugging practices. CDs should first use their own individual TeraGrid accounts to test new applications for the gateway to run on TeraGrid systems. When the application(s) are well-tested, the CD can work with the SA to install/modify the community scripts to enable the application(s) in the CU account. CDs have access to read and write files in the CU account (via the CG's permissions) for further debugging purposes.
Instructions For System Administrators
These instructions assume that you have a functioning Globus Toolkit 4 installation as provided by the CTSS 4 Remote Compute Capability Implementation.
Once you have GT4 installed and configured, you will need to install the Community Shell (commsh) application. While this too has been documented elsewhere, detailed installation instructions are provided here so that we can configure specific directory paths.
- Download the latest version of commsh and copy it to a suitable location, i.e. somewhere you have write access. Configuring and building the commsh code can be performed as any user. However you will need root access to do the actual installation of the program. Assuming you have wget installed, you can use the following commands to get the code.

      wget http://security.ncsa.uiuc.edu/research/commaccts/downloads/commsh-latest.tar.gz
      tar xvzf commsh-latest.tar.gz
- Change into the newly extracted code directory and configure/build the commsh code. By default, the commsh binary will be installed in /usr/local/bin and the associated configuration file will be read from /etc/commsh.conf. You can change these locations by using the "--prefix=/alternate/path/" and "--sysconfdir=/alternate/config" command line parameters when running configure. Run "./configure --help" for more information.

      ./configure
      gmake
      gmake install    # Note that you MUST be root to do this
- Patch GRAM so that it will interact with commsh properly. You will need to download a patch file and apply it to the globus-job-manager-script.pl script. This will make it possible to use commsh to implement command-based restrictions on GRAM jobs. Depending on how you installed CTSS4 or Globus/GT4, you may have more than one globus-job-manager-script.pl file to patch. For example, in a basic Globus/GT4 installation, you can find this script at $GLOBUS_LOCATION/libexec/. For a CTSS4 installation, you may have two separate scripts to patch, one for a globus-wsrf install and one for a prews-gram install. You may find the scripts at $TG_APPS_PREFIX/globus-wsrf-4.0.5-r0/libexec/ and $TG_APPS_PREFIX/prews-gram-4.0.5-r1/libexec/ respectively. (Of course the exact location depends on the version numbers of the packages that were installed.) First, get the patch file. Then change to the appropriate directories and execute the patch command. The example here assumes a Globus/GT4 installation where $GLOBUS_LOCATION has been set.

      cd $GLOBUS_LOCATION/libexec
      wget http://security.ncsa.uiuc.edu/research/commaccts/downloads/globus-job-manager-script-pl.diff
      patch -p0 < globus-job-manager-script-pl.diff

  Don't panic if the patch complains about "fuzz". The patch is designed to work with multiple versions of GRAM. As long as both hunks of the patch succeed, the patch has been successfully applied. You may see output similar to the following.

      patch -p0 < globus-job-manager-script-pl.diff
      patching file globus-job-manager-script.pl
      Hunk #1 succeeded at 7 with fuzz 1 (offset 5 lines).
      Hunk #2 succeeded at 87 with fuzz 2 (offset 6 lines).

  NOTE: If you chose a different location for installing commsh by setting "--prefix=" to something other than /usr/local in the configuration step above, you MUST edit $GLOBUS_LOCATION/libexec/globus-job-manager-script.pl and change the $FILTER_COMMAND variable to point to the installed location of the commsh binary.
- Create/Edit the /etc/commsh.conf configuration file. When you installed commsh, a sample configuration file was created at /etc/commsh.conf.sample. This file gives many examples of the types of restrictions you can impose with commsh. We will need only a few of these directives. Copy the sample file to /etc/commsh.conf and edit it for your particular setup with your favorite text editor.

      cp /etc/commsh.conf.sample /etc/commsh.conf
      chown root:root /etc/commsh.conf
      chmod 644 /etc/commsh.conf
      vim /etc/commsh.conf    # or use your favorite editor

  You need to add an entry for every CSA on your system. The configuration below shows the configuration for a single CSA named ntroport to be used by a CU named ntrouser.

      # /etc/commsh.conf
      # Allow any user except root to run commands through commsh
      AllowUser *
      DenyUser root
      # Check GRAM job submissions by the ntrouser user
      CheckUser ntrouser
      # Load the external configuration file for the ntroport CSA
      ReadUserConfig ntrouser /etc/commsh.d/ntroport/commsh.conf
- Create a directory to store configurations and scripts specific to each CSA. For every CSA that requires access to your machine, you will need to create a directory under /etc/commsh.d to store (a) configuration files for commsh and (b) (optionally) scripts for that CSA which will be referenced by the configuration files. For example, if the CSA is named "ntroport", you would run the following command.

      mkdir -p -m 755 /etc/commsh.d/ntroport
- Install the commsh.conf file specific to the CSA. Within the directory you created above, you need to install a configuration file which will refer to (a) (optional) scripts contained in the same directory and/or (b) executables stored in the CSA's bin directory. This configuration file may be created by the CD, but must be verified by a SA. For the syntax of the directives for the commsh.conf file, see the commsh.conf (5) man page. Use the Science Gateways Administration page to view the information the CU requester submitted in the Community Account Request for additional information on how the account should be configured. Below we create a very simple configuration file for the ntroport CSA which allows a CU to run any executable in the $TG_COMMUNITY/ntroport/bin directory, with any number of command line parameters.

      echo 'DirectAccess $TG_COMMUNITY/ntroport/bin/* **' > /etc/commsh.d/ntroport/commsh.conf
      chown root:root /etc/commsh.d/ntroport/commsh.conf
      chmod 644 /etc/commsh.d/ntroport/commsh.conf
- Set the shell for the CU. In order to activate commsh parsing for a specific user, the CU's shell must be changed to commsh. This involves two steps.
  - Edit the /etc/passwd file and find the line for the CU ntrouser. Set the shell (typically the last entry on the line) to /usr/local/bin/commsh.
  - Edit the /etc/shells file and append /usr/local/bin/commsh to the list. Note that this second step need be done only once. You can do this with the following command.

        echo '/usr/local/bin/commsh' >> /etc/shells
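Since an entry must be added to /etc/commsh.conf for every CSA, and the /etc/shells change should happen only once, an SA may find it convenient to make both steps repeatable. The following is a minimal sketch under stated assumptions: the function names and the COMMSH_CONF, SHELLS_FILE and COMMSH_BIN variables are invented for this example, and only the CheckUser and ReadUserConfig directives come from the instructions above.

```shell
#!/bin/sh
# Sketch: register one CSA in the main commsh configuration and make sure
# commsh is listed as a valid login shell. Try it on scratch copies first.
# Assumptions: function and variable names are illustrative only.
COMMSH_CONF=${COMMSH_CONF:-/etc/commsh.conf}
SHELLS_FILE=${SHELLS_FILE:-/etc/shells}
COMMSH_BIN=${COMMSH_BIN:-/usr/local/bin/commsh}

# add_csa_entry CU CSA: append the CheckUser/ReadUserConfig pair once per CU.
add_csa_entry() {
    cu=$1
    csa=$2
    if grep -q "^CheckUser[[:space:]]\{1,\}$cu\$" "$COMMSH_CONF" 2>/dev/null; then
        return 0                     # this CU is already registered
    fi
    {
        echo "# Check GRAM job submissions by the $cu user"
        echo "CheckUser $cu"
        echo "ReadUserConfig $cu /etc/commsh.d/$csa/commsh.conf"
    } >> "$COMMSH_CONF"
}

# register_shell: append the commsh path to the shells file only if missing.
register_shell() {
    grep -qxF "$COMMSH_BIN" "$SHELLS_FILE" 2>/dev/null ||
        echo "$COMMSH_BIN" >> "$SHELLS_FILE"
}
```

For a trial run, point COMMSH_CONF and SHELLS_FILE at files in a scratch directory, call add_csa_entry ntrouser ntroport and register_shell, and inspect the results before repeating the calls as root against the real files.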
Use Case Scenario
A CD will write the code for the gateway which calls the binaries stored on the supercomputing resources. We assume that the gateway has a copy of the CU's credential, and that the CD has placed the binaries called by the gateway in the appropriate location (i.e. $TG_COMMUNITY/CSA). We also assume that commsh has been configured using the "simple configuration" where binaries are called directly. A typical scenario is as follows:
- Copy one or more local files (where "local" means the files are accessible to the gateway) to the supercomputing resource in the CU's home directory.
- Run one or more executables stored in $TG_COMMUNITY/CSA, using the files just transferred as input, writing any output to the CU's home directory.
- Copy any generated output files from the CU's home directory back to the gateway.
A simple example can be done by a CD on the command line. Here we assume that the binary sort has been placed in the CSA directory $TG_COMMUNITY/ntroport, which expands to /usr/projects/ntroport. Run the following commands from the gateway machine. Be sure to substitute the appropriate values for your supercomputing resource and associated directories.
- Be sure you run all of the following commands using the CU's credential. If the credential is stored in a MyProxy server, you can fetch it with the myproxy-get-delegation command.

      # myproxy-get-delegation -l ntrouser -s myproxy.teragrid.org
      Enter MyProxy pass phrase: <password not echoed for security purposes>
      A credential has been received for user ntrouser in /tmp/x509up_u28289.

- Copy a local text file to the CU's home directory.

      # set JOBID=12345
      # globus-url-copy -v file:///full/path/to/file/input.txt \
            gsiftp://your.server.com:2811/~/input.txt.$JOBID
      Source: file:///full/path/to/file/
      Dest:   gsiftp://your.server.com:2811/~/
        input.txt  ->  input.txt.12345

- Sort the input file and write the results to a new output file in the CU's home directory. Typically such commands would be written to a dynamically generated script file, but we show them here on the command line for the sake of simplicity.

      # globusrun-ws -F your.server.com -submit -streaming \
            -c /usr/projects/ntroport/sort \
            "--output=/home/ntrouser/output.txt.$JOBID" \
            /home/ntrouser/input.txt.$JOBID

  For testing the older GRAM2 version, use the following command.

      # globusrun -o -r your.server.com/jobmanager \
            '&(executable=/usr/projects/ntroport/sort) \
            (arguments="--output=/home/ntrouser/output.txt.12345" \
            /home/ntrouser/input.txt.12345)'

- Copy the resulting output file back to the local account.

      # globus-url-copy -v gsiftp://your.server.com:2811/~/output.txt.$JOBID \
            file:///full/path/to/file/output.txt.$JOBID
      Source: gsiftp://your.server.com:2811/~/
      Dest:   file:///full/path/to/file/
        output.txt.12345
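A gateway would normally issue these three steps from a script rather than by hand. The sketch below strings together the same commands shown above (WS-GRAM variant) and defaults to a dry-run mode that only prints them, so it can be read and tried without Globus installed. The DRY_RUN switch, the helper names run and run_sort_job, and the SERVER, CSA_DIR and CU_HOME variables are assumptions for illustration. The host, port and paths are the placeholders used in the walkthrough.

```shell
#!/bin/sh
# Sketch of the three-step gateway workflow: stage in, run, stage out.
# Assumptions: DRY_RUN, the helper names and the variable names are
# illustrative; host, port and paths are the placeholders used above.
DRY_RUN=${DRY_RUN:-1}                # 1 = only print the commands
SERVER=${SERVER:-your.server.com}
CSA_DIR=${CSA_DIR:-/usr/projects/ntroport}
CU_HOME=${CU_HOME:-/home/ntrouser}

# run CMD [ARG...]: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# run_sort_job JOBID LOCAL_DIR: each step runs only if the previous one succeeded.
run_sort_job() {
    jobid=$1
    local_dir=$2
    run globus-url-copy -v "file://$local_dir/input.txt" \
        "gsiftp://$SERVER:2811/~/input.txt.$jobid" &&
    run globusrun-ws -F "$SERVER" -submit -streaming \
        -c "$CSA_DIR/sort" "--output=$CU_HOME/output.txt.$jobid" \
        "$CU_HOME/input.txt.$jobid" &&
    run globus-url-copy -v "gsiftp://$SERVER:2811/~/output.txt.$jobid" \
        "file://$local_dir/output.txt.$jobid"
}

run_sort_job 12345 /full/path/to/file
```

Setting DRY_RUN=0 would execute the commands for real, which requires a valid CU proxy credential as described in the first step of the walkthrough.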
