cerise.back_end package¶
Subpackages¶
- cerise.back_end.test package
- Submodules
- cerise.back_end.test.conftest module
- cerise.back_end.test.mock_job module
- cerise.back_end.test.test_cwl module
- cerise.back_end.test.test_job_planner module
- cerise.back_end.test.test_job_runner module
- cerise.back_end.test.test_local_files module
- cerise.back_end.test.test_remote_api module
- cerise.back_end.test.test_remote_job_files module
- Module contents
Submodules¶
cerise.back_end.cwl module¶
-
cerise.back_end.cwl.
get_cwltool_result
(cwltool_log: str) → cerise.job_store.job_state.JobState[source]¶ Parses cwltool log output and returns a JobState object describing the outcome of the cwl execution.
Parameters: cwltool_log – The standard error output of cwltool Returns: Any of JobState.PERMANENT_FAILURE, JobState.TEMPORARY_FAILURE or JobState.SUCCESS, or JobState.SYSTEM_ERROR if the output could not be interpreted.
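The mapping from log text to job state can be sketched as follows. This is a minimal illustration, not the Cerise implementation: the `JobState` enum is a stand-in with only the four members named above, the function name `classify_cwltool_log` is hypothetical, and the marker strings are assumptions about what cwltool prints at the end of a run.

```python
from enum import Enum


class JobState(Enum):
    # Stand-in for cerise.job_store.job_state.JobState; only the four
    # outcomes named in the text are modelled here.
    SUCCESS = "Success"
    PERMANENT_FAILURE = "PermanentFailure"
    TEMPORARY_FAILURE = "TemporaryFailure"
    SYSTEM_ERROR = "SystemError"


def classify_cwltool_log(cwltool_log: str) -> JobState:
    """Map cwltool's standard error output to a job state (illustrative)."""
    # Assumed marker strings; the real function may look for other text.
    markers = [
        ("Final process status is success", JobState.SUCCESS),
        ("Final process status is permanentFail", JobState.PERMANENT_FAILURE),
        ("Final process status is temporaryFail", JobState.TEMPORARY_FAILURE),
    ]
    for marker, state in markers:
        if marker in cwltool_log:
            return state
    # Nothing recognisable in the log, so report a system error.
    return JobState.SYSTEM_ERROR
```

Falling back to a system error for unrecognised output keeps the caller from mistaking a crashed or missing runner for a workflow failure.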
-
cerise.back_end.cwl.
get_files_from_binding
(cwl_binding: Dict[str, Any]) → List[cerise.back_end.file.File][source]¶ Parses a CWL input or output binding and returns a list containing name: path pairs. Any non-File objects are omitted.
Parameters: cwl_binding – A dict structure parsed from a JSON CWL binding Returns: A list of File objects describing the input files described in the binding.
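A rough sketch of the filtering described above, assuming the binding maps input names to values and that files are dicts with `class: File` and a `location` key. `FileRef` and `files_from_binding` are hypothetical names used for illustration; the real function returns `cerise.back_end.file.File` objects and also deals with secondary files.

```python
from typing import Any, Dict, List, NamedTuple, Optional


class FileRef(NamedTuple):
    # Hypothetical, simplified stand-in for cerise.back_end.file.File.
    name: Optional[str]
    index: Optional[int]
    location: str


def is_file(value: Any) -> bool:
    """True if the value looks like a CWL File object."""
    return isinstance(value, dict) and value.get("class") == "File"


def files_from_binding(cwl_binding: Dict[str, Any]) -> List[FileRef]:
    """Collect File entries from a parsed binding, skipping everything else."""
    result = []
    for name, value in cwl_binding.items():
        if is_file(value):
            # A single file: no array index.
            result.append(FileRef(name, None, value["location"]))
        elif isinstance(value, list):
            # An array input: record the position of each file in it.
            for i, item in enumerate(value):
                if is_file(item):
                    result.append(FileRef(name, i, item["location"]))
    return result


binding = {
    "message": "hello",
    "input_file": {"class": "File", "location": "file:///tmp/a.txt"},
    "extra": [{"class": "File", "location": "file:///tmp/b.txt"}, 3],
}
files = files_from_binding(binding)
```

Here `message` and the integer in `extra` are dropped, while the two File entries are kept with their input name and, for the array, their index.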
-
cerise.back_end.cwl.
get_required_num_cores
(cwl_content: bytes) → int[source]¶ Takes a CWL file contents and extracts number of cores required.
Parameters: cwl_content – The contents of a CWL file. Returns: The number of cores required, or 0 if not specified.
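As an illustration of what such an extraction might look like, the sketch below reads `coresMin` from a CWL `ResourceRequirement`. It makes several assumptions: the CWL content is JSON (the real function may accept YAML), both `requirements` and `hints` are searched, and only literal integers are accepted. The function names are hypothetical.

```python
import json
from typing import Any, Dict, Iterator


def iter_requirements(section: Any) -> Iterator[Dict[str, Any]]:
    """Yield requirement dicts from either the list or the map form."""
    if isinstance(section, list):
        for entry in section:
            if isinstance(entry, dict):
                yield entry
    elif isinstance(section, dict):
        # Map form: the key is the requirement class name.
        for class_name, body in section.items():
            entry = dict(body) if isinstance(body, dict) else {}
            entry["class"] = class_name
            yield entry


def required_num_cores(cwl_content: bytes) -> int:
    """Return coresMin from a ResourceRequirement, or 0 if absent."""
    doc = json.loads(cwl_content.decode("utf-8"))
    for key in ("requirements", "hints"):
        for req in iter_requirements(doc.get(key)):
            if req.get("class") == "ResourceRequirement":
                cores = req.get("coresMin")
                # Expressions (strings) are ignored in this sketch.
                if isinstance(cores, int):
                    return cores
    return 0
```

Returning 0 for a missing value lets the caller distinguish "not specified" from any valid core count.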
-
cerise.back_end.cwl.
get_secondary_files
(secondary_files: List[Dict[str, Any]]) → List[cerise.back_end.file.File][source]¶ Parses a list of secondary files, recursively.
Parameters: secondary_files – A list of values from a CWL secondaryFiles attribute. Returns: A list of secondary input files.
-
cerise.back_end.cwl.
get_time_limit
(cwl_content: bytes) → int[source]¶ Takes a CWL file contents and extracts cwl1.1-dev1 time limit.
Supports only two of three possible ways of writing this. Returns 0 if no value was specified, in which case the default should be used.
Parameters: cwl_content – The contents of a CWL file. Returns: Time to reserve in seconds.
-
cerise.back_end.cwl.
get_workflow_step_names
(workflow_content: bytes) → List[str][source]¶ Takes a CWL workflow and extracts names of steps.
This assumes that the steps are not inlined, but referenced by name, as we require for workflows submitted to Cerise. Also, this is not the name of the step in the workflow document, but the name of the step in the API to run. It’s the content of the run attribute, not that of the id attribute.
Parameters: workflow_content – The contents of the workflow file. Returns: A list of step names.
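The distinction between the `run` and `id` attributes can be seen in a small sketch. It assumes the workflow is given as JSON and that every step refers to its tool by name; `workflow_step_names` and the step name `test/wc.cwl` are hypothetical.

```python
import json
from typing import List


def workflow_step_names(workflow_content: bytes) -> List[str]:
    """Return the `run` value of every step that references a tool by name."""
    doc = json.loads(workflow_content.decode("utf-8"))
    steps = doc.get("steps", [])
    if isinstance(steps, dict):
        # Map form: keys are step ids, values are the step definitions.
        steps = list(steps.values())
    return [
        step["run"]
        for step in steps
        if isinstance(step, dict) and isinstance(step.get("run"), str)
    ]


workflow = b"""
{
  "cwlVersion": "v1.0",
  "class": "Workflow",
  "inputs": [],
  "outputs": [],
  "steps": {
    "count": {"run": "test/wc.cwl", "in": {}, "out": []}
  }
}
"""
names = workflow_step_names(workflow)
```

The result contains `test/wc.cwl`, the API step being run, and not `count`, which is only the step's identifier inside the workflow document.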
-
cerise.back_end.cwl.
is_workflow
(workflow_content: bytes) → bool[source]¶ Takes CWL file contents and checks whether it is a CWL Workflow (and not an ExpressionTool or CommandLineTool).
Parameters: workflow_content – The contents of a CWL file. Returns: True iff the top-level Process in this CWL file is an instance of Workflow.
cerise.back_end.execution_manager module¶
-
class
cerise.back_end.execution_manager.
ExecutionManager
(config: cerise.config.Config, local_api_dir: cerulean.path.Path)[source]¶ Bases:
object
Handles the execution of jobs on the remote resource. The execution manager monitors the job store for jobs that are ready to be staged in, started, cancelled, staged out, or deleted, and performs the required activity. It also monitors the remote resource, ensuring that any remote state changes are propagated to the job store correctly.
Set up the execution manager.
Parameters: - config – The configuration.
- local_api_dir – The path to the local API directory.
cerise.back_end.file module¶
-
class
cerise.back_end.file.
File
(name: Optional[str], index: Optional[int], location: str, secondary_files: List[File])[source]¶ Bases:
object
Create a File object.
This describes a file, and is the result of resolving input files from the user-submitted input description, or output generated by the CWL runner. It is used by the staging machinery to stage these files, and update the input description with remote paths.
Parameters: - name – The name of the input that this file belongs to.
- index – The index of this file into an array of Files.
- location – A URL with the (local) location of the file.
- secondary_files – A list of secondary files.
-
index
= None¶ The index of this file, if it is in an array of files.
-
location
= None¶ Local URL of the file.
-
name
= None¶ The name of the input that this file belongs to.
-
secondary_files
= None¶ CWL secondary files.
-
source
= None¶ The source of the file.
cerise.back_end.job_planner module¶
-
class
cerise.back_end.job_planner.
JobPlanner
(job_store: cerise.job_store.sqlite_job_store.SQLiteJobStore, local_api_dir: cerulean.path.Path)[source]¶ Bases:
object
Handles workflow execution requirements.
This class keeps track of which hardware is needed for each available step, then analyses a workflow and decides which resources it needs based on this.
Create a JobPlanner.
Parameters: - job_store – The job store to act on.
- local_api_dir – Path of local api directory.
cerise.back_end.job_runner module¶
-
class
cerise.back_end.job_runner.
JobRunner
(job_store: cerise.job_store.sqlite_job_store.SQLiteJobStore, config: cerise.config.Config, remote_cwlrunner: str)[source]¶ Bases:
object
Create a JobRunner object.
Parameters: - job_store – The job store to get jobs from.
- config – The configuration.
- remote_cwlrunner – The location of the CWL runner to use.
-
cancel_job
(job_id: str) → bool[source]¶ Cancel a running job.
Job must be cancellable, i.e. in JobState.RUNNING or JobState.WAITING. If it isn’t cancellable, this function does nothing.
Cancellation may not happen immediately. If the cancellation request has been executed immediately and the job is now gone, this function returns False. If the job will be cancelled soon, it returns True.
Parameters: job_id – The id of the job to cancel. Returns: Whether the job is still running.
cerise.back_end.local_files module¶
-
class
cerise.back_end.local_files.
LocalFiles
(job_store: cerise.job_store.sqlite_job_store.SQLiteJobStore, config: cerise.config.Config)[source]¶ Bases:
object
Create a LocalFiles object. Sets up local directory structure as well.
Parameters: - job_store – The job store to use
- config – The configuration.
-
create_output_dir
(job_id: str) → None[source]¶ Create an output directory for a job.
Parameters: job_id – The id of the job to make an output directory for.
-
delete_output_dir
(job_id: str) → None[source]¶ Delete the output directory for a job. This will remove the directory and everything in it.
Parameters: job_id – The id of the job whose output directory to delete.
-
publish_job_output
(job_id: str, output_files: List[cerise.back_end.file.File]) → None[source]¶ Write output files to the local output dir for this job.
Uses the .output_files property of the job to get data, and updates its .output property with URLs pointing to the newly published files, then sets .output_files to None.
Parameters: - job_id – The id of the job whose output to publish.
- output_files – List of output files to publish.
-
resolve_input
(job_id: str) → List[cerise.back_end.file.File][source]¶ Resolves input (workflow and input files) for a job.
This function will read the job from the database, add a .workflow_content attribute with the contents of the workflow, and return a list of File objects describing the input files.
This function will accept local file:// URLs as well as remote http:// URLs.
Parameters: job_id – The id of the job whose input to resolve. Returns: A list of File objects to stage.
cerise.back_end.remote_api module¶
-
class
cerise.back_end.remote_api.
RemoteApi
(config: cerise.config.Config, local_api_dir: cerulean.path.Path)[source]¶ Bases:
object
Manages the remote API installation.
This class manages the remote directories in which the CWL API is installed, which is <basedir>/api/
Within this, there is a directory per project, with entries
- <project>/version
- <project>/steps/…
- <project>/files/…
- <project>/install.sh
Create a RemoteApi object. Sets up remote directory structure as well, but refuses to create the top-level directory.
Parameters: - config – The configuration.
- local_api_dir – The path to the local API dir to install from.
-
get_projects
() → List[str][source]¶ Return names and versions of the installed projects.
Returns: A list of strings, one for each project, with name and version.
-
install
() → None[source]¶ Install the API onto the compute resource.
Copies subdirectories steps/ and files/ of the given local api dir to the compute resource, and runs the install script.
-
translate_runner_location
(runner_location: str) → str[source]¶ Perform macro substitution on CWL runner location.
This replaces $CERISE_API with the API base dir.
Parameters: runner_location (str) – Location of the runner as configured by the user. Returns: (str) A remote path with variables substituted.
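The substitution amounts to a plain string replacement, sketched below. The real method takes only `runner_location` and obtains the API base dir itself; here it is passed in explicitly so the example is self-contained. The function name `substitute_api_dir` and both paths are hypothetical.

```python
def substitute_api_dir(runner_location: str, api_dir: str) -> str:
    """Replace the $CERISE_API macro with the given API base dir."""
    return runner_location.replace("$CERISE_API", api_dir)


# Hypothetical configured location and remote API base dir.
location = substitute_api_dir(
    "$CERISE_API/cerise/files/cwltiny.py", "/home/user/.cerise/api"
)
```

A location without the macro is returned unchanged, so absolute paths configured by the user keep working.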
-
translate_workflow
(workflow_content: bytes) → bytes[source]¶ Parse workflow content, check that it calls steps, and insert the location of the steps on the remote resource so that the remote runner can find them.
Also converts YAML to JSON, for cwltiny compatibility.
Parameters: workflow_content – The raw workflow data Returns: The modified workflow data, serialised as JSON
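The three stages (parse, check for steps, rewrite and serialise) might look roughly like the sketch below. It assumes JSON input (the real method also reads YAML), and it simply prefixes each `run` value with a caller-supplied directory; how Cerise actually derives the remote step location is not shown in this page. `rewrite_step_locations` and the example paths are hypothetical.

```python
import json


def rewrite_step_locations(workflow_content: bytes, remote_steps_dir: str) -> bytes:
    """Prefix each step's `run` value with a remote dir and return JSON bytes."""
    doc = json.loads(workflow_content.decode("utf-8"))
    steps = doc.get("steps")
    if not steps:
        # The workflow must call at least one step.
        raise ValueError("workflow does not call any steps")
    step_list = list(steps.values()) if isinstance(steps, dict) else steps
    prefix = remote_steps_dir.rstrip("/") + "/"
    for step in step_list:
        if isinstance(step, dict) and isinstance(step.get("run"), str):
            step["run"] = prefix + step["run"]
    return json.dumps(doc).encode("utf-8")


translated = rewrite_step_locations(
    b'{"class": "Workflow", "steps": {"count": {"run": "test/wc.cwl"}}}',
    "/remote/api/steps",
)
```

Serialising with `json.dumps` mirrors the conversion to JSON mentioned above, so a runner that only understands JSON can read the result.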
cerise.back_end.remote_job_files module¶
-
class
cerise.back_end.remote_job_files.
RemoteJobFiles
(job_store: cerise.job_store.sqlite_job_store.SQLiteJobStore, config: cerise.config.Config)[source]¶ Bases:
object
Manages a remote directory structure. Expects to be given a remote dir to work within. Inside this directory, it makes a jobs/ directory, and inside that there is a directory for every job.
Within each job directory are the following files:
- jobs/<job_id>/name.txt contains the user-given name of the job
- jobs/<job_id>/workflow.cwl contains the workflow to run
- jobs/<job_id>/work/ contains input and output files, and is the working directory for the job.
- jobs/<job_id>/stdout.txt is the standard output of the CWL runner
- jobs/<job_id>/stderr.txt is the standard error of the CWL runner
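The layout above can be expressed as a small path helper. This is only an illustration of the directory structure: `job_paths`, the dictionary keys, the remote dir and the job id are all hypothetical, and the class itself works with remote paths rather than `pathlib` objects.

```python
from pathlib import PurePosixPath
from typing import Dict


def job_paths(remote_dir: str, job_id: str) -> Dict[str, PurePosixPath]:
    """Build the per-job paths listed above, relative to the remote dir."""
    job_dir = PurePosixPath(remote_dir) / "jobs" / job_id
    return {
        "name": job_dir / "name.txt",          # user-given job name
        "workflow": job_dir / "workflow.cwl",  # workflow to run
        "work": job_dir / "work",              # working directory
        "stdout": job_dir / "stdout.txt",      # CWL runner standard output
        "stderr": job_dir / "stderr.txt",      # CWL runner standard error
    }


paths = job_paths("/scratch/cerise", "abc123")
```

`PurePosixPath` is used because the paths refer to a remote machine and should not depend on the local operating system's path rules.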
Create a RemoteJobFiles object. Sets up remote directory structure as well, but refuses to create the top-level directory.
Parameters: - job_store – The job store to use.
- config – The configuration.
-
delete_job
(job_id: str) → None[source]¶ Remove the work directory for a job. This will remove the directory and everything in it, if it exists.
Parameters: job_id – The id of the job whose work directory to delete.
-
destage_job_output
(job_id: str) → List[cerise.back_end.file.File][source]¶ Download results of the given job from the compute resource.
Parameters: job_id – The id of the job to download results of. Returns: A list of File objects describing the downloaded output files.
-
stage_job
(job_id: str, input_files: List[cerise.back_end.file.File], workflow_content: bytes) → None[source]¶ Stage a job. Copies any necessary files to the remote resource.
Parameters: - job_id – The id of the job to stage
- input_files – A list of input files to stage.
- workflow_content – Translated contents of the workflow to be run.