Google Batch for Bioinformatics

Testing with samtools

Here’s a short screencast recapping my initial findings from testing the Google Batch API with samtools. I used the open-source samtools container and pulled a BAM file from a public dataset.

My `job.json` file for configuring Google Batch to test a `samtools index` example is shown below:

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "container": {
                            "imageUri":"gcr.io/cloud-lifesciences/samtools",
                            "entrypoint": "/bin/sh",
                            "commands": [
                                "-c",    
                                "samtools index ${BAM} /mnt/disks/share/${BAI}"            
                            ]
                        },
                        "environment": {
                            "variables": {
                                "BAM": "gs://genomics-public-data/NA12878.chr20.sample.bam",
                                "BAI": "NA12878.chr20.sample.bam.bai"                             
                            }
                        }
                    }
                ],
                "volumes": [
                    {
                        "gcs": {
                            "remotePath": "<MYBUCKET>/<MYPATH>/"
                        },
                        "mountPath": "/mnt/disks/share"
                    }
                ],
                "computeResource": {
                    "cpuMilli": 2000,
                    "memoryMib": 2000
                },
                "maxRetryCount": 3,
                "maxRunDuration": "100000s"
            },
            "taskCount": 1,
            "parallelism": 10
        }
    ],
    "logsPolicy":{
        "destination": "CLOUD_LOGGING"
    }
}
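
If you want to index more than one BAM file, it can be handy to generate this config rather than edit it by hand. Below is a minimal sketch in Python that rebuilds the same structure as the `job.json` above. The function name `build_samtools_index_job` and the bucket path `my-bucket/my-path/` are hypothetical placeholders of mine, not part of the Batch API, and the resource values are simply copied from the config above.

```python
import json


def build_samtools_index_job(bam_uri, bucket_path, mount_path="/mnt/disks/share"):
    """Return a Batch job config (as a dict) that runs `samtools index` on one BAM.

    Hypothetical helper: it mirrors the job.json shown above, nothing more.
    """
    # Name the index after the BAM file, e.g. sample.bam -> sample.bam.bai
    bai_name = bam_uri.rsplit("/", 1)[-1] + ".bai"
    return {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "container": {
                                "imageUri": "gcr.io/cloud-lifesciences/samtools",
                                "entrypoint": "/bin/sh",
                                "commands": [
                                    "-c",
                                    # ${BAM} and ${BAI} are expanded by the shell
                                    # from the environment variables set below.
                                    "samtools index ${BAM} " + mount_path + "/${BAI}",
                                ],
                            },
                            "environment": {
                                "variables": {"BAM": bam_uri, "BAI": bai_name}
                            },
                        }
                    ],
                    # The bucket path is mounted so the index lands in Cloud Storage.
                    "volumes": [
                        {"gcs": {"remotePath": bucket_path}, "mountPath": mount_path}
                    ],
                    "computeResource": {"cpuMilli": 2000, "memoryMib": 2000},
                    "maxRetryCount": 3,
                    "maxRunDuration": "100000s",
                },
                "taskCount": 1,
                "parallelism": 10,
            }
        ],
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


if __name__ == "__main__":
    job = build_samtools_index_job(
        "gs://genomics-public-data/NA12878.chr20.sample.bam",
        "my-bucket/my-path/",  # placeholder: replace with your own bucket and path
    )
    with open("job.json", "w") as fh:
        json.dump(job, fh, indent=4)
```

The resulting file should then be submittable with something like `gcloud batch jobs submit <JOB_NAME> --location <REGION> --config job.json`, where the job name and region are yours to choose.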

You can find more of my ongoing testing and comparisons of GCP compute services, using `samtools` examples, in my `gcp-for-bioinformatics` repo on GitHub.
