We regularly run analyses that involve tens of thousands of genomes. Because such workloads are bursty, we use on-demand cloud resources, typically on Google Cloud or Microsoft Azure.
Our main workflow scheduling system is Hail Batch, for which we have set up a local deployment. It integrates directly with Hail Query, a set of scalable APIs designed specifically for genomics. For workflows like GATK-SV, we rely on Cromwell / Terra to run WDL.
About a dozen collaborating groups in Australia use our local deployment of seqr for rare disease analysis. Internally, we continue the development of Broad’s loss-of-function curation portal.
Our public data browsers typically pair a React frontend with a Django web layer, backed by Elasticsearch or Hail.
All our sample metadata is managed centrally with an extensive set of APIs, which allows us to automate our workflows and ingest new data regularly without incurring toil.
We like to set up our infrastructure as code through either Terraform or Pulumi, which helps us bring up consistent dev / prod namespaces across multiple clouds.
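To illustrate why a declarative setup keeps namespaces consistent, here is a minimal sketch in plain Python. It is not Terraform or Pulumi code and it is not our actual configuration. The cloud labels, the `Bucket` type, the `plan_buckets` helper and the naming scheme are all hypothetical, chosen only to show one definition expanding into the same layout for every cloud and namespace.

```python
from dataclasses import dataclass

# Hypothetical labels, used for illustration only.
CLOUDS = ("gcp", "azure")
NAMESPACES = ("dev", "prod")


@dataclass(frozen=True)
class Bucket:
    cloud: str
    namespace: str
    name: str


def plan_buckets(dataset: str) -> list[Bucket]:
    """Expand one dataset name into the same bucket layout for every cloud and namespace."""
    return [
        Bucket(cloud, namespace, f"{dataset}-{namespace}")
        for cloud in CLOUDS
        for namespace in NAMESPACES
    ]


if __name__ == "__main__":
    for bucket in plan_buckets("example-dataset"):
        print(bucket.cloud, bucket.namespace, bucket.name)
```

Because the layout is derived from a single function rather than created by hand, adding a namespace or a cloud changes every dataset in the same way, which is the property an infrastructure-as-code tool provides at scale.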
All our code is available on GitHub. We control production data access on a dataset level and enforce code reviews through an analysis runner wrapper, while allowing quick prototyping and exploration on subsets for testing.
cpg-utils: a set of helpers to build reusable pipelines
production-pipelines: our large cohort processing + QC pipeline
tob-wgs: analyses related to the Tasmanian Ophthalmic Biobank Whole Genome Sequencing project
structural-constraint: calculating missense constraint within protein tertiary structures
tob-wgs-browser: the public data browser for the Tasmanian Ophthalmic Biobank Whole Genome Sequencing project
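The access policy described above (dataset-level control, reviewed code for production data, open prototyping on test subsets) could be sketched roughly as follows. This is an illustrative assumption, not the analysis runner's real implementation. `AccessLevel`, `RunRequest`, `DATASET_MEMBERS` and `is_allowed` are hypothetical names, and the membership list is made-up example data.

```python
from dataclasses import dataclass
from enum import Enum


class AccessLevel(Enum):
    TEST = "test"  # small subsets, intended for quick prototyping
    MAIN = "main"  # full production data


@dataclass(frozen=True)
class RunRequest:
    user: str
    dataset: str
    access_level: AccessLevel
    commit_reviewed: bool  # True if the commit passed code review


# Hypothetical per-dataset membership lists (example data only).
DATASET_MEMBERS = {
    "tob-wgs": {"alice", "bob"},
}


def is_allowed(request: RunRequest, members=DATASET_MEMBERS) -> bool:
    """Return True if the run may proceed under the sketched policy."""
    # Access is decided per dataset: unknown datasets and non-members are refused.
    if request.user not in members.get(request.dataset, set()):
        return False
    # Test subsets are open to any dataset member, reviewed or not.
    if request.access_level is AccessLevel.TEST:
        return True
    # Production data additionally requires reviewed code.
    return request.commit_reviewed
```

Under this sketch, a dataset member can explore the test subset with unreviewed code, while the same code only reaches production data after review.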
The Centre for Population Genomics is a national initiative, jointly housed at the Garvan Institute of Medical Research and the Murdoch Children’s Research Institute. We work across the country and collaborate with partners in Australia and internationally.
At CPG, we celebrate and respect diversity in our team and our work. We believe that including all human diversity in genomic research will empower medical care that benefits everyone.
We acknowledge the First Australian peoples on whose traditional lands we live and work across the country and pay our respects to their elders past, present and emerging. We gratefully accept the invitation in the Uluru Statement from the Heart “to walk with us in a movement of the Australian people for a better future”.