the Addictome

NIDA Center of Excellence in Omics, Systems Genetics, and the Addictome

Genome-wide Association Studies of Substance Use and Use Disorder: Where to find them, and what to do with them

April 22, 2022 by Laura Saba in human GWAS, Top 5

Abstract: This presentation will show attendees how to access genome-wide association study (GWAS) summary statistics and will showcase resources for annotating these summary data for follow-up analyses, including gene-based analyses, eQTL and epigenetic annotation, and causal variable analysis. We will walk through the components of a GWAS summary dataset and two excellent resources, FUMA and MASSIVE, that take these summary files as input and generate extensive annotations that can be carried forward to answer translational research questions.
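As a rough illustration of what the components of a GWAS summary dataset look like in practice, the sketch below parses a small tab-delimited table and keeps variants below the conventional genome-wide significance threshold of 5e-8. The column names (SNP, CHR, BP, A1, A2, BETA, SE, P), the reading of A1 as the effect allele, and the example rows are all assumptions made for illustration; real files differ between studies and should be checked against their documentation.

```python
import csv
import io

# Hypothetical summary statistics; column names and values are illustrative only.
EXAMPLE = """SNP\tCHR\tBP\tA1\tA2\tBETA\tSE\tP
rs0000001\t1\t1000\tA\tG\t0.050\t0.008\t4.1e-10
rs0000002\t1\t2000\tC\tT\t0.010\t0.009\t2.7e-01
rs0000003\t2\t1500\tG\tA\t-0.045\t0.008\t1.9e-08
"""

GENOME_WIDE_P = 5e-8  # conventional genome-wide significance threshold


def read_sumstats(handle):
    """Parse tab-delimited summary statistics into a list of typed dicts."""
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        rows.append({
            "snp": rec["SNP"],
            "chr": rec["CHR"],
            "bp": int(rec["BP"]),
            "effect_allele": rec["A1"],  # assumption: A1 is the effect allele
            "other_allele": rec["A2"],
            "beta": float(rec["BETA"]),
            "se": float(rec["SE"]),
            "p": float(rec["P"]),
        })
    return rows


def significant(rows, threshold=GENOME_WIDE_P):
    """Keep only variants whose p-value is below the threshold."""
    return [r for r in rows if r["p"] < threshold]


sumstats = read_sumstats(io.StringIO(EXAMPLE))
hits = significant(sumstats)
for r in hits:
    # beta / se gives the z-score implied by the reported effect and standard error
    print(r["snp"], r["chr"], r["bp"], round(r["beta"] / r["se"], 2))
```

Checking which allele the effect refers to is usually the first step before passing a file to any annotation tool, since conventions vary between sources.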

Supported by: NIH/NIDA R01DA054869; T32DA007261; K02DA032573

Presented by:

Dr. Alexander S. Hatoum
Department of Psychiatry
Washington University School of Medicine

Lecture Slides

Mouse Phenome Database: Resources and analysis tools for curated and integrated primary mouse phenotype and genotype data

April 08, 2022 by Laura Saba

Abstract: The Mouse Phenome Database (MPD; https://phenome.jax.org) is a widely used resource that provides access to primary experimental data, protocols and analysis tools for mouse phenotyping studies. Data are contributed by investigators around the world and represent a broad scope of phenotyping endpoints and disease-related characteristics in naïve mice and those exposed to drugs, environmental agents or other treatments. MPD is engineered to facilitate interactive data exploration and quantitative analysis. It encompasses data from inbred strains and other reproducible panels, including HMDP, KOMP, Collaborative Cross (CC), CC-RIX, and founder strains, along with primary data from mapping populations, including historic mapping crosses and advanced high-diversity mouse populations such as Diversity Outbred mice. A new Study Intake Platform (SIP) for data contributors allows domain experts to submit and annotate their own data with relevant ontology terms. Data contributors also provide detailed information for protocols and animal environmental conditions to fulfill ARRIVE guidelines. Data are exposed to analysis tools within MPD and are available through APIs to other systems. We will demonstrate selected MPD tools, including GenomeMUSter (https://muster.jax.org), a new imputed SNP grid on 650+ strains of mice at 106+M locations, and a new GWAS meta-analysis tool.

Bogue MA, Ball RL, Philip VM, Walton DO, Dunn MH, Kolishovski G, Lamoureux A, Gerring M, Liang H, Emerson J, Stearns T, He H, Mukherjee G, Bluis J, Desai S, Sundberg B, Kadakkuzha B, Kunde-Ramamoorthy G, Chesler EJ. Mouse Phenome Database: towards a more FAIR-compliant and TRUST-worthy data repository and tool suite for phenotypes and genotypes. Nucleic Acids Res. 2023 Jan 6;51(D1):D1067-D1074. doi: 10.1093/nar/gkac1007. PMID: 36330959; PMCID: PMC9825561.

Ball RL, Bogue MA, Liang H, Srivastava A, Ashbrook DG, Lamoureux A, Gerring MW, Hatoum AS, Kim MJ, He H, Emerson J, Berger AK, Walton DO, Sheppard K, El Kassaby B, Castellanos F, Kunde-Ramamoorthy G, Lu L, Bluis J, Desai S, Sundberg BA, Peltz G, Fang Z, Churchill GA, Williams RW, Agrawal A, Bult CJ, Philip VM, Chesler EJ. GenomeMUSter mouse genetic variation service enables multitrait, multipopulation data integration and analysis. Genome Res. 2024 Feb 7;34(1):145-159. doi: 10.1101/gr.278157.123. PMID: 38290977; PMCID: PMC10903950.

Funding provided by NIH DA028420, DA045401, AG066346.

Presented by:

Molly Bogue and Robyn Ball
Other senior members of the MPD team: Elissa Chesler, Vivek Philip, Dave Walton
The Jackson Laboratory

HiDiver: A Suite of Methods to Merge Magnetic Resonance Histology, Light Sheet Microscopy, and Complete Brain Delineations

February 25, 2022 by Laura Saba

Abstract: We have developed new imaging and computational workflows to produce accurately aligned multimodal 3D images of the mouse brain that exploit high resolution magnetic resonance histology (MRH) and light sheet microscopy (LSM) with fully rendered 3D reference delineations of brain structures. The suite of methods starts with the acquisition of geometrically accurate (in-skull) brain MRIs using multi-gradient echo (MGRE) and new diffusion tensor imaging (DTI) at an isotropic spatial resolution of 15 μm. Whole brain connectomes are generated using over 100 diffusion weighted images acquired with gradients at uniformly spaced angles. Track density images are generated at a super-resolution of 5 μm. Brains are dissected from the cranium, cleared with SHIELD, stained by immunohistochemistry, and imaged by LSM at 1.8 μm/pixel. LSM channels are registered into the reference MRH space along with the Allen Brain Atlas (ABA) Common Coordinate Framework version 3 (CCFv3). The result is a high-dimensional integrated volume with registration (HiDiver) that has a global alignment accuracy of 10–50 μm. HiDiver enables 3D quantitative and global analyses of cells, circuits, connectomes, and CNS regions of interest (ROIs). Throughput is sufficiently high that HiDiver is now being used in comprehensive quantitative studies of the impact of gene variants and aging on rodent brain cytoarchitecture.

Presented by:

Dr. G Allan Johnson
Charles E Putman Professor of Radiology, Physics, and Biomedical Engineering
Duke University
Durham, North Carolina

Papers Shared During Presentation

Variability and heritability of mouse brain structure: Microscopic MRI atlases and connectomes for diverse strains

A multicontrast MR atlas of the Wistar rat brain

HiDiver: A Suite of Methods to Merge Magnetic Resonance Histology, Light Sheet Microscopy, and Complete Brain Delineations

Julia: a fast, friendly, and powerful language for data science

November 12, 2021 by Laura Saba

Julia is a high-level dynamic programming language that is gaining popularity. The Julia language is designed for scientific computing and offers several attractive features for data science applications. In this webinar, we will make a case for why a data scientist might consider taking a serious look at Julia. We will show code examples and point the audience to further resources.

Goals of this webinar:

To articulate why Julia is attractive for data scientists
To provide an overview of Julia language syntax and design
To provide additional resources about the Julia language and ecosystem

Presented by:

Gregory Farage and Saunak Sen
gfarage@uthsc.edu / sen@uthsc.edu
Division of Biostatistics
Department of Preventive Medicine
University of Tennessee Health Science Center
Memphis, TN

Resources

Tutorial

Extra

Annual Julia User & Developer Survey 2021 presented by Andrew Claster.
Annual Julia User & Developer Survey 2019 presented by Viral Shah.
Announcing composable multi-threaded parallelism in Julia
Remark.jl created by Pietro Vertechi
"Create markdown presentations from Julia"
JuDoc.jl created by Thibaut Lienart
"Static site generator. Simple, fast, compatible with basic LaTeX, maths with KaTeX, optional pre-rendering, written in Julia."
Pluto.jl created by Fons van der Plas
"Reactive Notebook, written in Julia."
Plots.jl created by Tom Breloff
"Plotting metapackage: it's an interface over many different plotting libraries."

Useful links

A Comprehensive Tutorial to Learn Data Science with Julia from Scratch

10 Reasons Why You Should Learn Julia

Noteworthy Differences from other Languages

Julia Cheat Sheet

Guide to evaluating the application of machine learning methods in genetics literature

October 22, 2021 by Laura Saba

Goals of this webinar:

To describe the relationship between artificial intelligence (AI), machine learning (ML), and deep learning (DL).
To describe general scenarios when ML is appropriate.
To understand methods for comparing the performance of different ML algorithms.
To lay out general criteria to examine when evaluating literature that includes machine learning algorithms.

Presented by:

Laura Saba, PhD
Associate Professor
Department of Pharmaceutical Sciences
Skaggs School of Pharmacy and Pharmaceutical Sciences
University of Colorado Anschutz Medical Campus
Aurora, CO

Lecture slides

References

Liu Y, Chen PC, Krause J, Peng L. How to Read Articles That Use Machine Learning: Users' Guides to the Medical Literature. JAMA. 2019 Nov 12;322(18):1806-1816. doi: 10.1001/jama.2019.16489. PMID: 31714992.

Rajkomar A, Dean J, Kohane I. Machine Learning in Medicine. N Engl J Med. 2019 Apr 4;380(14):1347-1358. doi: 10.1056/NEJMra1814259. PMID: 30943338.

Libbrecht MW, Noble WS. Machine learning applications in genetics and genomics. Nat Rev Genet. 2015 Jun;16(6):321-32. doi: 10.1038/nrg3920. Epub 2015 May 7. PMID: 25948244; PMCID: PMC5204302.

A Primer on Brain Proteomics and protein-QTL Analysis for Substance Use Disorders

October 08, 2021 by Laura Saba

Goals of this webinar:

To give a general introduction to proteomics technologies and data processing/normalization
To present a pipeline for correcting sample mix-ups in proteomic data.
To discuss rat brain proteome and protein QTL analysis for Substance Use Disorders.

Presented by:

Xusheng Wang, PhD
Assistant Professor
Department of Biology
University of North Dakota

Robert W. Williams, PhD
Professor and Chair
Department of Genetics, Genomics, and Informatics
University of Tennessee Health Science Center

Lecture Notes

Organizing data in spreadsheets

September 24, 2021 by Laura Saba

Abstract: Spreadsheets are widely used software tools for data entry, storage, analysis, and visualization. Focusing on the data entry and storage aspects, this presentation will offer practical recommendations for organizing spreadsheet data to reduce errors and ease later analyses. The basic principles are: be consistent, write dates like YYYY-MM-DD, do not leave any cells empty, put just one thing in a cell, organize the data as a single rectangle (with subjects as rows and variables as columns, and with a single header row), create a data dictionary, do not include calculations in the raw data files, do not use font color or highlighting as data, choose good names for things, make backups, use data validation to avoid data entry errors, and save the data in plain text files.
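Several of these principles can be checked mechanically once a sheet is exported to plain text. The following is a minimal sketch, not part of the presentation, that flags empty cells, rows that break the single-rectangle layout, and dates that are not written as YYYY-MM-DD. The function name `check_table`, the column names, and the example data are hypothetical.

```python
import csv
import io
import re

# Matches the YYYY-MM-DD shape only; it does not verify that the date exists.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def check_table(text, date_columns=()):
    """Return (row_number, column, problem) tuples for a CSV export."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    problems = []
    for i, row in enumerate(body, start=2):  # row 1 is the header
        if len(row) != len(header):
            problems.append((i, None, "row is not the same width as the header"))
            continue
        for name, cell in zip(header, row):
            if cell.strip() == "":
                problems.append((i, name, "empty cell"))
            elif name in date_columns and not ISO_DATE.match(cell):
                problems.append((i, name, "date is not YYYY-MM-DD"))
    return problems


# Hypothetical export with two problems in the second data row.
EXAMPLE = """id,test_date,weight_g
rat01,2021-09-24,312
rat02,9/25/21,
rat03,2021-09-26,298
"""

problems = check_table(EXAMPLE, date_columns=("test_date",))
for problem in problems:
    print(problem)
```

A check like this complements, rather than replaces, data validation set up in the spreadsheet itself.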

Presented by:

Karl Broman, PhD
Professor
Department of Biostatistics & Medical Informatics
University of Wisconsin-Madison

Lecture slides

Original paper

A Rube Goldbergian Approach to Scheduling Rodent Behavior Experiments and Data Collection

September 10, 2021 by Laura Saba

Abstract: Large-scale rodent behavioral experiments with complicated testing procedures conducted over several years (e.g., genetic mapping of operant drug taking) need rigorous control over data quality. This webinar will discuss methods used in my lab, where we generate ready-to-use MedPC macros from a spreadsheet for new test sessions, send cell phone notifications on the completion of behavioral tests, assemble data automatically each night, and send daily notifications of procedural changes for individual animals. Potential errors are checked automatically at several points, with messages sent to the users. This system is put together using a relational database (SQLite), several ad hoc computer programs (Perl, Python, or shell), a cloud storage service (Dropbox), and a messaging system (Slack). By turning much of the experiment planning and error checking procedure into computer code, we improve experimental efficiency and data quality.
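One way such an automated check could look is sketched below, using Python's built-in `sqlite3` module. This is not the lab's actual code: the table names (`schedule`, `results`), column names, animal IDs, and program labels are all hypothetical, and the step that would post the message to a messaging service is deliberately left out.

```python
import sqlite3

# Hypothetical schema; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schedule (animal_id TEXT, session_date TEXT, program TEXT);
    CREATE TABLE results (animal_id TEXT, session_date TEXT, active_presses INTEGER);
""")
conn.executemany(
    "INSERT INTO schedule VALUES (?, ?, ?)",
    [("R01", "2021-09-10", "FR1"),
     ("R02", "2021-09-10", "FR1"),
     ("R03", "2021-09-10", "FR2")],
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [("R01", "2021-09-10", 42),
     ("R02", "2021-09-10", 17)],
)


def missing_sessions(conn, session_date):
    """List animals that were scheduled on a date but have no result row."""
    query = """
        SELECT s.animal_id
        FROM schedule AS s
        LEFT JOIN results AS r
          ON r.animal_id = s.animal_id AND r.session_date = s.session_date
        WHERE s.session_date = ? AND r.animal_id IS NULL
        ORDER BY s.animal_id
    """
    return [row[0] for row in conn.execute(query, (session_date,))]


def format_alert(session_date, animals):
    """Build the text of a notification, or None when nothing is missing."""
    if not animals:
        return None
    return f"{session_date}: no data received for {', '.join(animals)}"


missing = missing_sessions(conn, "2021-09-10")
alert = format_alert("2021-09-10", missing)
print(alert)
```

Keeping the schedule and the results in the same database is what makes this kind of comparison a single query rather than a manual cross-check.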

Presented by:

Hao Chen, PhD
Associate Professor
Department of Pharmacology, Addiction Science, and Toxicology
University of Tennessee Health Science Center

Lecture Materials

Introduction to DNA Methylation Platforms and Data Analysis

August 27, 2021 by Laura Saba

Goals of this webinar:

Studying DNA methylation is widespread in biomedical research. The goals of this webinar are:

To describe research questions that can be explored by profiling the methylome
To give a general overview of DNA methylation profiling technologies
To outline steps in DNA methylation analysis pipeline
To provide information on common resources and databases

Presented by:

Katerina Kechris, PhD
Professor
Department of Biostatistics and Informatics
Colorado School of Public Health
University of Colorado Anschutz Medical Campus

Lecture Materials

Identifying sample mix-ups in eQTL data

June 11, 2021 by Laura Saba

Goals of this webinar:

Sample mix-ups interfere with our ability to detect genotype-phenotype associations. However, the presence of numerous eQTL with strong effects provides the opportunity to not just identify sample mix-ups, but also to correct them.

To illustrate methods for identifying sample duplicates and errors in sex annotations
To illustrate methods for identifying sample mix-ups in DNA and RNA samples from experimental cross data
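The idea that strong eQTL let expression data "point back" to the right DNA sample can be shown with a toy example. The sketch below is a simplified nearest-match comparison and not the method presented in the webinar: genotypes, expression values, and the assumed linear effect (expected expression of 1 + 2 × genotype at every eQTL) are invented for illustration.

```python
# Hypothetical data: genotypes (0/1/2) at three strong eQTL for each DNA sample,
# and expression of the three corresponding genes for each RNA sample.
dna = {"S1": (0, 2, 1), "S2": (2, 0, 0), "S3": (1, 1, 2), "S4": (0, 0, 2)}
rna = {
    "S1": (1.1, 4.9, 3.0),
    "S2": (2.9, 3.1, 5.1),
    "S3": (5.0, 1.0, 0.9),
    "S4": (0.9, 1.2, 5.0),
}


def expected_expression(genotype):
    """Assumed eQTL model: expression rises by 2 units per allele from a baseline of 1."""
    return tuple(1.0 + 2.0 * g for g in genotype)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_dna_match(expression):
    """Return the DNA sample whose genotype-predicted expression is closest."""
    return min(dna, key=lambda s: squared_distance(expression, expected_expression(dna[s])))


# An RNA sample is suspicious when its best-matching DNA sample has a different label.
mismatches = {}
for label, expression in rna.items():
    match = best_dna_match(expression)
    if match != label:
        mismatches[label] = match

print(mismatches)
```

In this toy data the RNA samples labeled S2 and S3 each match the other's genotypes, the pattern expected from a pairwise swap, which is what makes correction (and not just detection) possible.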

Presented by:

Karl Broman, PhD
Professor
Department of Biostatistics and Medical Informatics
University of Wisconsin–Madison

Lecture Slides

Lecture Slides with notes

GitHub repository

Introduction to the Hybrid Rat Diversity Panel: A renewable rat panel for genetic studies of addiction-related traits

April 23, 2021 by Laura Saba

Goals of this webinar:

Inbred model organisms
Recombinant inbred panels
Why rats?
Hybrid Rat Diversity Panel
Current resources
Data integration demo
Where to now?

Presented by:

Hao Chen, PhD
Associate Professor
Department of Pharmacology, Addiction Science, and Toxicology
University of Tennessee Health Science Center

Laura Saba, PhD
Associate Professor
Department of Pharmaceutical Sciences
Skaggs School of Pharmacy and Pharmaceutical Sciences
University of Colorado Anschutz Medical Campus

Lecture Material

Introduction to Metabolomics Platforms and Data Analysis

April 09, 2021 by Laura Saba

Goals of this webinar:

The use of metabolomics to profile small molecules is now widespread in biomedical research. The goals of this webinar are:

To describe research questions that can be addressed using metabolomics
To give a general overview of metabolomics technologies
To outline steps in a metabolomics data analysis pipeline
To provide information on common resources and databases

Presented by:

Katerina Kechris, PhD
Professor
Department of Biostatistics and Informatics
Colorado School of Public Health
University of Colorado Anschutz Medical Campus

Lecture notes

Landing on Jupyter: A guided tour of interactive notebooks

March 26, 2021 by Laura Saba

Goals of this webinar:

Jupyter is an interactive interface to data science and scientific computing across a variety of programming languages. We will present the Jupyter notebook, and explain some key concepts (e.g., kernel, cells). We will show how to create a new notebook; modify an existing notebook; save, export, and publish a notebook. We will discuss several possible use cases: developing code, writing reports, taking notes, and teaching/presenting.

Objectives:
- Learn what Jupyter notebooks are
- Learn how to install, configure, and use Jupyter notebooks
- Learn how to use Jupyter notebooks for research, teaching, or code development

Presented by:
Dr. Gregory Farage & Dr. Saunak Sen
Department of Preventive Medicine
University of Tennessee Health Science Center

Lecture slides

Other resources on GitHub

Become a UseR: A brief tour of R

March 12, 2021 by Laura Saba

Goals of this webinar:
We will introduce the R programming language and outline the benefits of learning R. We will give a brief tour of basic concepts and tasks: variables, objects, functions, basic statistics, visualization, and data import/export. We will showcase a practical example demonstrating statistical analysis.

Why should one use/learn R?
How to install R/RStudio
Learn about R basics: variables, programming, functions
Learn about the R package ecosystem that extends its capabilities
See a basic statistical analysis example
Learn about additional resources

Presented by:
Dr. Gregory Farage & Dr. Saunak Sen
Department of Preventive Medicine
University of Tennessee Health Science Center

Lecture resources on GitHub

From GWAS to gene: what are the essential analyses and how do we bring them together using heterogeneous stock rats?

February 26, 2021 by Laura Saba

Goals of this webinar:
Heterogeneous stock (HS) rats are an outbred population that was created in 1984 by intercrossing 8 inbred strains. The Center for GWAS in Outbred Rats (http://www.ratgenes.org) has developed a suite of tools for analyzing genome-wide association studies (GWAS) in HS rats.

Explain the HS rat population and its history
Describe the automated pipeline that performs GWAS in HS rats
Explore the fine mapping of associated regions and explain the various secondary analyses that we use to prioritize genes within associated intervals

Presented by:
Dr. Abraham Palmer
Professor and Vice Chair for Basic Research
Department of Psychiatry
University of California San Diego

Lecture Slides

NIDA Center for GWAS in HS Rats

Beginner’s guide to bulk RNA-Seq Analysis

February 12, 2021 by Laura Saba in Top 5

Goals of this webinar:
The use of high-throughput short-read RNA sequencing has become commonplace in many scientific laboratories. The analysis tools for quantitating a transcriptome have matured, becoming relatively simple to use. The goals of this webinar are:

To give a general overview of the popular Illumina technology for sequencing RNA.
To outline several of the key aspects to consider when designing an RNA-Seq study.
To provide guidance on methods and tools for transforming reads to quantitative expression measurements.
To describe statistical models that are typically used for differential expression and why these specialized models are needed.

Presented by:
Dr. Laura Saba
Associate Professor
Department of Pharmaceutical Sciences
Skaggs School of Pharmacy and Pharmaceutical Sciences
University of Colorado Anschutz Medical Campus

Lecture Slides

GitHub link to other resources

Sketching alternate realities: An introduction to causal inference in genetic studies

November 20, 2020 by Laura Saba

Goals of this webinar:
Determination of cause is an important goal of biological studies, and genetic studies provide unique opportunities. In this introductory lecture we will frame causal inference as a missing data problem to clarify challenges, assumptions, and strategies necessary for assigning cause. We will survey the use of directed acyclic graphs (DAGs) to express causal information and to guide analytic strategies.

Express causal inference as a missing data problem (counterfactual framework)
Outline assumptions needed for causal inference
Express causal information as (directed acyclic) graphs
Outline how to use graphs to guide analytic strategy
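To make the "missing data" framing concrete, the small simulation below (a sketch under stated assumptions, not the lecture's own code) gives every unit two potential outcomes, of which only one is ever observed. The data-generating model, the variable names, and the effect size of 1.0 are hypothetical. When a common cause `u` influences both treatment and outcome, the naive difference in observed means is biased; when treatment is assigned by a coin flip, it recovers the true effect on average.

```python
import random
from statistics import mean

rng = random.Random(2020)  # fixed seed so the run is repeatable
N = 20000
TRUE_EFFECT = 1.0  # assumed constant treatment effect


def simulate(confounded):
    """Return (true average effect, difference in observed group means)."""
    effects, treated_obs, control_obs = [], [], []
    for _ in range(N):
        u = rng.random() < 0.5             # common cause of treatment and outcome
        y0 = 2.0 * u + rng.gauss(0, 1)     # potential outcome without treatment
        y1 = y0 + TRUE_EFFECT              # potential outcome with treatment
        if confounded:
            p_treat = 0.8 if u else 0.2    # u makes treatment more likely
        else:
            p_treat = 0.5                  # randomized assignment
        treated = rng.random() < p_treat
        effects.append(y1 - y0)            # knowable only inside a simulation
        if treated:
            treated_obs.append(y1)         # y0 is the missing counterfactual
        else:
            control_obs.append(y0)         # y1 is the missing counterfactual
    return mean(effects), mean(treated_obs) - mean(control_obs)


true_ate_c, naive = simulate(confounded=True)
true_ate_r, randomized = simulate(confounded=False)
print("confounded:", round(true_ate_c, 3), round(naive, 3))
print("randomized:", round(true_ate_r, 3), round(randomized, 3))
```

In graph terms, the confounded case corresponds to a DAG with arrows from `u` to both treatment and outcome; randomization removes the arrow from `u` to treatment.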

Presented by:
Dr. Saunak Sen
Professor and Chief of Biostatistics
Department of Preventive Medicine
University of Tennessee Health Science Center

Source code on GitHub

Lecture Material

Introduction to GeneWeaver: Integrating and analyzing heterogeneous functional genomics data

October 23, 2020 by Laura Saba

Goals of this webinar:

Compare a user's gene list with multiple functional genomics data sets
Compare and contrast gene lists with data currently available and integrated in GeneWeaver
Explore functional relationships among genes and disease across species

Presented by:
Dr. Elissa Chesler
Professor
The Jackson Laboratory
AND
Dr. Erich Baker
Professor and Chair
Department of Computer Science
Baylor University