Cannabis informatics

Understanding the role of data in the cannabis industry

My newest research interest centers on the role of data in the emerging legal cannabis industry. Cannabis legalization is a profound social disruption whose consequences are rippling through a fascinating assemblage of institutions, technologies, and infrastructures. It reflects one of the most significant shifts in public opinion and policy in decades, but the industry it created remains legally precarious because state-level legalization initiatives since 2012 have not overturned federal drug laws. Data science is already a central component in the “digital transformation” of other industries, so the role of “new” capabilities like data science within a “new” industry like cannabis is a fascinating and important interface of social forces and technical capabilities that warrants sustained empirical attention. Inspired by social informatics, crisis informatics, and philanthropic informatics, I coined the phrase “cannabis informatics” to refer to the emergence, mediation, and consequences of relationships among plants, people, organizations, and institutions through technologies of surveillance, analytics, and influence.

Applying data science methods

My research on cannabis informatics began in 2017 with agenda-setting, and by 2020 I had begun to publish empirical papers applying data science methods. I organized a panel at CHI 2017 in Denver outlining a five-part framework of research questions and priorities for HCI researchers to explore in a post-legalization world [1] and became one of the core faculty members at the new Center for Research and Education Addressing Cannabis and Health (REACH). I attended the annual conference of CSU Pueblo’s Institute for Cannabis Research in 2018 and 2019, where I presented a framework on the “social cannabinoid system” that would become cannabis informatics and moderated a panel on the role of data analytics in the industry. Through my CU REACH affiliation, I began to work with Dr. Daniela Vergara, a research scientist in the Department of Ecology and Evolutionary Biology and an expert in cannabis genetics. We secured an anonymized data set of chemical profile data from a testing lab and applied a variety of imputation methods to estimate the missing values [2]. I also established formal research data sharing agreements with The Farm (a Boulder-based dispensary) and Leafly (an online platform for cannabis reviews). My coauthors and I analyzed chemical data characterizing the combination of terpene and cannabinoid profiles for more than 80,000 strains from across the country [3].

New work: Data-ifying precarious industries

As the cannabis industry navigates disruptive policy changes, it manages these profound legal tensions through infrastructures and ideologies of digital transformation. I am specifically interested in the complex assemblages of supply chain technologies and data management practices known as “seed-to-sale” or “track-and-trace” systems (like Metrc) that are implemented by regulators to document the provenance of every product across the entire market in states like California and Colorado. This project is a return to my science and technology studies (STS) roots, in which I will use mixed methodological approaches to trace the emergence of this unique data culture in response to pressures from regulators, investors, and customers. This project contributes to discussions about the possibilities and consequences of data infrastructures to support social change, the consequences of digital transformation and wide-scale surveillance on social behavior, and how data literacies mediate access to emerging economic opportunities.


  1. CHI-Nnabis: Implications of Marijuana Legalization for and from Human-Computer Interaction
    Keegan, Brian C., Cavazos-Rehg, Patricia, Nguyen, Anh Ngoc, Savage, Saiph, Kaye, Jofish, De Choudhury, Munmun, and Paul, Michael J.
    In Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems 2017
  2. Modeling Cannabinoids from a Large-Scale Sample of Cannabis Sativa Chemotypes
    Vergara, Daniela, Gaudino, Reggie, Blank, Thomas, and Keegan, Brian C.
    PLOS ONE Sep 2020
  3. The Phytochemical Diversity of Commercial Cannabis in the United States
    Smith, Christiana J., Vergara, Daniela, Keegan, Brian C., and Jikomes, Nick
    PLOS ONE May 2022