Baylor Researchers Develop Generative AI Assistant for PGx Interpretation


NEW YORK – Researchers at Baylor College of Medicine have developed an artificial intelligence-based assistant to help providers and patients interpret pharmacogenetic testing information that they hope to eventually implement clinically. 

The development of the assistant follows the launch of OpenAI's ChatGPT tool in 2022, which has ushered in new possibilities to improve workflows and processes across virtually every field. Mullai Murugan, the director of software engineering at Baylor's Human Genome Sequencing Center and one of the developers of the chatbot, said she and her colleagues were interested in how generative artificial intelligence could benefit their work, particularly in interpreting PGx results. They wanted to create a proof of concept for an AI assistant that could "provide guidance on general pharmacogenomic testing, dosage implications, side effects of therapeutics, and also address typical patient concerns," she said. 

PGx testing is a significant component of the center's clinical laboratory, and "we always hear about how genetic test results are hard to understand and interpret, not only for laypersons or patients but for providers, as well," Murugan said. 

The researchers focused their work on PGx testing because of clearer guidelines in the field and because "a vast majority of the folks that are tested might have a PGx phenotype that could potentially affect either the efficacy of the medication or the toxicity," she said. 

Two major deal-breakers with using ChatGPT for this purpose, she said, are that it has a specific knowledge cutoff date, meaning no new information has been added after that date, and that it is difficult to assess exactly what data ChatGPT has been trained on or which PGx guidelines it knows about. By building their own AI assistant, the researchers could control both the cutoff date and the exact data in the knowledgebase. 

In their pilot study, described in a paper published earlier this month in the Journal of the American Medical Informatics Association, the researchers focused on statins because confining their research to one type of drug would make it easier to assess the performance of the AI assistant. The assistant uses retrieval-augmented generation, a technique that improves the reliability of generative AI by grounding it in data retrieved from external sources, and harnesses a knowledgebase that includes data from the Clinical Pharmacogenetics Implementation Consortium, which publishes PGx clinical practice guidelines, the researchers wrote. The knowledgebase also includes publications on statin use, Murugan said. The AI assistant uses GPT-4, the underlying model of ChatGPT, to generate tailored responses grounded in the knowledgebase, which are further refined through prompt engineering and guardrails, according to the paper. 
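
In code, a retrieval step of this kind might look like the toy sketch below. This is an illustration of the general retrieval-augmented generation technique, not the Baylor system: the passages, the naive keyword-overlap scoring, and the `retrieve` helper are all hypothetical stand-ins for a production index built over CPIC guidelines and statin publications.

```python
# Toy retrieval step of a retrieval-augmented generation (RAG) pipeline.
# The knowledgebase entries and the scoring are illustrative stand-ins;
# a production system would use embedding-based vector search.

KNOWLEDGEBASE = [
    "CPIC guideline: SLCO1B1 decreased function is associated with "
    "increased simvastatin-related myopathy risk.",
    "Statin publication: rosuvastatin exposure is affected by ABCG2 variants.",
    "General PGx: pharmacogenomic phenotypes can alter drug efficacy or toxicity.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Rank knowledgebase passages by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(
        KNOWLEDGEBASE,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

# The top-ranked passages become the "context" handed to the language model.
context = retrieve("What is the myopathy risk of simvastatin with SLCO1B1 variants?")
```

Restricting the model to passages retrieved this way is what lets the developers control exactly which guidelines the assistant can draw on.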

They evaluated their assistant against a specialized PGx question catalog and found that it showed "high efficacy in addressing user queries," the researchers wrote. "Compared with OpenAI’s ChatGPT 3.5, it demonstrated better performance, especially in provider-specific queries requiring specialized data and citations."

When asked a question, the assistant utilizes the knowledgebase to retrieve information related to the question — "not exactly the answer, but information that provides context," Murugan said. The assistant then takes the context and the question and includes a "fairly extensive prompt" that serves as a guide for what the answer should include. Prompts are "very powerful," because they can indicate what role the assistant should play, what information should be returned, and any safeguards or warnings that should be included, Murugan noted. All this information is then sent to GPT-4, which provides the final answer that is sent to the user. 
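
The flow described above — retrieve context, bundle it with the question inside a guiding prompt, and send everything to the model — could be sketched roughly as follows. The prompt wording and the `call_model` stub are assumptions for illustration; the actual system sends its prompt to GPT-4.

```python
# Sketch of the described flow: retrieved context and the user's question are
# wrapped in a guiding prompt, and the bundle is sent to the model.
# call_model is a stub standing in for a GPT-4 API call, so this runs offline.

def build_prompt(question: str, context: list[str]) -> str:
    """Assemble a guiding prompt: role, guardrails, context, and question."""
    role = "You are an assistant that explains pharmacogenomic test results."
    guardrails = (
        "Answer only from the context below. This is not a diagnosis; "
        "advise discussing results with a physician or genetic counselor."
    )
    context_block = "\n".join(f"- {passage}" for passage in context)
    return f"{role}\n{guardrails}\nContext:\n{context_block}\nQuestion: {question}"

def call_model(prompt: str) -> str:
    """Stub standing in for sending the assembled prompt to GPT-4."""
    return "stub answer for: " + prompt.splitlines()[-1]

answer = call_model(build_prompt(
    "Does SLCO1B1 affect simvastatin?",
    ["CPIC guideline: SLCO1B1 decreased function raises simvastatin myopathy risk."],
))
```

The "fairly extensive prompt" Murugan mentions would play the role of the `role` and `guardrails` strings here, only in far more detail.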

The prompt engineering component is particularly important, emphasized Bo Yuan, the clinical laboratory director of Baylor's Human Genome Sequencing Center and another author on the JAMIA paper. The prompt offers "guardrails" so that responses are generated within the context of the knowledgebase and don't go off script and stray into unknown material. 

If a patient asks the question, the prompt will reflect that the answer should be more patient-friendly and in less technical language than if a provider had asked the question, Murugan noted. The researchers see their assistant as an "augmentation of what [the provider] can do," not a replacement, she said, and in a clinical setting, the assistant would most likely be integrated into the electronic medical record system that either a patient or provider could use to get more information. 
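
Murugan's point that the answer's register depends on who is asking could be expressed as a simple audience switch in the prompt. The instruction strings below are hypothetical, not the study's actual prompts.

```python
# Hypothetical audience-dependent instruction selection for the prompt.
PROMPT_STYLES = {
    "patient": "Explain in plain, non-technical language at a general reading level.",
    "provider": "Use clinical terminology and cite CPIC guideline recommendations.",
}

def instruction_for(audience: str) -> str:
    # Default to the more cautious patient-friendly style for unknown audiences.
    return PROMPT_STYLES.get(audience, PROMPT_STYLES["patient"])
```

In an EMR integration, the audience would presumably be known from the logged-in user's role rather than passed in by hand.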

The tool would also help improve accessibility and equity for patients, as they would be able to access health information in a way that's easy to understand and makes their test results clearer, she said. 

However, the assistant is not yet ready for clinical implementation. In the pilot study, which used synthetic data, the retrieval model that extracts information from the knowledgebase did not recognize some specific PGx terms and lacked training in the typical language used by providers and genetic counselors to explain results, Murugan said. There are also regulatory, safety, and legal concerns related to AI that still must be addressed. 

AI legislation is "very nascent," and it is not clear what the regulations for its use in healthcare will be, she said. The researchers didn't use real patient data due to concerns about adhering to the Health Insurance Portability and Accountability Act. In addition, because the assistant is based on GPT-4, which is not hosted on local servers, there are challenges involved in making sure the data is completely secure. The researchers plan to adhere to the White House's guidelines on artificial intelligence for healthcare, Murugan noted. 

Earlier this month, the US Food and Drug Administration released a paper laying out its position on regulating AI. The agency's four areas of focus are fostering collaboration to safeguard public health, advancing the development of regulatory approaches that support innovation, promoting the development of harmonized standards, guidelines, best practices, and tools, and supporting research related to the evaluation and monitoring of AI performance. 

One of the safeguards in place via the prompt engineering with Baylor's assistant is the emphasis within the responses to patients that this is not a diagnosis, but rather a way to provide additional information on their PGx test results, Murugan said. The responses also emphasize the need to discuss results with a physician or genetic counselor. The assistant is "not even close to being error-proof," and there is a need to have "a human in the loop" to help guide it, she added. 

Yuan also noted that because AI is still an early-stage technology, the understanding of potential safety concerns and implications is limited — which is why they began with a proof-of-concept pilot study, to better understand the broader implications of using AI this way. 

The next step is to test the assistant in a cohort of real patients with real data to evaluate its responses and to further "fine-tune" the model, Yuan said. The researchers also plan to do implementation studies to better understand how the assistant can be improved and to understand its clinical utility. 

Yuan noted that the researchers focused on just one kind of drug and a few specific genes for the pilot study, so the process of creating the knowledgebase and context was very targeted, specific, and manual. Because the knowledgebase for pharmacogenomics is "very complex and heterogeneous," it will take more work to figure out how to obtain, organize, and harmonize that information, he said. The researchers are also investigating ways to make knowledgebase creation more efficient and automated, he noted. 

Murugan added that the researchers are aiming to increase the number of genes and medications the assistant incorporates.