Google Gemma 2 Academic Award 2024
I received the Google Gemma 2 Academic Program GCP Credit Award!
I am very excited about this award! The $10,000 in GCP credits will help support my research on providing provable guarantees on the domain of content generated by LLMs.
Summary
There are a number of settings where one may want to restrict the output of large language models (LLMs) trained on web-scale data to a set of allowable topics. This could be for safety reasons or to prevent the misuse of freely available systems. There have been several public examples of such failures: Air Canada released an LLM-based chatbot that gave incorrect advice, and a General Motors chatbot recommended Tesla cars to potential customers looking to buy the ‘best electric car.’ Additionally, LLMs can be hijacked to respond to out-of-domain questions. These models are often fine-tuned to improve their performance and to discourage conversations outside the target domain. While fine-tuning can reduce unwanted responses, its effects are often superficial and can be easily reverted by relatively unsophisticated attacks. This project aims to provide provable guarantees that LLMs do not generate content outside their intended domain of specialization.