Prompt2Model automates creation of custom language models for non-experts


Prompt2Model automates the generation of special-purpose NLP models that, in some cases, can outperform GPT-3.5 Turbo while being up to 700 times smaller.

Researchers at Carnegie Mellon University and Tsinghua University have developed a new system called Prompt2Model that can generate custom language models from prompts. The system aims to make the development of specialist AI models accessible to non-experts. Prompt2Model is not meant to be a GPT-4 alternative, but rather an automated pipeline for special-purpose NLP models that perform a particular task very well, are much smaller than large models, and can therefore run locally on weaker hardware.

The system first decomposes the prompt into a structured statement. It then searches for existing datasets that might be useful for the task and uses OpenAI’s GPT-3.5 Turbo to generate additional synthetic training data tailored to it. Finally, it identifies a suitable pre-trained model on Hugging Face and fine-tunes it on the collected data.

After training, Prompt2Model can create a web interface to interact with the model. The modular design allows customization of each pipeline component.
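To make the flow of the pipeline concrete, here is a minimal Python sketch of how such a modular system could be wired together. It is not the actual Prompt2Model code or API: every name in it (TaskSpec, parse_prompt, Pipeline, the stub functions) is hypothetical, the "input -> output" format for examples inside the prompt is an assumption, and the stubs only stand in for the real dataset search, GPT-3.5 Turbo data generation, Hugging Face model search, and fine-tuning steps.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input text, expected output text)


@dataclass
class TaskSpec:
    """Hypothetical structured form of a prompt."""
    instruction: str
    demonstrations: List[Example] = field(default_factory=list)


def parse_prompt(prompt: str) -> TaskSpec:
    """Split a prompt into an instruction and demonstration pairs.

    Assumption: demonstrations are written as 'input -> output' lines.
    """
    instruction_lines: List[str] = []
    demonstrations: List[Example] = []
    for raw_line in prompt.strip().splitlines():
        line = raw_line.strip()
        if "->" in line:
            source, target = line.split("->", 1)
            demonstrations.append((source.strip(), target.strip()))
        elif line:
            instruction_lines.append(line)
    return TaskSpec(" ".join(instruction_lines), demonstrations)


@dataclass
class Pipeline:
    """Each stage is a plain callable, so any one can be swapped out."""
    retrieve_datasets: Callable[[TaskSpec], List[Example]]
    generate_examples: Callable[[TaskSpec], List[Example]]
    retrieve_model: Callable[[TaskSpec], str]
    train: Callable[[str, List[Example]], str]

    def run(self, prompt: str) -> str:
        spec = parse_prompt(prompt)
        # Combine retrieved data with synthetic data before training.
        data = self.retrieve_datasets(spec) + self.generate_examples(spec)
        base_model = self.retrieve_model(spec)
        return self.train(base_model, data)


# Stub stages (placeholders only, no network or training involved).
def stub_retrieve_datasets(spec: TaskSpec) -> List[Example]:
    return list(spec.demonstrations)


def stub_generate_examples(spec: TaskSpec) -> List[Example]:
    return [(f"synthetic input {i}", f"synthetic output {i}") for i in range(2)]


def stub_retrieve_model(spec: TaskSpec) -> str:
    return "placeholder-base-model"


def stub_train(base_model: str, data: List[Example]) -> str:
    return f"{base_model} fine-tuned on {len(data)} examples"


pipeline = Pipeline(
    retrieve_datasets=stub_retrieve_datasets,
    generate_examples=stub_generate_examples,
    retrieve_model=stub_retrieve_model,
    train=stub_train,
)

print(pipeline.run("Translate English to French.\nhello -> bonjour"))
```

Because each stage is passed in as a function, replacing one of them (for example, swapping the data generator for one backed by an open-source model) would not require touching the rest of the pipeline, which is the kind of customization the modular design is meant to allow.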



The team evaluated Prompt2Model on three benchmarks. In two tasks (SQuAD, Temporal), the resulting Flan-T5 models outperformed GPT-3.5 Turbo, even though the Google model has almost 700 times fewer parameters. In the third benchmark (MCoNaLa), Prompt2Model was clearly behind the OpenAI model.

According to the team, Prompt2Model has difficulty supporting tasks that require languages other than English. The team cited GPT-3.5 Turbo’s limited language support as the reason.

The reliance on the OpenAI model for data generation is probably also Prompt2Model’s biggest limitation: OpenAI prohibits the use of its models to train models that might compete with it, which makes Prompt2Model unusable for commercial applications. However, the team is exploring the integration of large open-source language models to remove the dependence on proprietary APIs.

More information and the code are available on GitHub.
