The Bamboogle Dataset

Bamboogle is a dataset we constructed that consists only of questions that Google answers incorrectly. The leaderboard for it is here.

In our Compositionality Gap paper, we show that language models also struggle with these questions, and that our self-ask prompting method substantially improves their ability to answer them (more so than Chain-of-Thought prompting).
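
To give a rough feel for what a self-ask style prompt looks like, here is a minimal Python sketch. It assumes the prompt has the model ask itself explicit follow-up questions and answer them before committing to a final answer. The demonstration text and the helper names (`build_self_ask_prompt`, `extract_final_answer`) are illustrative assumptions, not the code released with the paper, and no particular language model API is assumed.

```python
from typing import Optional

# Illustrative sketch only: the demonstration below and the helper names are
# assumptions made for this example, not the paper's released prompt or code.
DEMO = (
    "Question: Who was president of the U.S. when superconductivity was discovered?\n"
    "Are follow up questions needed here: Yes.\n"
    "Follow up: When was superconductivity discovered?\n"
    "Intermediate answer: Superconductivity was discovered in 1911.\n"
    "Follow up: Who was president of the U.S. in 1911?\n"
    "Intermediate answer: William Howard Taft.\n"
    "So the final answer is: William Howard Taft\n"
)

# Marker that is assumed to introduce the model's final answer.
FINAL_MARKER = "So the final answer is:"


def build_self_ask_prompt(question: str, demo: str = DEMO) -> str:
    """Prepend a worked demonstration and leave the scaffold open for the model."""
    return f"{demo}\nQuestion: {question}\nAre follow up questions needed here:"


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text after the last final-answer marker, or None if absent."""
    for line in reversed(completion.splitlines()):
        if FINAL_MARKER in line:
            return line.split(FINAL_MARKER, 1)[1].strip()
    return None
```

The string returned by `build_self_ask_prompt` would be sent to whatever language model you are evaluating, and `extract_final_answer` would be applied to its completion. Scoring that extracted answer against a Bamboogle reference answer is left out here, since it depends on the evaluation setup.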

For more details, check out the video above.

Bamboogle was introduced in our Compositionality Gap paper, which can be found here, and the dataset itself is here.

Written on October 18, 2022