An open-source data generation framework for batch construction of verifiable, controllable, and diverse puzzles.
Important Links: API Docs, Tutorials, Benchmark, Evaluation Toolkit
PuzzleClone is a data synthesis framework and comprehensive dataset for logical reasoning problems. It features:
- ✅ Guaranteed Verifiability: Every problem is generated with a ground-truth solution and is verifiable by a symbolic SMT solver, ensuring correctness.
- 🎯 Granular Control: Offers fine-grained control over problem attributes like scale, structure, and difficulty through a set of adjustable parameters, enabling large-scale batch generation.
- ✨ Flexible Adaptation: Makes it easy to customize problem scenarios and to translate problems into different languages or domains.
- 📊 Expansive and Diverse Coverage: Based on PuzzleClone, we have curated a benchmark including 83,657 unique logical reasoning puzzles procedurally generated from 86 seed questions. The dataset spans:
  - Various applications of Satisfiability Modulo Theories (SMT) and SMT-like puzzles.
  - Classic logical puzzles like Sudoku, the Knapsack problem, and linear optimization (LP).
  - Diverse mathematical problems of varying difficulties.
- 🚀 State-of-the-Art Performance: Achieves SOTA results among open-source datasets, outperforming the public dataset by 12.5 points on AMC2023 (from 52.5 to 65.0).
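To make the ideas of verifiability and parameter control concrete, here is a minimal sketch of a toy generator. It is not PuzzleClone's implementation: the function names (`generate_puzzle`, `solutions`, `verify`) and the puzzle type are hypothetical, and exhaustive search stands in for the SMT solver that the framework actually uses. The `n_items` parameter plays the role of a scale knob, and a puzzle is accepted only when its ground-truth answer is the unique solution.

```python
import itertools
import random


def generate_puzzle(n_items, seed):
    """Generate a toy ordering puzzle with a known ground-truth answer.

    Hypothetical example: n_items controls the scale of the puzzle.
    """
    rng = random.Random(seed)
    names = [f"P{i}" for i in range(n_items)]
    truth = names[:]
    rng.shuffle(truth)
    # Each adjacent pair in the hidden order becomes an "a comes before b" clue.
    # A full chain of such clues pins down exactly one ordering.
    constraints = [(truth[i], truth[i + 1]) for i in range(n_items - 1)]
    rng.shuffle(constraints)
    return {"names": names, "constraints": constraints, "answer": truth}


def solutions(puzzle):
    """Enumerate every ordering that satisfies all clues (brute force)."""
    found = []
    for perm in itertools.permutations(puzzle["names"]):
        pos = {name: i for i, name in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in puzzle["constraints"]):
            found.append(list(perm))
    return found


def verify(puzzle):
    """Accept the puzzle only if the ground truth is the unique solution."""
    return solutions(puzzle) == [puzzle["answer"]]


if __name__ == "__main__":
    for n in (3, 4, 5):
        puzzle = generate_puzzle(n_items=n, seed=n)
        print(n, verify(puzzle))
```

Brute force is only practical for very small instances; a symbolic solver performs the same kind of check at realistic scale.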
git clone https://github.com/puzzleclone/PuzzleClone.git
cd PuzzleClone
pip install -r requirements.txt
Here are a few common use cases.
This runs the translator in test mode (-t), generating a sample question based on the specification file.
python translator.py -t path/to/spec.yaml
This runs the translator in production mode (-d) to generate a large number of problems and saves them to a specified output file (-o).
python translator.py -d path/to/spec.yaml -o data.jsonl
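The output file can then be inspected from Python. The sketch below assumes only what the `.jsonl` extension suggests, namely one JSON object per line; the helper name `load_jsonl` is hypothetical, and the fields inside each record are not documented here, so the code makes no assumptions about them.

```python
import json


def load_jsonl(path):
    """Read a JSON Lines file into a list, one parsed object per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    problems = load_jsonl("data.jsonl")
    print(f"Loaded {len(problems)} problems")
    if problems:
        # Print the first record as-is to discover which fields it contains.
        print(json.dumps(problems[0], indent=2, ensure_ascii=False))
```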
This uses the -g flag to load existing problem data and applies a new problem description or template (new_spec.yaml) to it.
python translator.py -d path/to/new_spec.yaml -g old_data.jsonl -o new_data.jsonl
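Conceptually, this separates a problem's underlying values from the text that presents them. The sketch below illustrates that idea only; it does not reflect how translator.py works internally. The `vars` field, the `rerender` helper, and the `$name` placeholder syntax (from Python's standard `string.Template`) are all assumptions made for the example.

```python
from string import Template


def rerender(record, template):
    """Fill a new text template with the variables stored in an existing record.

    Hypothetical: assumes each record keeps its values under a "vars" key.
    """
    return Template(template).substitute(record["vars"])


if __name__ == "__main__":
    old_record = {"vars": {"n": 3, "item": "boxes"}}
    # The same values can be presented in a new scenario or language.
    print(rerender(old_record, "There are $n $item on the shelf."))
    print(rerender(old_record, "Il y a $n $item sur l'étagère."))
```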
Several scripts are provided to convert the data generated above into standard benchmark formats. Please refer to the scripts and documentation in the data_processing_scripts/ directory.
See our evaluation toolkit PolyhedronEvaluator.
This project is licensed under the Apache 2.0 License. See the LICENSE file for details.