## Software Quality

- [ ] Refactor Code #9
- [ ] Simple way to add new model

## Implementation

- [x] Support FlashAttention
- [x] Support Sampling
- [ ] Support Batch>1
- [ ] Lookahead window KV-Cache (May hurt accuracy)
- [ ] Verification branch trie

## New Models

- [ ] Baichuan #11
- [ ] QWen #22