-
Notifications
You must be signed in to change notification settings - Fork 4
Open
Labels
benchmarksFeatures or improvements over our benchmark processesFeatures or improvements over our benchmark processesperformanceFeatures or improvements that positively affect seq-db performanceFeatures or improvements that positively affect seq-db performance
Description
When we process query like foo:'*bar*'
we iterate over all tokens that are stored in fractions and perform substring search using KMP algorithm.
We saw that for some queries findSequence
takes a lot of CPU time.
It seems like for small haystack and needles KMP are wasteful (since the majority of tokens is pretty small but still we need to build LPS - it adds additional resource consumption).
And we think substring search can be improved with bytes.Index
function from Golang standard library because:
- for small haystacks and needles it uses vector instructions (SSE4.2 extension and even AVX2 for some cases);
- it falls back to Rabin-Karp algorithm when dealing with large haystacks and needles (so it's anyway better than stupid bruteforce);
I've already made a PoC and I am ready to provide some results:
pattern · main± ⟩ benchstat kmp.txt bytes-index.txt
goos: linux
goarch: amd64
pkg: github.com/ozontech/seq-db/pattern
cpu: 12th Gen Intel(R) Core(TM) i5-12600K
│ kmp.txt │ bytes-index.txt │
│ sec/op │ sec/op vs base │
FindSequence/tiny-16 61.03n ± 13% 13.30n ± 18% -78.21% (p=0.002 n=6)
FindSequence/small-16 177.40n ± 31% 18.43n ± 55% -89.61% (p=0.002 n=6)
FindSequence/medium-16 626.15n ± 25% 45.43n ± 59% -92.75% (p=0.002 n=6)
FindSequence/large-16 10588.0n ± 5% 542.5n ± 7% -94.88% (p=0.002 n=6)
FindSequence/extra-large-16 633.90µ ± 2% 58.57µ ± 4% -90.76% (p=0.002 n=6)
geomean 2.146µ 204.1n -90.49%
│ kmp.txt │ bytes-index.txt │
│ B/s │ B/s vs base │
FindSequence/tiny-16 1000.4Mi ± 15% 4588.1Mi ± 22% +358.63% (p=0.002 n=6)
FindSequence/small-16 1.344Gi ± 45% 12.938Gi ± 36% +862.37% (p=0.002 n=6)
FindSequence/medium-16 1.524Gi ± 20% 21.430Gi ± 92% +1305.72% (p=0.002 n=6)
FindSequence/large-16 1.441Gi ± 6% 28.125Gi ± 8% +1851.48% (p=0.002 n=6)
FindSequence/extra-large-16 1.541Gi ± 2% 16.675Gi ± 4% +982.38% (p=0.002 n=6)
geomean 1.348Gi 14.23Gi +955.57%
Collected CPU profiles you can check out here:
- KMP - https://tinyurl.com/2eh4ebsx;
bytes.Index
- https://tinyurl.com/25nrwds6;
Warning
And of course we are targeting performance improvements only for x86.
Metadata
Metadata
Assignees
Labels
benchmarksFeatures or improvements over our benchmark processesFeatures or improvements over our benchmark processesperformanceFeatures or improvements that positively affect seq-db performanceFeatures or improvements that positively affect seq-db performance