# Wordsearch Solver
Inspired by the struggle for high scores in SpellTower, this program solves wordsearches.
An example program is in example/main.cpp.
NOTE: you must cd to build/test and run ./the_test from that directory!
NOTE: requires the GNU gold linker to build with GCC, and lld (the LLVM linker) to build with Clang.
Built and tested with GCC 8, Clang 10, and Clang 12.
Uses:
NOTE: this won't work on Windows because gperftools only supports Linux. However, gperftools is only used for profiling; without it, the project should work, though it may need tweaking (untested).
Build with the normal Conan + CMake process; essentially what's in make.sh.
JFrog Artifactory hosts Conan packages for free. Add my repo as a Conan remote (call it whatever you wish), then consume the package via Conan as normal: add it to the requires section of your conanfile.txt or conanfile.py, and build your project with Conan as usual.
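As a rough sketch of that flow (the remote name, repo URL, and package reference below are placeholders, not the real values):

```text
# register the remote once (substitute the actual repo URL and a name of your choice)
conan remote add my-remote <repo-url>

# conanfile.txt -- add the package under [requires]
[requires]
<package>/<version>
```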
A sorted vector of strings. Uses binary search, so every operation is O(log(n)), where n is the number of strings.
Similar to the sorted vector, but backed by std::set, i.e. a red-black tree. Performance is slightly worse than the vector, I suspect due to poorer cache/memory spatial locality. Operations are also O(log(n)) in the number of words in the dictionary.
A recursive tree of nodes, where each node holds a vector-like container of edges, and each edge consists of a character and a pointer to the corresponding child node. To look up a word of length m, using an alphabet with d distinct characters (for example d == 26 for lowercase ASCII and the English alphabet), lookup is O(m * d). Realistically, the factor of d will usually be much smaller than its worst-case value, so lookup behaves more like O(m). Furthermore, since (in English at least) the average word length is much shorter than the maximum anyway, lookup is effectively near constant time.
Similar to the trie, but now everything is inline, and the whole thing lives in one big contiguous memory block. NOTE: this (currently) only works for lowercase ASCII. A node consists of a bitset of 26 bits, with each bit representing an edge to a child node for one letter; a bool indicating whether the node ends a word; and an int recording how many nodes precede it on the same row, which is required for calculating the offset of its child nodes in the next row. We also keep track of indexes into the node vector marking where each row starts; a row holds all the nodes at the same letter position within a word. Lookup for a word of length m is similar to the trie, except that each letter lookup is a bit test in a bitset rather than a linear search through a small list of letters. If that letter's bit is set, the node's stored count gives the offset in the next row where that letter's child node is found. Lookup is therefore O(m), with likely better cache locality (unless the trie is changed to use a pool allocator, which should be quite possible since the size can be precomputed from the input).
In the compact_trie, every node is a fixed size (currently 16 bytes on my system, x64 with libstdc++ 8). There must be at least as many nodes as words: roughly ~115k in the (slightly altered) GNU English dictionary used here. There may be more, since a node may be needed to link to child nodes without itself being a word.
For example, the prefix "abj" is not a word, but needs a node in the compact_trie so it can link to words that have "abj" as a prefix, such as "abject".
The idea with compact_trie2 was to optimise this further by allowing two types of node: in this trie, there are empty nodes and full ones.
The hope was that the space saved would offset the additional lookup complexity of this data structure. A complication is that nodes now vary in size, so offset calculation is more complex.
As with compact_trie, the idea was to keep everything inline in one data structure. Unfortunately, performance tends to be only slightly better than the (pointer) trie, and worse than compact_trie, since more work is needed at each node to calculate the offset to the next one.
Google Benchmark is used to time solving a wordsearch.
The benchmark I use is in bench.sh: it uses the ~115k-line dictionary file and a 100x100 wordsearch, and measures the time to solve it.
This benchmark was run using Clang 12, libstdc++ 8, and an SSD (Crucial MX500), with LTO enabled.
Benchmark | Time | CPU | Iterations |
---|---|---|---|
bench_long_words/trie::Trie | 109 ms | 109 ms | 6 |
bench_long_words/compact_trie::CompactTrie | 74.3 ms | 74.3 ms | 9 |
bench_long_words/compact_trie2::CompactTrie2 | 92.1 ms | 92.1 ms | 8 |
bench_long_words/dictionary_std_vector::DictionaryStdVector | 330 ms | 330 ms | 2 |
bench_long_words/dictionary_std_set::DictionaryStdSet | 601 ms | 601 ms | 1 |
Some points to take away:
When run with GCC 8 instead of Clang 12, the compact tries show much worse performance.
Benchmark | Time | CPU | Iterations |
---|---|---|---|
bench_long_words/trie::Trie | 109 ms | 109 ms | 6 |
bench_long_words/compact_trie::CompactTrie | 127 ms | 127 ms | 6 |
bench_long_words/compact_trie2::CompactTrie2 | 103 ms | 103 ms | 7 |
bench_long_words/dictionary_std_vector::DictionaryStdVector | 360 ms | 360 ms | 2 |
bench_long_words/dictionary_std_set::DictionaryStdSet | 689 ms | 689 ms | 1 |
A command-line app used for profiling the time taken to solve a wordsearch with a particular solver and dictionary. Uses my slightly modified gperftools profiler.
A simple wordsearch GUI built on ImGui.
Contains the algorithm that actually solves a wordsearch, and exposes the types clients should use to consume this library.