Wordsearch Solver
|
Inline contiguous immutable trie. More...
#include <compact_trie.hpp>
Public Types | |
using | Nodes = std::vector< Node > |
using | NodesIterator = std::vector< Node >::const_iterator |
using | Rows = std::vector< NodesIterator > |
using | RowsIterator = Rows::const_iterator |
using | const_iterator = std::tuple< NodesIterator, RowsIterator > |
Public Member Functions | |
CompactTrie (CompactTrie &&)=default | |
CompactTrie & | operator= (CompactTrie &&)=default |
CompactTrie (const CompactTrie &)=delete | |
CompactTrie & | operator= (const CompactTrie &)=delete |
CompactTrie (const std::initializer_list< std::string_view > &words) | |
CompactTrie (const std::initializer_list< std::string > &words) | |
CompactTrie (const std::initializer_list< const char * > &words) | |
template<class Iterator1 , class Iterator2 > | |
CompactTrie (Iterator1 first, const Iterator2 last) | |
template<class Strings > | |
CompactTrie (Strings &&strings_in) | |
Actual constructor, all other delegate to this. | |
bool | contains (std::string_view word) const |
Check if this dictionary contains word . More... | |
bool | further (std::string_view word) const |
Check if this dictionary might contain words with word as a prefix. More... | |
template<class OutputIterator > | |
void | contains_further (std::string_view stem, std::string_view suffixes, OutputIterator contains_further_it) const |
For each char in suffix appended to stem, check whether this dictionary contains this word and if it may contain longer words with this prefix. More... | |
std::size_t | size () const |
bool | empty () const |
Private Member Functions | |
bool | contains (std::string_view word, ranges::subrange< NodesIterator > nodes, ranges::subrange< RowsIterator > rows) const |
bool | further (std::string_view word, ranges::subrange< NodesIterator > nodes, ranges::subrange< RowsIterator > rows) const |
CompactTrie::const_iterator | search (std::string_view word, ranges::subrange< CompactTrie::NodesIterator > nodes, ranges::subrange< CompactTrie::RowsIterator > rows) const |
Search as far as possible for word . More... | |
Private Attributes | |
Nodes | nodes_ |
Rows | rows_ |
std::size_t | size_ |
Friends | |
std::ostream & | operator<< (std::ostream &os, const CompactTrie &ct) |
Inline contiguous immutable trie.
Based on the trie::Trie, but now everything's inline, and the whole thing is in one big contiguous memory block.
std::bitset<26>
is used to represent child nodes. Also, libstdc++ 8 uses a minimum of one unsigned long
, 8 bytes, to represent even this, when 4 bytes would do (potential optimisation rolling own bitset etc?).A node consists of a bitset of size 26 (bits), with each bit representing an edge to a child node and a letter. A node also has a bool to indicate whether it's a word end, and an int to indicate how many nodes before this one on the same row existed, which is required for calculating the offset in the next row of the child nodes. We also keep a track of indexes into said vector of nodes that correspond to each row. A row corresponds to all the letters at that position in a word.
Lookup for a word of length "m" is similar to the trie, however now each lookup for every letter of a word, rather than a linear search through a small list of letters, is a lookup for a bit being on in a bitset. O(n) in the number of child nodes or letters from that stem, to O(1), a simple bitmask.
Then, assuming that letter is present, read off from the node the offset on the next row, where that letter's next node is found. Lookup for a word of length m is O(m), plus likely better cache locality (unless change trie to use a different allocator, which should be possible as can precompute size from input, but then you basically end up making this anyway).
bool compact_trie::CompactTrie::contains | ( | std::string_view | word | ) | const |
Check if this dictionary contains word
.
[in] | word | The word to check |
true
if word
is present, else false
void compact_trie::CompactTrie::contains_further | ( | std::string_view | stem, |
std::string_view | suffixes, | ||
OutputIterator | contains_further_it | ||
) | const |
For each char in suffix appended to stem, check whether this dictionary contains this word and if it may contain longer words with this prefix.
[in] | stem | |
[in] | suffixes | |
[out] | contains_further_it | This function is what the solver algorithm calls every iteration to ask the dictionary solver implementation to do its work. |
contains_further_it
should be assigned to and incremented like an output iterator. The value written should be a std::pair<bool, bool>.
Example contains_further implementation:
Each character's position in suffix
corresponds to the order that that suffix's
output should be written to contains_further_it
.
suffixes
is not guaranteed to be sorted.
bool compact_trie::CompactTrie::further | ( | std::string_view | word | ) | const |
Check if this dictionary might contain words with word
as a prefix.
[in] | word | The prefix to check |
false
if there are no more strings in the dictionary with word
as a prefix, true
if there might be.word
, there are more words that contain that prefix. This is acceptable, though sub-optimal and will result in wasted search time, as long as eventually this returns false, assuming for word
there are in fact no more words with that prefix.
|
private |
Search as far as possible for word
.
[in] | word | Word to find |
[in] | nodes | Range of nodes in which to search |
[in] | rows | Range of rows |
nodes
and an iterator into rows
.Search as far as possible for word
in nodes
given rows
.
Returns {it, rows_it}
. If word
is found, std::distance(rows.begin(), rows_it) == word.size()
Else it will the deepest node/letter present in the dictionary, in word