Wordsearch Solver
Public Types | Public Member Functions | Private Member Functions | Private Attributes | Friends | List of all members
compact_trie::CompactTrie Class Reference

Inline contiguous immutable trie. More...

#include <compact_trie.hpp>

Public Types

using Nodes = std::vector< Node >
 
using NodesIterator = std::vector< Node >::const_iterator
 
using Rows = std::vector< NodesIterator >
 
using RowsIterator = Rows::const_iterator
 
using const_iterator = std::tuple< NodesIterator, RowsIterator >
 

Public Member Functions

 CompactTrie (CompactTrie &&)=default
 
CompactTrieoperator= (CompactTrie &&)=default
 
 CompactTrie (const CompactTrie &)=delete
 
CompactTrieoperator= (const CompactTrie &)=delete
 
 CompactTrie (const std::initializer_list< std::string_view > &words)
 
 CompactTrie (const std::initializer_list< std::string > &words)
 
 CompactTrie (const std::initializer_list< const char * > &words)
 
template<class Iterator1 , class Iterator2 >
 CompactTrie (Iterator1 first, const Iterator2 last)
 
template<class Strings >
 CompactTrie (Strings &&strings_in)
 Actual constructor, all other delegate to this.
 
bool contains (std::string_view word) const
 Check if this dictionary contains word. More...
 
bool further (std::string_view word) const
 Check if this dictionary might contain words with word as a prefix. More...
 
template<class OutputIterator >
void contains_further (std::string_view stem, std::string_view suffixes, OutputIterator contains_further_it) const
 For each char in suffix appended to stem, check whether this dictionary contains this word and if it may contain longer words with this prefix. More...
 
std::size_t size () const
 
bool empty () const
 

Private Member Functions

bool contains (std::string_view word, ranges::subrange< NodesIterator > nodes, ranges::subrange< RowsIterator > rows) const
 
bool further (std::string_view word, ranges::subrange< NodesIterator > nodes, ranges::subrange< RowsIterator > rows) const
 
CompactTrie::const_iterator search (std::string_view word, ranges::subrange< CompactTrie::NodesIterator > nodes, ranges::subrange< CompactTrie::RowsIterator > rows) const
 Search as far as possible for word. More...
 

Private Attributes

Nodes nodes_
 
Rows rows_
 
std::size_t size_
 

Friends

std::ostream & operator<< (std::ostream &os, const CompactTrie &ct)
 

Detailed Description

Inline contiguous immutable trie.

Based on the trie::Trie, but now everything's inline, and the whole thing is in one big contiguous memory block.

Note
This (currently) only works for lowercase ascii, as internally a std::bitset<26> is used to represent child nodes. Also, libstdc++ 8 uses a minimum of one unsigned long, 8 bytes, to represent even this, when 4 bytes would do (potential optimisation rolling own bitset etc?).

A node consists of a bitset of size 26 (bits), with each bit representing an edge to a child node and a letter. A node also has a bool to indicate whether it's a word end, and an int to indicate how many nodes before this one on the same row existed, which is required for calculating the offset in the next row of the child nodes. We also keep a track of indexes into said vector of nodes that correspond to each row. A row corresponds to all the letters at that position in a word.

Lookup for a word of length "m" is similar to the trie, however now each lookup for every letter of a word, rather than a linear search through a small list of letters, is a lookup for a bit being on in a bitset. O(n) in the number of child nodes or letters from that stem, to O(1), a simple bitmask.

Then, assuming that letter is present, read off from the node the offset on the next row, where that letter's next node is found. Lookup for a word of length m is O(m), plus likely better cache locality (unless change trie to use a different allocator, which should be possible as can precompute size from input, but then you basically end up making this anyway).

Member Function Documentation

◆ contains()

bool compact_trie::CompactTrie::contains ( std::string_view  word) const

Check if this dictionary contains word.

Parameters
[in]wordThe word to check
Returns
true if word is present, else false

◆ contains_further()

template<class OutputIterator >
void compact_trie::CompactTrie::contains_further ( std::string_view  stem,
std::string_view  suffixes,
OutputIterator  contains_further_it 
) const

For each char in suffix appended to stem, check whether this dictionary contains this word and if it may contain longer words with this prefix.

Parameters
[in]stem
[in]suffixes
[out]contains_further_itThis function is what the solver algorithm calls every iteration to ask the dictionary solver implementation to do its work.

contains_further_it should be assigned to and incremented like an output iterator. The value written should be a std::pair<bool, bool>.

Example contains_further implementation:

const auto* node = this->search(stem);
if (!node) {
return;
}
for (const auto [i, c] : ranges::views::enumerate(suffixes)) {
const std::string_view suffix = {&c, 1};
const auto contains = detail::contains(*node, suffix);
const auto further = detail::further(*node, suffix);
*contains_further_it++ = {contains, further};
}
bool contains(std::string_view word) const
Check if this dictionary contains word.
Definition: compact_trie.cpp:56
CompactTrie::const_iterator search(std::string_view word, ranges::subrange< CompactTrie::NodesIterator > nodes, ranges::subrange< CompactTrie::RowsIterator > rows) const
Search as far as possible for word.
Definition: compact_trie.cpp:133
bool further(std::string_view word) const
Check if this dictionary might contain words with word as a prefix.
Definition: compact_trie.cpp:72

Each character's position in suffix corresponds to the order that that suffix's output should be written to contains_further_it.

suffixes is not guaranteed to be sorted.

◆ further()

bool compact_trie::CompactTrie::further ( std::string_view  word) const

Check if this dictionary might contain words with word as a prefix.

Parameters
[in]wordThe prefix to check
Returns
false if there are no more strings in the dictionary with word as a prefix, true if there might be.
Note
Some solvers do not conclusively know whether, for a prefix word, there are more words that contain that prefix. This is acceptable, though sub-optimal and will result in wasted search time, as long as eventually this returns false, assuming for word there are in fact no more words with that prefix.

◆ search()

CompactTrie::const_iterator compact_trie::CompactTrie::search ( std::string_view  word,
ranges::subrange< CompactTrie::NodesIterator >  nodes,
ranges::subrange< CompactTrie::RowsIterator >  rows 
) const
private

Search as far as possible for word.

Parameters
[in]wordWord to find
[in]nodesRange of nodes in which to search
[in]rowsRange of rows
Returns
An iterator into nodes and an iterator into rows.

Search as far as possible for word in nodes given rows.

Returns {it, rows_it}. If word is found, std::distance(rows.begin(), rows_it) == word.size() Else it will the deepest node/letter present in the dictionary, in word


The documentation for this class was generated from the following files: