algorithm - tipos - que es un arreglo en c++

Cómo encontrar una lista de palabras posibles de una matriz de letras (30)

Últimamente he estado jugando un juego en mi iPhone llamado Scramble. Algunos de ustedes pueden conocer este juego como Boggle. Esencialmente, cuando el juego comienza, obtienes una matriz de letras así:

F X I E A M L O E W B X A S T U

El objetivo del juego es encontrar tantas palabras como puedas que puedan formarse encadenando letras. Puedes comenzar con cualquier letra, y todas las letras que lo rodean son un juego justo, y luego, una vez que pasas a la siguiente letra, todas las letras que rodean a esa letra son juegos justos, excepto las letras utilizadas anteriormente . Así que en la cuadrícula de arriba, por ejemplo, podría encontrar las palabras LOB , TUX , SEA , FAME , etc. Las palabras deben tener al menos 3 caracteres, y no más de NxN, que serían 16 en este juego pero pueden Variar en algunas implementaciones. Si bien este juego es divertido y adictivo, aparentemente no soy muy bueno en eso y quería hacer un poco de trampa haciendo un programa que me diera las mejores palabras posibles (cuanto más larga sea la palabra, más puntos obtendrás).

Ejemplo de Boggle http://www.boggled.org/sample.gif

Lamentablemente, no soy muy bueno con los algoritmos o sus eficiencias, etc. Mi primer intento usa un diccionario como este (~ 2.3MB) y hace una búsqueda lineal tratando de combinar combinaciones con las entradas del diccionario. Esto toma mucho tiempo para encontrar las palabras posibles, y como solo obtiene 2 minutos por ronda, simplemente no es adecuado.

Estoy interesado en ver si algún Stackoverflowers puede ofrecer soluciones más eficientes. Principalmente estoy buscando soluciones que usen Big 3 Ps: Python, PHP y Perl, aunque cualquier cosa con Java o C ++ también es genial, ya que la velocidad es esencial.

SOLUCIONES ACTUALES :

Adam Rosenfield, Python, ~ 20s
John Fouhy, Python, ~ 3s
Kent Fredric, Perl, ~ 1s
Darius Bacon, Python, ~ 1s
rvarcher, VB.NET (enlace directo) , ~ 1s
Paolo Bergantino, PHP (enlace directo) , ~ 5s (~ 2s localmente)

BOUNTY :

Estoy agregando una recompensa a esta pregunta como forma de agradecer a todas las personas que colaboraron con sus programas. Desafortunadamente, solo puedo dar la respuesta aceptada a uno de ustedes, así que mediré quién tiene el solucionador de boggle más rápido dentro de 7 días y otorgaré la recompensa al ganador.

Recompensa otorgada. Gracias a todos los que participaron.

La solución más rápida que obtendrá probablemente implicará almacenar su diccionario en un momento. Luego, cree una cola de tripletes ( x , y , s ), donde cada elemento de la cola corresponda a un prefijo s de una palabra que se puede escribir en la cuadrícula, que termina en la ubicación ( x , y ). Inicialice la cola con N x N elementos (donde N es el tamaño de su cuadrícula), un elemento para cada cuadrado de la cuadrícula. Entonces, el algoritmo procede de la siguiente manera:

While the queue is not empty: Dequeue a triple (x, y, s) For each square (x'', y'') with letter c adjacent to (x, y): If s+c is a word, output s+c If s+c is a prefix of a word, insert (x'', y'', s+c) into the queue

Si almacena su diccionario en una lista, puede hacer una prueba de si s + c es una palabra o el prefijo de una palabra en un tiempo constante (siempre que también mantenga algunos metadatos adicionales en cada dato de la cola, como un puntero al nodo actual). en el trie), por lo que el tiempo de ejecución de este algoritmo es O (número de palabras que se pueden escribir).

[Editar] Aquí hay una implementación en Python que acabo de codificar:

#!/usr/bin/python class TrieNode: def __init__(self, parent, value): self.parent = parent self.children = [None] * 26 self.isWord = False if parent is not None: parent.children[ord(value) - 97] = self def MakeTrie(dictfile): dict = open(dictfile) root = TrieNode(None, '''') for word in dict: curNode = root for letter in word.lower(): if 97 <= ord(letter) < 123: nextNode = curNode.children[ord(letter) - 97] if nextNode is None: nextNode = TrieNode(curNode, letter) curNode = nextNode curNode.isWord = True return root def BoggleWords(grid, dict): rows = len(grid) cols = len(grid[0]) queue = [] words = [] for y in range(cols): for x in range(rows): c = grid[y][x] node = dict.children[ord(c) - 97] if node is not None: queue.append((x, y, c, node)) while queue: x, y, s, node = queue[0] del queue[0] for dx, dy in ((1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)): x2, y2 = x + dx, y + dy if 0 <= x2 < cols and 0 <= y2 < rows: s2 = s + grid[y2][x2] node2 = node.children[ord(grid[y2][x2]) - 97] if node2 is not None: if node2.isWord: words.append(s2) queue.append((x2, y2, s2, node2)) return words

Ejemplo de uso:

d = MakeTrie(''/usr/share/dict/words'') print(BoggleWords([''fxie'',''amlo'',''ewbx'',''astu''], d))

Salida:

[''fa'', ''xi'', ''ie'', ''io'', ''el'', ''am'', ''ax'', ''ae'', ''aw'', ''mi'', ''ma'', ''me'', '' lo '','' li '','' oe '','' ox '','' em '','' ea '','' ea '','' es '','' wa '','' we '','' wa '','' bo '','' bu '' , ''como'', ''aw'', ''ae'', ''st'', ''se'', ''sa'', ''tu'', ''ut'', ''fam'', ''fae'', ''imi'', ''eli'', '' elm '','' elb '','' ami '','' ama '','' ame '','' aes '','' awl '','' awa '','' awe '','' awa '','' mix '','' mim '','' mil '' , ''mam'', ''max'', ''mae'', ''maw'', ''mew'', ''mem'', ''mes'', ''lob'', ''lox'', ''lei'', ''leo'', ''lie'', '' lim '','' oil '','' olm '','' ewe '','' eme '','' wax '','' waf '','' wae '','' waw '','' wem '','' wea '','' wea '','' was '' , ''waw'', ''wae'', ''bob'', ''blo'', ''bub'', ''but'', ''ast'', ''ase'', ''asa'', ''awl'', ''awa'', ''maravilla'', '' awa '','' aes '','' swa '','' swa '','' coser '','' mar '','' mar '','' saw '','' tux '','' tub '','' tut '','' twa '','' twa '' , ''tst'', ''utu'', ''fama'', ''fame'', ''ixil'', ''imam'', ''amli'', ''amil'', ''ambo'', ''axil'', ''axle'', ''mimi'', '' mima '','' mime '','' milo '','' milla '','' mewl '','' mese '','' mesa '','' lolo '','' lobo '','' lima '','' lima '','' limb '','' lile '' , ''oime'', ''oleo'', ''olio'', ''oboe'', ''obol'', ''emim'', ''emil'', ''east'', ''ease'', ''wame'', ''wawa'', ''wawa'', '' weam '','' west '','' wese '','' wast '','' wase '' , ''wawa'', ''wawa'', ''hervir'', ''bolo'', ''bole'', ''bobo'', ''blob'', ''bleo'', ''bubo'', ''asem'', ''stub'', ''stut'', '' swam '','' semi '','' seme '','' seam '','' seax '','' sasa '','' sawt '','' tutu '','' tuts '','' twae '','' twas '','' twae '','' ilima '' , ''amble'', ''axile'', ''awest'', ''mamie'', ''mambo'', ''maxim'', ''mease'', ''mesem'', ''limax'', ''limes'', ''limbo'', ''limbu'', '' obole '','' emesa '','' embox '','' awest '','' swami '','' famble '','' mimble '','' maxima '','' embolo '','' embole '','' wamble '','' semese '','' semble '' , ''sawbwa'', ''sawbwa'']

Notas: este programa no genera palabras de 1 letra, o filtra por longitud de palabra en absoluto. Es fácil de agregar pero no es realmente relevante para el problema. También genera algunas palabras varias veces si se pueden deletrear de varias maneras. Si una palabra dada se puede deletrear de muchas maneras diferentes (en el peor de los casos: todas las letras de la cuadrícula son las mismas (por ejemplo, ''A'') y una palabra como ''aaaaaaaaaa'' está en su diccionario), entonces el tiempo de ejecución será horriblemente exponencial . Filtrar los duplicados y clasificarlos es trivial después de que el algoritmo haya terminado.

Mi intento en Java. Se necesitan aproximadamente 2 s para leer el archivo y crear el trie, y alrededor de 50 ms para resolver el rompecabezas. Utilicé el diccionario vinculado en la pregunta (tiene algunas palabras que no sabía que existían en inglés como fae, ima)

0 [main] INFO gineer.bogglesolver.util.Util - Reading the dictionary 2234 [main] INFO gineer.bogglesolver.util.Util - Finish reading the dictionary 2234 [main] INFO gineer.bogglesolver.Solver - Found: FAM 2234 [main] INFO gineer.bogglesolver.Solver - Found: FAME 2234 [main] INFO gineer.bogglesolver.Solver - Found: FAMBLE 2234 [main] INFO gineer.bogglesolver.Solver - Found: FAE 2234 [main] INFO gineer.bogglesolver.Solver - Found: IMA 2234 [main] INFO gineer.bogglesolver.Solver - Found: ELI 2234 [main] INFO gineer.bogglesolver.Solver - Found: ELM 2234 [main] INFO gineer.bogglesolver.Solver - Found: ELB 2234 [main] INFO gineer.bogglesolver.Solver - Found: AXIL 2234 [main] INFO gineer.bogglesolver.Solver - Found: AXILE 2234 [main] INFO gineer.bogglesolver.Solver - Found: AXLE 2234 [main] INFO gineer.bogglesolver.Solver - Found: AMI 2234 [main] INFO gineer.bogglesolver.Solver - Found: AMIL 2234 [main] INFO gineer.bogglesolver.Solver - Found: AMLI 2234 [main] INFO gineer.bogglesolver.Solver - Found: AME 2234 [main] INFO gineer.bogglesolver.Solver - Found: AMBLE 2234 [main] INFO gineer.bogglesolver.Solver - Found: AMBO 2250 [main] INFO gineer.bogglesolver.Solver - Found: AES 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWL 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWE 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWEST 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: MIX 2250 [main] INFO gineer.bogglesolver.Solver - Found: MIL 2250 [main] INFO gineer.bogglesolver.Solver - Found: MILE 2250 [main] INFO gineer.bogglesolver.Solver - Found: MILO 2250 [main] INFO gineer.bogglesolver.Solver - Found: MAX 2250 [main] INFO gineer.bogglesolver.Solver - Found: MAE 2250 [main] INFO gineer.bogglesolver.Solver - Found: MAW 2250 [main] INFO gineer.bogglesolver.Solver - Found: MEW 2250 [main] INFO gineer.bogglesolver.Solver - Found: MEWL 2250 [main] INFO gineer.bogglesolver.Solver - Found: MES 2250 [main] INFO gineer.bogglesolver.Solver - Found: MESA 2250 [main] INFO gineer.bogglesolver.Solver - Found: MWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: MWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIE 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIM 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMA 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMAX 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIME 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMES 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMB 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMBO 2250 [main] INFO gineer.bogglesolver.Solver - Found: LIMBU 2250 [main] INFO gineer.bogglesolver.Solver - Found: LEI 2250 [main] INFO gineer.bogglesolver.Solver - Found: LEO 2250 [main] INFO gineer.bogglesolver.Solver - Found: LOB 2250 [main] INFO gineer.bogglesolver.Solver - Found: LOX 2250 [main] INFO gineer.bogglesolver.Solver - Found: OIME 2250 [main] INFO gineer.bogglesolver.Solver - Found: OIL 2250 [main] INFO gineer.bogglesolver.Solver - Found: OLE 2250 [main] INFO gineer.bogglesolver.Solver - Found: OLM 2250 [main] INFO gineer.bogglesolver.Solver - Found: EMIL 2250 [main] INFO gineer.bogglesolver.Solver - Found: EMBOLE 2250 [main] INFO gineer.bogglesolver.Solver - Found: EMBOX 2250 [main] INFO gineer.bogglesolver.Solver - Found: EAST 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAF 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAX 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAME 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAMBLE 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAE 2250 [main] INFO gineer.bogglesolver.Solver - Found: WEA 2250 [main] INFO gineer.bogglesolver.Solver - Found: WEAM 2250 [main] INFO gineer.bogglesolver.Solver - Found: WEM 2250 [main] INFO gineer.bogglesolver.Solver - Found: WEA 2250 [main] INFO gineer.bogglesolver.Solver - Found: WES 2250 [main] INFO gineer.bogglesolver.Solver - Found: WEST 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAE 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAS 2250 [main] INFO gineer.bogglesolver.Solver - Found: WASE 2250 [main] INFO gineer.bogglesolver.Solver - Found: WAST 2250 [main] INFO gineer.bogglesolver.Solver - Found: BLEO 2250 [main] INFO gineer.bogglesolver.Solver - Found: BLO 2250 [main] INFO gineer.bogglesolver.Solver - Found: BOIL 2250 [main] INFO gineer.bogglesolver.Solver - Found: BOLE 2250 [main] INFO gineer.bogglesolver.Solver - Found: BUT 2250 [main] INFO gineer.bogglesolver.Solver - Found: AES 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWL 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWE 2250 [main] INFO gineer.bogglesolver.Solver - Found: AWEST 2250 [main] INFO gineer.bogglesolver.Solver - Found: ASE 2250 [main] INFO gineer.bogglesolver.Solver - Found: ASEM 2250 [main] INFO gineer.bogglesolver.Solver - Found: AST 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEA 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEAX 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEAM 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEMI 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEMBLE 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEW 2250 [main] INFO gineer.bogglesolver.Solver - Found: SEA 2250 [main] INFO gineer.bogglesolver.Solver - Found: SWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: SWAM 2250 [main] INFO gineer.bogglesolver.Solver - Found: SWAMI 2250 [main] INFO gineer.bogglesolver.Solver - Found: SWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: SAW 2250 [main] INFO gineer.bogglesolver.Solver - Found: SAWT 2250 [main] INFO gineer.bogglesolver.Solver - Found: STU 2250 [main] INFO gineer.bogglesolver.Solver - Found: STUB 2250 [main] INFO gineer.bogglesolver.Solver - Found: TWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: TWAE 2250 [main] INFO gineer.bogglesolver.Solver - Found: TWA 2250 [main] INFO gineer.bogglesolver.Solver - Found: TWAE 2250 [main] INFO gineer.bogglesolver.Solver - Found: TWAS 2250 [main] INFO gineer.bogglesolver.Solver - Found: TUB 2250 [main] INFO gineer.bogglesolver.Solver - Found: TUX

El código fuente consta de 6 clases. Los publicaré a continuación (si esta no es la práctica correcta en , por favor, dígame).

gineer.bogglesolver.Main

package gineer.bogglesolver; import org.apache.log4j.BasicConfigurator; import org.apache.log4j.Logger; public class Main { private final static Logger logger = Logger.getLogger(Main.class); public static void main(String[] args) { BasicConfigurator.configure(); Solver solver = new Solver(4, "FXIE" + "AMLO" + "EWBX" + "ASTU"); solver.solve(); } }

gineer.bogglesolver.Solver

package gineer.bogglesolver; import gineer.bogglesolver.trie.Trie; import gineer.bogglesolver.util.Constants; import gineer.bogglesolver.util.Util; import org.apache.log4j.Logger; public class Solver { private char[] puzzle; private int maxSize; private boolean[] used; private StringBuilder stringSoFar; private boolean[][] matrix; private Trie trie; private final static Logger logger = Logger.getLogger(Solver.class); public Solver(int size, String puzzle) { trie = Util.getTrie(size); matrix = Util.connectivityMatrix(size); maxSize = size * size; stringSoFar = new StringBuilder(maxSize); used = new boolean[maxSize]; if (puzzle.length() == maxSize) { this.puzzle = puzzle.toCharArray(); } else { logger.error("The puzzle size does not match the size specified: " + puzzle.length()); this.puzzle = puzzle.substring(0, maxSize).toCharArray(); } } public void solve() { for (int i = 0; i < maxSize; i++) { traverseAt(i); } } private void traverseAt(int origin) { stringSoFar.append(puzzle[origin]); used[origin] = true; //Check if we have a valid word if ((stringSoFar.length() >= Constants.MINIMUM_WORD_LENGTH) && (trie.containKey(stringSoFar.toString()))) { logger.info("Found: " + stringSoFar.toString()); } //Find where to go next for (int destination = 0; destination < maxSize; destination++) { if (matrix[origin][destination] && !used[destination] && trie.containPrefix(stringSoFar.toString() + puzzle[destination])) { traverseAt(destination); } } used[origin] = false; stringSoFar.deleteCharAt(stringSoFar.length() - 1); } }

gineer.bogglesolver.trie.Node

package gineer.bogglesolver.trie; import gineer.bogglesolver.util.Constants; class Node { Node[] children; boolean isKey; public Node() { isKey = false; children = new Node[Constants.NUMBER_LETTERS_IN_ALPHABET]; } public Node(boolean key) { isKey = key; children = new Node[Constants.NUMBER_LETTERS_IN_ALPHABET]; } /** Method to insert a string to Node and its children @param key the string to insert (the string is assumed to be uppercase) @return true if the node or one of its children is changed, false otherwise */ public boolean insert(String key) { //If the key is empty, this node is a key if (key.length() == 0) { if (isKey) return false; else { isKey = true; return true; } } else {//otherwise, insert in one of its child int childNodePosition = key.charAt(0) - Constants.LETTER_A; if (children[childNodePosition] == null) { children[childNodePosition] = new Node(); children[childNodePosition].insert(key.substring(1)); return true; } else { return children[childNodePosition].insert(key.substring(1)); } } } /** Returns whether key is a valid prefix for certain key in this trie. For example: if key "hello" is in this trie, tests with all prefixes "hel", "hell", "hello" return true @param prefix the prefix to check @return true if the prefix is valid, false otherwise */ public boolean containPrefix(String prefix) { //If the prefix is empty, return true if (prefix.length() == 0) { return true; } else {//otherwise, check in one of its child int childNodePosition = prefix.charAt(0) - Constants.LETTER_A; return children[childNodePosition] != null && children[childNodePosition].containPrefix(prefix.substring(1)); } } /** Returns whether key is a valid key in this trie. For example: if key "hello" is in this trie, tests with all prefixes "hel", "hell" return false @param key the key to check @return true if the key is valid, false otherwise */ public boolean containKey(String key) { //If the prefix is empty, return true if (key.length() == 0) { return isKey; } else {//otherwise, check in one of its child int childNodePosition = key.charAt(0) - Constants.LETTER_A; return children[childNodePosition] != null && children[childNodePosition].containKey(key.substring(1)); } } public boolean isKey() { return isKey; } public void setKey(boolean key) { isKey = key; } }

gineer.bogglesolver.trie.Trie

package gineer.bogglesolver.trie; public class Trie { Node root; public Trie() { this.root = new Node(); } /** Method to insert a string to Node and its children @param key the string to insert (the string is assumed to be uppercase) @return true if the node or one of its children is changed, false otherwise */ public boolean insert(String key) { return root.insert(key.toUpperCase()); } /** Returns whether key is a valid prefix for certain key in this trie. For example: if key "hello" is in this trie, tests with all prefixes "hel", "hell", "hello" return true @param prefix the prefix to check @return true if the prefix is valid, false otherwise */ public boolean containPrefix(String prefix) { return root.containPrefix(prefix.toUpperCase()); } /** Returns whether key is a valid key in this trie. For example: if key "hello" is in this trie, tests with all prefixes "hel", "hell" return false @param key the key to check @return true if the key is valid, false otherwise */ public boolean containKey(String key) { return root.containKey(key.toUpperCase()); } }

gineer.bogglesolver.util.Constants

package gineer.bogglesolver.util; public class Constants { public static final int NUMBER_LETTERS_IN_ALPHABET = 26; public static final char LETTER_A = ''A''; public static final int MINIMUM_WORD_LENGTH = 3; public static final int DEFAULT_PUZZLE_SIZE = 4; }

gineer.bogglesolver.util.Util

package gineer.bogglesolver.util; import gineer.bogglesolver.trie.Trie; import org.apache.log4j.Logger; import java.io.File; import java.io.FileNotFoundException; import java.util.Scanner; public class Util { private final static Logger logger = Logger.getLogger(Util.class); private static Trie trie; private static int size = Constants.DEFAULT_PUZZLE_SIZE; /** Returns the trie built from the dictionary. The size is used to eliminate words that are too long. @param size the size of puzzle. The maximum lenght of words in the returned trie is (size * size) @return the trie that can be used for puzzle of that size */ public static Trie getTrie(int size) { if ((trie != null) && size == Util.size) return trie; trie = new Trie(); Util.size = size; logger.info("Reading the dictionary"); final File file = new File("dictionary.txt"); try { Scanner scanner = new Scanner(file); final int maxSize = size * size; while (scanner.hasNext()) { String line = scanner.nextLine().replaceAll("[^//p{Alpha}]", ""); if (line.length() <= maxSize) trie.insert(line); } } catch (FileNotFoundException e) { logger.error("Cannot open file", e); } logger.info("Finish reading the dictionary"); return trie; } static boolean[] connectivityRow(int x, int y, int size) { boolean[] squares = new boolean[size * size]; for (int offsetX = -1; offsetX <= 1; offsetX++) { for (int offsetY = -1; offsetY <= 1; offsetY++) { final int calX = x + offsetX; final int calY = y + offsetY; if ((calX >= 0) && (calX < size) && (calY >= 0) && (calY < size)) squares[calY * size + calX] = true; } } squares[y * size + x] = false;//the current x, y is false return squares; } /** Returns the matrix of connectivity between two points. Point i can go to point j iff matrix[i][j] is true Square (x, y) is equivalent to point (size * y + x). For example, square (1,1) is point 5 in a puzzle of size 4 @param size the size of the puzzle @return the connectivity matrix */ public static boolean[][] connectivityMatrix(int size) { boolean[][] matrix = new boolean[size * size][]; for (int x = 0; x < size; x++) { for (int y = 0; y < size; y++) { matrix[y * size + x] = connectivityRow(x, y, size); } } return matrix; } }

Mi respuesta funciona como las demás aquí, pero la publicaré porque parece un poco más rápida que las otras soluciones de Python, desde la configuración más rápida del diccionario. (Verifiqué esto con la solución de John Fouhy.) Después de la configuración, el tiempo para resolver está en el ruido.

grid = "fxie amlo ewbx astu".split() nrows, ncols = len(grid), len(grid[0]) # A dictionary word that could be a solution must use only the grid''s # letters and have length >= 3. (With a case-insensitive match.) import re alphabet = ''''.join(set(''''.join(grid))) bogglable = re.compile(''['' + alphabet + '']{3,}$'', re.I).match words = set(word.rstrip(''/n'') for word in open(''words'') if bogglable(word)) prefixes = set(word[:i] for word in words for i in range(2, len(word)+1)) def solve(): for y, row in enumerate(grid): for x, letter in enumerate(row): for result in extending(letter, ((x, y),)): yield result def extending(prefix, path): if prefix in words: yield (prefix, path) for (nx, ny) in neighbors(path[-1]): if (nx, ny) not in path: prefix1 = prefix + grid[ny][nx] if prefix1 in prefixes: for result in extending(prefix1, path + ((nx, ny),)): yield result def neighbors((x, y)): for nx in range(max(0, x-1), min(x+2, ncols)): for ny in range(max(0, y-1), min(y+2, nrows)): yield (nx, ny)

Uso de la muestra:

# Print a maximal-length word and its path: print max(solve(), key=lambda (word, path): len(word))

Editar: filtrar las palabras de menos de 3 letras de largo.

Edición 2: tenía curiosidad por qué la solución Perl de Kent Fredric era más rápida; Resulta utilizar la coincidencia de expresiones regulares en lugar de un conjunto de caracteres. Hacer lo mismo en Python duplica la velocidad.

Para una aceleración del diccionario, hay un proceso / transformación general que puede hacer para reducir en gran medida las comparaciones del diccionario antes de tiempo.

Dado que la cuadrícula anterior solo contiene 16 caracteres, algunos de ellos duplicados, puede reducir considerablemente el número total de claves en su diccionario simplemente filtrando las entradas que tienen caracteres inalcanzables.

Pensé que esta era la optimización obvia, pero como no lo hizo nadie, lo menciono.

Me redujo de un diccionario de 200,000 claves a solo 2,000 claves simplemente durante el pase de entrada. Esto al menos reduce la sobrecarga de memoria, y eso seguramente se asignará a un aumento de velocidad en algún lugar, ya que la memoria no es infinitamente rápida.

Implementación Perl

Mi implementación es un poco pesada porque le doy importancia a poder conocer la ruta exacta de cada cadena extraída, no solo la validez en ella.

También tengo algunas adaptaciones allí que teóricamente permitirían que funcionen una cuadrícula con orificios, y cuadrículas con líneas de diferentes tamaños (asumiendo que se obtiene la entrada correcta y se alinea de alguna manera).

El filtro inicial es, con mucho, el cuello de botella más importante en mi aplicación, como se sospechaba anteriormente, al comentar que la línea lo infla de 1.5 a 7.5 segundos.

Al ejecutarse, parece pensar que todos los dígitos individuales tienen sus propias palabras válidas, pero estoy bastante seguro de que eso se debe a cómo funciona el archivo de diccionario.

Está un poco hinchado, pero al menos reutilizo Tree::Trie de cpan

Parte de esto se inspiró parcialmente en las implementaciones existentes, parte de la cual ya tenía en mente.

La crítica constructiva y las formas en que podría mejorarse son bienvenidas (/ me dice que nunca buscó en CPAN un solucionador de boggle , pero fue más divertido de resolver)

actualizado para nuevos criterios

#!/usr/bin/perl use strict; use warnings; { # this package manages a given path through the grid. # Its an array of matrix-nodes in-order with # Convenience functions for pretty-printing the paths # and for extending paths as new paths. # Usage: # my $p = Prefix->new(path=>[ $startnode ]); # my $c = $p->child( $extensionNode ); # print $c->current_word ; package Prefix; use Moose; has path => ( isa => ''ArrayRef[MatrixNode]'', is => ''rw'', default => sub { [] }, ); has current_word => ( isa => ''Str'', is => ''rw'', lazy_build => 1, ); # Create a clone of this object # with a longer path # $o->child( $successive-node-on-graph ); sub child { my $self = shift; my $newNode = shift; my $f = Prefix->new(); # Have to do this manually or other recorded paths get modified push @{ $f->{path} }, @{ $self->{path} }, $newNode; return $f; } # Traverses $o->path left-to-right to get the string it represents. sub _build_current_word { my $self = shift; return join q{}, map { $_->{value} } @{ $self->{path} }; } # Returns the rightmost node on this path sub tail { my $self = shift; return $self->{path}->[-1]; } # pretty-format $o->path sub pp_path { my $self = shift; my @path = map { ''['' . $_->{x_position} . '','' . $_->{y_position} . '']'' } @{ $self->{path} }; return "[" . join( ",", @path ) . "]"; } # pretty-format $o sub pp { my $self = shift; return $self->current_word . '' => '' . $self->pp_path; } __PACKAGE__->meta->make_immutable; } { # Basic package for tracking node data # without having to look on the grid. # I could have just used an array or a hash, but that got ugly. # Once the matrix is up and running it doesn''t really care so much about rows/columns, # Its just a sea of points and each point has adjacent points. # Relative positioning is only really useful to map it back to userspace package MatrixNode; use Moose; has x_position => ( isa => ''Int'', is => ''rw'', required => 1 ); has y_position => ( isa => ''Int'', is => ''rw'', required => 1 ); has value => ( isa => ''Str'', is => ''rw'', required => 1 ); has siblings => ( isa => ''ArrayRef[MatrixNode]'', is => ''rw'', default => sub { [] } ); # Its not implicitly uni-directional joins. It would be more effient in therory # to make the link go both ways at the same time, but thats too hard to program around. # and besides, this isn''t slow enough to bother caring about. sub add_sibling { my $self = shift; my $sibling = shift; push @{ $self->siblings }, $sibling; } # Convenience method to derive a path starting at this node sub to_path { my $self = shift; return Prefix->new( path => [$self] ); } __PACKAGE__->meta->make_immutable; } { package Matrix; use Moose; has rows => ( isa => ''ArrayRef'', is => ''rw'', default => sub { [] }, ); has regex => ( isa => ''Regexp'', is => ''rw'', lazy_build => 1, ); has cells => ( isa => ''ArrayRef'', is => ''rw'', lazy_build => 1, ); sub add_row { my $self = shift; push @{ $self->rows }, [@_]; } # Most of these functions from here down are just builder functions, # or utilities to help build things. # Some just broken out to make it easier for me to process. # All thats really useful is add_row # The rest will generally be computed, stored, and ready to go # from ->cells by the time either ->cells or ->regex are called. # traverse all cells and make a regex that covers them. sub _build_regex { my $self = shift; my $chars = q{}; for my $cell ( @{ $self->cells } ) { $chars .= $cell->value(); } $chars = "[^$chars]"; return qr/$chars/i; } # convert a plain cell ( ie: [x][y] = 0 ) # to an intelligent cell ie: [x][y] = object( x, y ) # we only really keep them in this format temporarily # so we can go through and tie in neighbouring information. # after the neigbouring is done, the grid should be considered inoperative. sub _convert { my $self = shift; my $x = shift; my $y = shift; my $v = $self->_read( $x, $y ); my $n = MatrixNode->new( x_position => $x, y_position => $y, value => $v, ); $self->_write( $x, $y, $n ); return $n; } # go through the rows/collums presently available and freeze them into objects. sub _build_cells { my $self = shift; my @out = (); my @rows = @{ $self->{rows} }; for my $x ( 0 .. $#rows ) { next unless defined $self->{rows}->[$x]; my @col = @{ $self->{rows}->[$x] }; for my $y ( 0 .. $#col ) { next unless defined $self->{rows}->[$x]->[$y]; push @out, $self->_convert( $x, $y ); } } for my $c (@out) { for my $n ( $self->_neighbours( $c->x_position, $c->y_position ) ) { $c->add_sibling( $self->{rows}->[ $n->[0] ]->[ $n->[1] ] ); } } return /@out; } # given x,y , return array of points that refer to valid neighbours. sub _neighbours { my $self = shift; my $x = shift; my $y = shift; my @out = (); for my $sx ( -1, 0, 1 ) { next if $sx + $x < 0; next if not defined $self->{rows}->[ $sx + $x ]; for my $sy ( -1, 0, 1 ) { next if $sx == 0 && $sy == 0; next if $sy + $y < 0; next if not defined $self->{rows}->[ $sx + $x ]->[ $sy + $y ]; push @out, [ $sx + $x, $sy + $y ]; } } return @out; } sub _has_row { my $self = shift; my $x = shift; return defined $self->{rows}->[$x]; } sub _has_cell { my $self = shift; my $x = shift; my $y = shift; return defined $self->{rows}->[$x]->[$y]; } sub _read { my $self = shift; my $x = shift; my $y = shift; return $self->{rows}->[$x]->[$y]; } sub _write { my $self = shift; my $x = shift; my $y = shift; my $v = shift; $self->{rows}->[$x]->[$y] = $v; return $v; } __PACKAGE__->meta->make_immutable; } use Tree::Trie; sub readDict { my $fn = shift; my $re = shift; my $d = Tree::Trie->new(); # Dictionary Loading open my $fh, ''<'', $fn; while ( my $line = <$fh> ) { chomp($line); # Commenting the next line makes it go from 1.5 seconds to 7.5 seconds. EPIC. next if $line =~ $re; # Early Filter $d->add( uc($line) ); } return $d; } sub traverseGraph { my $d = shift; my $m = shift; my $min = shift; my $max = shift; my @words = (); # Inject all grid nodes into the processing queue. my @queue = grep { $d->lookup( $_->current_word ) } map { $_->to_path } @{ $m->cells }; while (@queue) { my $item = shift @queue; # put the dictionary into "exact match" mode. $d->deepsearch(''exact''); my $cword = $item->current_word; my $l = length($cword); if ( $l >= $min && $d->lookup($cword) ) { push @words, $item; # push current path into "words" if it exactly matches. } next if $l > $max; # put the dictionary into "is-a-prefix" mode. $d->deepsearch(''boolean''); siblingloop: foreach my $sibling ( @{ $item->tail->siblings } ) { foreach my $visited ( @{ $item->{path} } ) { next siblingloop if $sibling == $visited; } # given path y , iterate for all its end points my $subpath = $item->child($sibling); # create a new path for each end-point if ( $d->lookup( $subpath->current_word ) ) { # if the new path is a prefix, add it to the bottom of the queue. push @queue, $subpath; } } } return /@words; } sub setup_predetermined { my $m = shift; my $gameNo = shift; if( $gameNo == 0 ){ $m->add_row(qw( F X I E )); $m->add_row(qw( A M L O )); $m->add_row(qw( E W B X )); $m->add_row(qw( A S T U )); return $m; } if( $gameNo == 1 ){ $m->add_row(qw( D G H I )); $m->add_row(qw( K L P S )); $m->add_row(qw( Y E U T )); $m->add_row(qw( E O R N )); return $m; } } sub setup_random { my $m = shift; my $seed = shift; srand $seed; my @letters = ''A'' .. ''Z'' ; for( 1 .. 4 ){ my @r = (); for( 1 .. 4 ){ push @r , $letters[int(rand(25))]; } $m->add_row( @r ); } } # Here is where the real work starts. my $m = Matrix->new(); setup_predetermined( $m, 0 ); #setup_random( $m, 5 ); my $d = readDict( ''dict.txt'', $m->regex ); my $c = scalar @{ $m->cells }; # get the max, as per spec print join ",/n", map { $_->pp } @{ traverseGraph( $d, $m, 3, $c ) ; };

Información de arco / ejecución para comparación:

model name : Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz cache size : 6144 KB Memory usage summary: heap total: 77057577, heap peak: 11446200, stack peak: 26448 total calls total memory failed calls malloc| 947212 68763684 0 realloc| 11191 1045641 0 (nomove:9063, dec:4731, free:0) calloc| 121001 7248252 0 free| 973159 65854762 Histogram for block sizes: 0-15 392633 36% ================================================== 16-31 43530 4% ===== 32-47 50048 4% ====== 48-63 70701 6% ========= 64-79 18831 1% == 80-95 19271 1% == 96-111 238398 22% ============================== 112-127 3007 <1% 128-143 236727 21% ==============================

Más murmullos sobre la optimización de Regex

La optimización de expresiones regulares que utilizo es inútil para los diccionarios de resolución múltiple, y para la resolución múltiple querrá un diccionario completo, no uno recortado previamente.

Sin embargo, dicho esto, para soluciones únicas, es realmente rápido. (Perex regex están en C! :))

Aquí hay algunas adiciones de códigos diferentes:

sub readDict_nofilter { my $fn = shift; my $re = shift; my $d = Tree::Trie->new(); # Dictionary Loading open my $fh, ''<'', $fn; while ( my $line = <$fh> ) { chomp($line); $d->add( uc($line) ); } return $d; } sub benchmark_io { use Benchmark qw( cmpthese :hireswallclock ); # generate a random 16 character string # to simulate there being an input grid. my $regexen = sub { my @letters = ''A'' .. ''Z'' ; my @lo = (); for( 1..16 ){ push @lo , $_ ; } my $c = join '''', @lo; $c = "[^$c]"; return qr/$c/i; }; cmpthese( 200 , { filtered => sub { readDict(''dict.txt'', $regexen->() ); }, unfiltered => sub { readDict_nofilter(''dict.txt''); } }); }

s/iter unfiltered filtered unfiltered 8.16 -- -94% filtered 0.464 1658% --

_{ps: 8.16 * 200 = 27 minutos.}

Podrías dividir el problema en dos partes:

Algún tipo de algoritmo de búsqueda que enumere las posibles cadenas en la cuadrícula.
Una forma de probar si una cadena es una palabra válida.

Lo ideal es que (2) también incluya una forma de probar si una cadena es un prefijo de una palabra válida; esto le permitirá eliminar su búsqueda y ahorrar un montón de tiempo.

Trie de Adam Rosenfield es una solución para (2). Es elegante y probablemente lo que prefieren los especialistas en algoritmos, pero con los idiomas modernos y las computadoras modernas, podemos ser un poco más perezosos. Además, como sugiere Kent, podemos reducir el tamaño de nuestro diccionario descartando las palabras que tienen letras que no están presentes en la cuadrícula. Aquí hay algo de python:

def make_lookups(grid, fn=''dict.txt''): # Make set of valid characters. chars = set() for word in grid: chars.update(word) words = set(x.strip() for x in open(fn) if set(x.strip()) <= chars) prefixes = set() for w in words: for i in range(len(w)+1): prefixes.add(w[:i]) return words, prefixes

Guau; Prueba de prefijo de tiempo constante. Se tarda unos segundos en cargar el diccionario que ha vinculado, pero solo un par :-) (observe que las words <= prefixes )

Ahora, para la parte (1), me inclino a pensar en términos de gráficos. Así que construiré un diccionario que se parece a esto:

graph = { (x, y):set([(x0,y0), (x1,y1), (x2,y2)]), }

es decir, graph[(x, y)] es el conjunto de coordenadas que puede alcanzar desde la posición (x, y) . También agregaré un nodo falso None que se conectará a todo.

Construir es un poco torpe, porque hay 8 posiciones posibles y tienes que hacer controles de límites. Aquí hay un código de Python correspondientemente torpe:

def make_graph(grid): root = None graph = { root:set() } chardict = { root:'''' } for i, row in enumerate(grid): for j, char in enumerate(row): chardict[(i, j)] = char node = (i, j) children = set() graph[node] = children graph[root].add(node) add_children(node, children, grid) return graph, chardict def add_children(node, children, grid): x0, y0 = node for i in [-1,0,1]: x = x0 + i if not (0 <= x < len(grid)): continue for j in [-1,0,1]: y = y0 + j if not (0 <= y < len(grid[0])) or (i == j == 0): continue children.add((x,y))

Este código también genera una asignación de diccionario (x,y) al carácter correspondiente. Esto me permite convertir una lista de posiciones en una palabra:

def to_word(chardict, pos_list): return ''''.join(chardict[x] for x in pos_list)

Finalmente, hacemos una búsqueda en profundidad. El procedimiento básico es:

La búsqueda llega a un nodo particular.
Compruebe si el camino hasta ahora podría ser parte de una palabra. Si no, no explore más esta rama.
Compruebe si el camino hasta ahora es una palabra. Si es así, agregue a la lista de resultados.
Explora a todos los niños que no forman parte del camino hasta ahora.

Pitón:

def find_words(graph, chardict, position, prefix, results, words, prefixes): """ Arguments: graph :: mapping (x,y) to set of reachable positions chardict :: mapping (x,y) to character position :: current position (x,y) -- equals prefix[-1] prefix :: list of positions in current string results :: set of words found words :: set of valid words in the dictionary prefixes :: set of valid words or prefixes thereof """ word = to_word(chardict, prefix) if word not in prefixes: return if word in words: results.add(word) for child in graph[position]: if child not in prefix: find_words(graph, chardict, child, prefix+[child], results, words, prefixes)

Ejecuta el código como:

grid = [''fxie'', ''amlo'', ''ewbx'', ''astu''] g, c = make_graph(grid) w, p = make_lookups(grid) res = set() find_words(g, c, None, [], res, w, p)

e inspeccionar res para ver las respuestas. Aquí hay una lista de palabras encontradas para su ejemplo, ordenadas por tamaño:

[''a'', ''b'', ''e'', ''f'', ''i'', ''l'', ''m'', ''o'', ''s'', ''t'', ''u'', ''w'', ''x'', ''ae'', ''am'', ''as'', ''aw'', ''ax'', ''bo'', ''bu'', ''ea'', ''el'', ''em'', ''es'', ''fa'', ''ie'', ''io'', ''li'', ''lo'', ''ma'', ''me'', ''mi'', ''oe'', ''ox'', ''sa'', ''se'', ''st'', ''tu'', ''ut'', ''wa'', ''we'', ''xi'', ''aes'', ''ame'', ''ami'', ''ase'', ''ast'', ''awa'', ''awe'', ''awl'', ''blo'', ''but'', ''elb'', ''elm'', ''fae'', ''fam'', ''lei'', ''lie'', ''lim'', ''lob'', ''lox'', ''mae'', ''maw'', ''mew'', ''mil'', ''mix'', ''oil'', ''olm'', ''saw'', ''sea'', ''sew'', ''swa'', ''tub'', ''tux'', ''twa'', ''wae'', ''was'', ''wax'', ''wem'', ''ambo'', ''amil'', ''amli'', ''asem'', ''axil'', ''axle'', ''bleo'', ''boil'', ''bole'', ''east'', ''fame'', ''limb'', ''lime'', ''mesa'', ''mewl'', ''mile'', ''milo'', ''oime'', ''sawt'', ''seam'', ''seax'', ''semi'', ''stub'', ''swam'', ''twae'', ''twas'', ''wame'', ''wase'', ''wast'', ''weam'', ''west'', ''amble'', ''awest'', ''axile'', ''embox'', ''limbo'', ''limes'', ''swami'', ''embole'', ''famble'', ''semble'', ''wamble'']

El código tarda (literalmente) un par de segundos en cargar el diccionario, pero el resto es instantáneo en mi máquina.

¿Su algoritmo de búsqueda disminuye continuamente la lista de palabras a medida que su búsqueda continúa?

Por ejemplo, en la búsqueda anterior solo hay 13 letras con las que pueden comenzar sus palabras (reduciendo efectivamente a la mitad las letras iniciales).

A medida que agregue más permutaciones de letras, disminuirá aún más los conjuntos de palabras disponibles, disminuyendo la búsqueda necesaria.

Yo empezaría allí.

Aquí está la solución. Uso de palabras predefinidas en el kit de herramientas NLTK. NLTK tiene el paquete nltk.corpus en el que tenemos el paquete llamado palabras y contiene más de 2Lakhs palabras en inglés que puede usar simplemente en su programa.

Una vez que haya creado su matriz, conviértala en una matriz de caracteres y realice este código.

import nltk from nltk.corpus import words from collections import Counter def possibleWords(input, charSet): for word in input: dict = Counter(word) flag = 1 for key in dict.keys(): if key not in charSet: flag = 0 if flag == 1 and len(word)>5: #its depends if you want only length more than 5 use this otherwise remove that one. print(word) nltk.download(''words'') word_list = words.words() # prints 236736 print(len(word_list)) charSet = [''h'', ''e'', ''l'', ''o'', ''n'', ''v'', ''t''] possibleWords(word_list, charSet)

Salida:

eleven eleventh elevon entente entone ethene ethenol evolve evolvent hellhole helvell hooven letten looten nettle nonene nonent nonlevel notelet novelet novelette novene teenet teethe teevee telethon tellee tenent tentlet theelol toetoe tonlet toothlet tootle tottle vellon velvet velveteen venene vennel venthole voeten volent volvelle volvent voteen

Espero que lo obtengas.

Dado un tablero Boggle con N filas y M columnas, asumamos lo siguiente:

N * M es sustancialmente mayor que el número de palabras posibles
N * M es sustancialmente mayor que la palabra más larga posible

Bajo estas suposiciones, la complejidad de esta solución es O (N * M).

Creo que comparar los tiempos de ejecución de esta placa de ejemplo de muchas maneras no tiene sentido, pero, para completar, esta solución se completa en <0.2s en mi MacBook Pro moderno.

Esta solución encontrará todos los caminos posibles para cada palabra en el corpus.

#!/usr/bin/env ruby # Example usage: ./boggle-solver --board "fxie amlo ewbx astu" autoload :Matrix, ''matrix'' autoload :OptionParser, ''optparse'' DEFAULT_CORPUS_PATH = ''/usr/share/dict/words''.freeze # Functions def filter_corpus(matrix, corpus, min_word_length) board_char_counts = Hash.new(0) matrix.each { |c| board_char_counts[c] += 1 } max_word_length = matrix.row_count * matrix.column_count boggleable_regex = /^[#{board_char_counts.keys.reduce(:+)}]{#{min_word_length},#{max_word_length}}$/ corpus.select{ |w| w.match boggleable_regex }.select do |w| word_char_counts = Hash.new(0) w.each_char { |c| word_char_counts[c] += 1 } word_char_counts.all? { |c, count| board_char_counts[c] >= count } end end def neighbors(point, matrix) i, j = point ([i-1, 0].max .. [i+1, matrix.row_count-1].min).inject([]) do |r, new_i| ([j-1, 0].max .. [j+1, matrix.column_count-1].min).inject(r) do |r, new_j| neighbor = [new_i, new_j] neighbor.eql?(point) ? r : r << neighbor end end end def expand_path(path, word, matrix) return [path] if path.length == word.length next_char = word[path.length] viable_neighbors = neighbors(path[-1], matrix).select do |point| !path.include?(point) && matrix.element(*point).eql?(next_char) end viable_neighbors.inject([]) do |result, point| result + expand_path(path.dup << point, word, matrix) end end def find_paths(word, matrix) result = [] matrix.each_with_index do |c, i, j| result += expand_path([[i, j]], word, matrix) if c.eql?(word[0]) end result end def solve(matrix, corpus, min_word_length: 3) boggleable_corpus = filter_corpus(matrix, corpus, min_word_length) boggleable_corpus.inject({}) do |result, w| paths = find_paths(w, matrix) result[w] = paths unless paths.empty? result end end # Script options = { corpus_path: DEFAULT_CORPUS_PATH } option_parser = OptionParser.new do |opts| opts.banner = ''Usage: boggle-solver --board <value> [--corpus <value>]'' opts.on(''--board BOARD'', String, ''The board (e.g. "fxi aml ewb ast")'') do |b| options[:board] = b end opts.on(''--corpus CORPUS_PATH'', String, ''Corpus file path'') do |c| options[:corpus_path] = c end opts.on_tail(''-h'', ''--help'', ''Shows usage'') do STDOUT.puts opts exit end end option_parser.parse! unless options[:board] STDERR.puts option_parser exit false end unless File.file? options[:corpus_path] STDERR.puts "No corpus exists - #{options[:corpus_path]}" exit false end rows = options[:board].downcase.scan(//S+/).map{ |row| row.scan(/./) } raw_corpus = File.readlines(options[:corpus_path]) corpus = raw_corpus.map{ |w| w.downcase.rstrip }.uniq.sort solution = solve(Matrix.rows(rows), corpus) solution.each_pair do |w, paths| STDOUT.puts w paths.each do |path| STDOUT.puts "/t" + path.map{ |point| point.inspect }.join('', '') end end STDOUT.puts "TOTAL: #{solution.count}"

Me doy cuenta de que el tiempo de esta pregunta ha llegado y se ha ido, pero como yo mismo estaba trabajando en un solucionador y tropecé con esto mientras buscaba en Google, pensé que debería publicar una referencia a la mía ya que parece un poco diferente de algunas otras.

Elegí ir con una matriz plana para el tablero de juego y hacer búsquedas recursivas de cada letra en el tablero, pasando de un vecino válido a un vecino válido, extendiendo la búsqueda si la lista actual de letras es un prefijo válido en un índice. Mientras atraviesa la noción de la palabra actual, hay una lista de índices en el tablero, no letras que forman una palabra. Al verificar el índice, los índices se traducen en letras y se realiza la comprobación.

El índice es un diccionario de fuerza bruta que se parece un poco a un trío, pero permite realizar consultas Pythonic del índice. Si las palabras "cat" y "cater" están en la lista, obtendrás esto en el diccionario:

d = { ''c'': [''cat'',''cater''], ''ca'': [''cat'',''cater''], ''cat'': [''cat'',''cater''], ''cate'': [''cater''], ''cater'': [''cater''], }

Entonces, si current_word es ''ca'', sabes que es un prefijo válido porque ''ca'' in ddevuelve True (así que continúa el recorrido de la placa). Y si current_word es ''cat'', entonces sabes que es una palabra válida porque es un prefijo válido y ''cat'' in d[''cat'']devuelve True también.

Si se siente así, esto permite un código legible que no parece demasiado lento. Como todos los demás, el gasto en este sistema es leer / construir el índice. Resolver el tablero es bastante ruido.

El código está en http://gist.github.com/268079 . Es intencionalmente vertical e ingenuo con mucha comprobación de validez explícita porque quería entender el problema sin arrugarlo con un montón de magia u oscuridad.

Pasé 3 meses trabajando en una solución al problema de las 10 mejores tablas de 5 x 5 Boggle de punto denso.

El problema ahora está resuelto y se presenta con la divulgación completa en 5 páginas web. Por favor, contácteme con preguntas.

El algoritmo de análisis de la placa utiliza una pila explícita para atravesar los cuadrados de la placa de forma pseudo-recursiva a través de un gráfico de palabras acíclico dirigido con información directa del niño y un mecanismo de seguimiento de marca de tiempo. Esta puede muy bien ser la estructura de datos de léxico más avanzada del mundo.

El esquema evalúa unos 10,000 tableros muy buenos por segundo en un quad core. (9500+ puntos)

Página web para padres:

DeepSearch.c - http://www.pathcom.com/~vadco/deep.html

Páginas web de componentes:

Cuadro de indicadores óptimo: http://www.pathcom.com/~vadco/binary.html

Estructura Lexicon avanzada - http://www.pathcom.com/~vadco/adtdawg.html

Algoritmo de análisis de tablero: http://www.pathcom.com/~vadco/guns.html

Procesamiento en lotes en paralelo: http://www.pathcom.com/~vadco/parallel.html

- Este cuerpo de trabajo integral solo interesará a una persona que exige lo mejor.

Sorprendentemente, nadie intentó una versión PHP de esto.

Esta es una versión de trabajo de PHP de la solución Python de John Fouhy.

Aunque tomé algunos consejos de las respuestas de todos los demás, esto se copia principalmente de John.

$boggle = "fxie amlo ewbx astu"; $alphabet = str_split(str_replace(array("/n", " ", "/r"), "", strtolower($boggle))); $rows = array_map(''trim'', explode("/n", $boggle)); $dictionary = file("C:/dict.txt"); $prefixes = array(''''=>''''); $words = array(); $regex = ''/['' . implode('''', $alphabet) . '']{3,}$/S''; foreach($dictionary as $k=>$value) { $value = trim(strtolower($value)); $length = strlen($value); if(preg_match($regex, $value)) { for($x = 0; $x < $length; $x++) { $letter = substr($value, 0, $x+1); if($letter == $value) { $words[$value] = 1; } else { $prefixes[$letter] = 1; } } } } $graph = array(); $chardict = array(); $positions = array(); $c = count($rows); for($i = 0; $i < $c; $i++) { $l = strlen($rows[$i]); for($j = 0; $j < $l; $j++) { $chardict[$i.'',''.$j] = $rows[$i][$j]; $children = array(); $pos = array(-1,0,1); foreach($pos as $z) { $xCoord = $z + $i; if($xCoord < 0 || $xCoord >= count($rows)) { continue; } $len = strlen($rows[0]); foreach($pos as $w) { $yCoord = $j + $w; if(($yCoord < 0 || $yCoord >= $len) || ($z == 0 && $w == 0)) { continue; } $children[] = array($xCoord, $yCoord); } } $graph[''None''][] = array($i, $j); $graph[$i.'',''.$j] = $children; } } function to_word($chardict, $prefix) { $word = array(); foreach($prefix as $v) { $word[] = $chardict[$v[0].'',''.$v[1]]; } return implode("", $word); } function find_words($graph, $chardict, $position, $prefix, $prefixes, &$results, $words) { $word = to_word($chardict, $prefix); if(!isset($prefixes[$word])) return false; if(isset($words[$word])) { $results[] = $word; } foreach($graph[$position] as $child) { if(!in_array($child, $prefix)) { $newprefix = $prefix; $newprefix[] = $child; find_words($graph, $chardict, $child[0].'',''.$child[1], $newprefix, $prefixes, $results, $words); } } } $solution = array(); find_words($graph, $chardict, ''None'', array(), $prefixes, $solution); print_r($solution);

Aquí hay un enlace en vivo si quieres probarlo. Aunque toma ~ 2s en mi máquina local, toma ~ 5s en mi servidor web. En cualquier caso, no es muy rápido. Aún así, sin embargo, es bastante horrible, así que puedo imaginar que el tiempo puede reducirse significativamente. Cualquier sugerencia sobre cómo lograr eso sería apreciada. La falta de tuplas de PHP hizo que las coordenadas fueran extrañas para trabajar y mi incapacidad para comprender qué demonios está sucediendo no ayudó en absoluto.

EDITAR : Algunas correcciones hacen que tome menos de 1 s localmente.

¿No te interesa VB? :) No pude resistirme. He resuelto esto de manera diferente a muchas de las soluciones presentadas aquí.

Mis tiempos son:

Cargando el diccionario y los prefijos de palabras en una tabla hash: .5 a 1 segundos.
Encontrar las palabras: un promedio de menos de 10 milisegundos.

EDITAR: los tiempos de carga del diccionario en el servidor de alojamiento web se ejecutan entre 1 y 1,5 segundos más que la computadora de mi hogar.

No sé qué tan mal se deteriorarán los tiempos con una carga en el servidor.

Escribí mi solución como una página web en .Net. myvrad.com/boggle

Estoy usando el diccionario al que se hace referencia en la pregunta original.

Las letras no se reutilizan en una palabra. Sólo se encuentran palabras de 3 caracteres o más.

Estoy usando una tabla hash de todos los prefijos de palabras únicas y palabras en lugar de un trie. No sabía nada de Trie, así que aprendí algo allí. La idea de crear una lista de prefijos de palabras además de las palabras completas es lo que finalmente hizo que mis tiempos se redujeran a un número respetable.

Lea los comentarios del código para detalles adicionales.

Aquí está el código:

Imports System.Collections.Generic Imports System.IO Partial Class boggle_Default ''Bob Archer, 4/15/2009 ''To avoid using a 2 dimensional array in VB I''m not using typical X,Y ''coordinate iteration to find paths. '' ''I have locked the code into a 4 by 4 grid laid out like so: '' abcd '' efgh '' ijkl '' mnop '' ''To find paths the code starts with a letter from a to p then ''explores the paths available around it. If a neighboring letter ''already exists in the path then we don''t go there. '' ''Neighboring letters (grid points) are hard coded into ''a Generic.Dictionary below. ''Paths is a list of only valid Paths found. ''If a word prefix or word is not found the path is not ''added and extending that path is terminated. Dim Paths As New Generic.List(Of String) ''NeighborsOf. The keys are the letters a to p. ''The value is a string of letters representing neighboring letters. ''The string of neighboring letters is split and iterated later. Dim NeigborsOf As New Generic.Dictionary(Of String, String) ''BoggleLetters. The keys are mapped to the lettered grid of a to p. ''The values are what the user inputs on the page. Dim BoggleLetters As New Generic.Dictionary(Of String, String) ''Used to store last postition of path. This will be a letter ''from a to p. Dim LastPositionOfPath As String = "" ''I found a HashTable was by far faster than a Generic.Dictionary '' - about 10 times faster. This stores prefixes of words and words. ''I determined 792773 was the number of words and unique prefixes that ''will be generated from the dictionary file. This is a max number and ''the final hashtable will not have that many. Dim HashTableOfPrefixesAndWords As New Hashtable(792773) ''Stores words that are found. Dim FoundWords As New Generic.List(Of String) ''Just to validate what the user enters in the grid. Dim ErrorFoundWithSubmittedLetters As Boolean = False Public Sub BuildAndTestPathsAndFindWords(ByVal ThisPath As String) ''Word is the word correlating to the ThisPath parameter. ''This path would be a series of letters from a to p. Dim Word As String = "" ''The path is iterated through and a word based on the actual ''letters in the Boggle grid is assembled. For i As Integer = 0 To ThisPath.Length - 1 Word += Me.BoggleLetters(ThisPath.Substring(i, 1)) Next ''If my hashtable of word prefixes and words doesn''t contain this Word ''Then this isn''t a word and any further extension of ThisPath will not ''yield any words either. So exit sub to terminate exploring this path. If Not HashTableOfPrefixesAndWords.ContainsKey(Word) Then Exit Sub ''The value of my hashtable is a boolean representing if the key if a word (true) or ''just a prefix (false). If true and at least 3 letters long then yay! word found. If HashTableOfPrefixesAndWords(Word) AndAlso Word.Length > 2 Then Me.FoundWords.Add(Word) ''If my List of Paths doesn''t contain ThisPath then add it. ''Remember only valid paths will make it this far. Paths not found ''in the HashTableOfPrefixesAndWords cause this sub to exit above. If Not Paths.Contains(ThisPath) Then Paths.Add(ThisPath) ''Examine the last letter of ThisPath. We are looking to extend the path ''to our neighboring letters if any are still available. LastPositionOfPath = ThisPath.Substring(ThisPath.Length - 1, 1) ''Loop through my list of neighboring letters (representing grid points). For Each Neighbor As String In Me.NeigborsOf(LastPositionOfPath).ToCharArray() ''If I find a neighboring grid point that I haven''t already used ''in ThisPath then extend ThisPath and feed the new path into ''this recursive function. (see recursive.) If Not ThisPath.Contains(Neighbor) Then Me.BuildAndTestPathsAndFindWords(ThisPath & Neighbor) Next End Sub Protected Sub ButtonBoggle_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles ButtonBoggle.Click ''User has entered the 16 letters and clicked the go button. ''Set up my Generic.Dictionary of grid points, I''m using letters a to p - ''not an x,y grid system. The values are neighboring points. NeigborsOf.Add("a", "bfe") NeigborsOf.Add("b", "cgfea") NeigborsOf.Add("c", "dhgfb") NeigborsOf.Add("d", "hgc") NeigborsOf.Add("e", "abfji") NeigborsOf.Add("f", "abcgkjie") NeigborsOf.Add("g", "bcdhlkjf") NeigborsOf.Add("h", "cdlkg") NeigborsOf.Add("i", "efjnm") NeigborsOf.Add("j", "efgkonmi") NeigborsOf.Add("k", "fghlponj") NeigborsOf.Add("l", "ghpok") NeigborsOf.Add("m", "ijn") NeigborsOf.Add("n", "ijkom") NeigborsOf.Add("o", "jklpn") NeigborsOf.Add("p", "klo") ''Retrieve letters the user entered. BoggleLetters.Add("a", Me.TextBox1.Text.ToLower.Trim()) BoggleLetters.Add("b", Me.TextBox2.Text.ToLower.Trim()) BoggleLetters.Add("c", Me.TextBox3.Text.ToLower.Trim()) BoggleLetters.Add("d", Me.TextBox4.Text.ToLower.Trim()) BoggleLetters.Add("e", Me.TextBox5.Text.ToLower.Trim()) BoggleLetters.Add("f", Me.TextBox6.Text.ToLower.Trim()) BoggleLetters.Add("g", Me.TextBox7.Text.ToLower.Trim()) BoggleLetters.Add("h", Me.TextBox8.Text.ToLower.Trim()) BoggleLetters.Add("i", Me.TextBox9.Text.ToLower.Trim()) BoggleLetters.Add("j", Me.TextBox10.Text.ToLower.Trim()) BoggleLetters.Add("k", Me.TextBox11.Text.ToLower.Trim()) BoggleLetters.Add("l", Me.TextBox12.Text.ToLower.Trim()) BoggleLetters.Add("m", Me.TextBox13.Text.ToLower.Trim()) BoggleLetters.Add("n", Me.TextBox14.Text.ToLower.Trim()) BoggleLetters.Add("o", Me.TextBox15.Text.ToLower.Trim()) BoggleLetters.Add("p", Me.TextBox16.Text.ToLower.Trim()) ''Validate user entered something with a length of 1 for all 16 textboxes. For Each S As String In BoggleLetters.Keys If BoggleLetters(S).Length <> 1 Then ErrorFoundWithSubmittedLetters = True Exit For End If Next ''If input is not valid then... If ErrorFoundWithSubmittedLetters Then ''Present error message. Else ''Else assume we have 16 letters to work with and start finding words. Dim SB As New StringBuilder Dim Time As String = String.Format("{0}:{1}:{2}:{3}", Date.Now.Hour.ToString(), Date.Now.Minute.ToString(), Date.Now.Second.ToString(), Date.Now.Millisecond.ToString()) Dim NumOfLetters As Integer = 0 Dim Word As String = "" Dim TempWord As String = "" Dim Letter As String = "" Dim fr As StreamReader = Nothing fr = New System.IO.StreamReader(HttpContext.Current.Request.MapPath("~/boggle/dic.txt")) ''First fill my hashtable with word prefixes and words. ''HashTable(PrefixOrWordString, BooleanTrueIfWordFalseIfPrefix) While fr.Peek <> -1 Word = fr.ReadLine.Trim() TempWord = "" For i As Integer = 0 To Word.Length - 1 Letter = Word.Substring(i, 1) ''This optimization helped quite a bit. Words in the dictionary that begin ''with letters that the user did not enter in the grid shouldn''t go in my hashtable. '' ''I realize most of the solutions went with a Trie. I''d never heard of that before, ''which is one of the neat things about SO, seeing how others approach challenges ''and learning some best practices. '' ''However, I didn''t code a Trie in my solution. I just have a hashtable with ''all words in the dicitonary file and all possible prefixes for those words. ''A Trie might be faster but I''m not coding it now. I''m getting good times with this. If i = 0 AndAlso Not BoggleLetters.ContainsValue(Letter) Then Continue While TempWord += Letter If Not HashTableOfPrefixesAndWords.ContainsKey(TempWord) Then HashTableOfPrefixesAndWords.Add(TempWord, TempWord = Word) End If Next End While SB.Append("Number of Word Prefixes and Words in Hashtable: " & HashTableOfPrefixesAndWords.Count.ToString()) SB.Append(" ") SB.Append("Loading Dictionary: " & Time & " - " & String.Format("{0}:{1}:{2}:{3}", Date.Now.Hour.ToString(), Date.Now.Minute.ToString(), Date.Now.Second.ToString(), Date.Now.Millisecond.ToString())) SB.Append(" ") Time = String.Format("{0}:{1}:{2}:{3}", Date.Now.Hour.ToString(), Date.Now.Minute.ToString(), Date.Now.Second.ToString(), Date.Now.Millisecond.ToString()) ''This starts a path at each point on the grid an builds a path until ''the string of letters correlating to the path is not found in the hashtable ''of word prefixes and words. Me.BuildAndTestPathsAndFindWords("a") Me.BuildAndTestPathsAndFindWords("b") Me.BuildAndTestPathsAndFindWords("c") Me.BuildAndTestPathsAndFindWords("d") Me.BuildAndTestPathsAndFindWords("e") Me.BuildAndTestPathsAndFindWords("f") Me.BuildAndTestPathsAndFindWords("g") Me.BuildAndTestPathsAndFindWords("h") Me.BuildAndTestPathsAndFindWords("i") Me.BuildAndTestPathsAndFindWords("j") Me.BuildAndTestPathsAndFindWords("k") Me.BuildAndTestPathsAndFindWords("l") Me.BuildAndTestPathsAndFindWords("m") Me.BuildAndTestPathsAndFindWords("n") Me.BuildAndTestPathsAndFindWords("o") Me.BuildAndTestPathsAndFindWords("p") SB.Append("Finding Words: " & Time & " - " & String.Format("{0}:{1}:{2}:{3}", Date.Now.Hour.ToString(), Date.Now.Minute.ToString(), Date.Now.Second.ToString(), Date.Now.Millisecond.ToString())) SB.Append(" ") SB.Append("Num of words found: " & FoundWords.Count.ToString()) SB.Append(" ") SB.Append(" ") FoundWords.Sort() SB.Append(String.Join(" ", FoundWords.ToArray())) ''Output results. Me.LiteralBoggleResults.Text = SB.ToString() Me.PanelBoggleResults.Visible = True End If End Sub End Class

¿Qué tal una ordenación simple y el uso de la búsqueda binaria en el diccionario?

Devuelve la lista completa en 0.35 segundos y puede optimizarse aún más (por ejemplo, eliminando palabras con letras no utilizadas, etc.).

from bisect import bisect_left f = open("dict.txt") D.extend([line.strip() for line in f.readlines()]) D = sorted(D) def neibs(M,x,y): n = len(M) for i in xrange(-1,2): for j in xrange(-1,2): if (i == 0 and j == 0) or (x + i < 0 or x + i >= n or y + j < 0 or y + j >= n): continue yield (x + i, y + j) def findWords(M,D,x,y,prefix): prefix = prefix + M[x][y] # find word in dict by binary search found = bisect_left(D,prefix) # if found then yield if D[found] == prefix: yield prefix # if what we found is not even a prefix then return # (there is no point in going further) if len(D[found]) < len(prefix) or D[found][:len(prefix)] != prefix: return # recourse for neib in neibs(M,x,y): for word in findWords(M,D,neib[0], neib[1], prefix): yield word def solve(M,D): # check each starting point for x in xrange(0,len(M)): for y in xrange(0,len(M)): for word in findWords(M,D,x,y,""): yield word grid = "fxie amlo ewbx astu".split() print [x for x in solve(grid,D)]

Aquí está mi implementación de java: https://github.com/zouzhile/interview/blob/master/src/com/interview/algorithms/tree/BoggleSolver.java

Trie build tomó 0 horas, 0 minutos, 1 segundos, 532 milisegundos.
La búsqueda de palabras tomó 0 horas, 0 minutos, 0 segundos, 92 milisegundos.

eel eeler eely eer eke eker eld eleut elk ell elle epee epihippus ere erept err error erupt eurus eye eyer eyey hip hipe hiper hippish hipple hippus his hish hiss hist hler hsi ihi iphis isis issue issuer ist isurus kee keek keeker keel keeler keep keeper keld kele kelek kelep kelk kell kelly kelp kelper kep kepi kept ker kerel kern keup keuper key kyl kyle lee leek leeky leep leer lek leo leper leptus lepus ler leu ley lleu lue lull luller lulu lunn lunt lunule luo lupe lupis lupulus lupus lur lure lurer lush lushly lust lustrous lut lye nul null nun nupe nurture nurturer nut oer ore ort ouphish our oust out outpeep outpeer outpipe outpull outpush output outre outrun outrush outspell outspue outspurn outspurt outstrut outstunt outsulk outturn outusure oyer pee peek peel peele peeler peeoy peep peeper peepeye peer pele peleus pell peller pelu pep peplus pepper pepperer pepsis per pern pert pertussis peru perule perun peul phi pip pipe piper pipi pipistrel pipistrelle pipistrellus pipper pish piss pist plup plus plush ply plyer psi pst puerer pul pule puler pulk pull puller pulley pullus pulp pulper pulu puly pun punt pup puppis pur pure puree purely purer purr purre purree purrel purrer puru purupuru pus push puss pustule put putt puture ree reek reeker reeky reel reeler reeper rel rely reoutput rep repel repeller repipe reply repp reps reree rereel rerun reuel roe roer roey roue rouelle roun roup rouper roust rout roy rue ruelle ruer rule ruler rull ruller run runt rupee rupert rupture ruru rus rush russ rust rustre rut shi shih ship shipper shish shlu sip sipe siper sipper sis sish sisi siss sissu sist sistrurus speel speer spelk spell speller splurt spun spur spurn spurrer spurt sput ssi ssu stre stree streek streel streeler streep streke streperous strepsis strey stroup stroy stroyer strue strunt strut stu stue stull stuller stun stunt stupe stupeous stupp sturnus sturt stuss stut sue suer suerre suld sulk sulker sulky sull sully sulu sun sunn sunt sunup sup supe super superoutput supper supple supplely supply sur sure surely surrey sus susi susu susurr susurrous susurrus sutu suture suu tree treey trek trekker trey troupe trouper trout troy true truer trull truller truly trun trush truss trust tshi tst tsun tsutsutsi tue tule tulle tulu tun tunu tup tupek tupi tur turn turnup turr turus tush tussis tussur tut tuts tutu tutulus ule ull uller ulu ululu unreel unrule unruly unrun unrust untrue untruly untruss untrust unturn unurn upper upperer uppish uppishly uppull uppush upspurt upsun upsup uptree uptruss upturn ure urn uro uru urus urushi ush ust usun usure usurer utu yee yeel yeld yelk yell yeller yelp yelper yeo yep yer yere yern yoe yor yore you youl youp your yourn yoy

Nota: usé el diccionario y la matriz de caracteres al comienzo de este hilo. El código se ejecutó en mi MacBookPro, a continuación hay información sobre la máquina.

Nombre del modelo: MacBook Pro
Identificador del modelo: MacBookPro8,1
Nombre del procesador: Intel Core i5
Velocidad del procesador: 2.3 GHz
Número de procesadores: 1
Número total de núcleos: 2
L2 Caché (por núcleo): 256 KB
L3 Cache: 3 MB
Memoria: 4
Versión de ROM de arranque GB : MBP81.0047.B0E
Versión SMC (sistema): 1.68f96

Así que quise agregar otra forma de PHP para resolver esto, ya que todos aman PHP. Hay un poco de refactorización que me gustaría hacer, como usar una coincidencia de expresión regular contra el archivo del diccionario, pero ahora mismo estoy cargando todo el archivo del diccionario en una lista de palabras.

Hice esto usando una idea de lista enlazada. Cada nodo tiene un valor de carácter, un valor de ubicación y un puntero siguiente.

El valor de ubicación es cómo descubrí si dos nodos están conectados.

1 2 3 4 11 12 13 14 21 22 23 24 31 32 33 34

Entonces, al usar esa cuadrícula, sé que dos nodos están conectados si la ubicación del primer nodo es igual a la ubicación de los segundos nodos +/- 1 para la misma fila, +/- 9, 10, 11 para la fila arriba y abajo.

Uso la recursión para la búsqueda principal. Quita una palabra de la lista de palabras, encuentra todos los puntos de inicio posibles y luego recursivamente encuentra la siguiente conexión posible, teniendo en cuenta que no puede ir a una ubicación que ya está usando (por eso agrego $ notInLoc).

De todos modos, sé que necesita algo de refactorización, y me encantaría escuchar sus pensamientos sobre cómo hacerlo más limpio, pero produce los resultados correctos según el archivo de diccionario que estoy usando. Dependiendo de la cantidad de vocales y combinaciones en el tablero, toma alrededor de 3 a 6 segundos. Sé que una vez que pregelmare los resultados del diccionario, eso se reducirá significativamente.

<?php ini_set(''xdebug.var_display_max_depth'', 20); ini_set(''xdebug.var_display_max_children'', 1024); ini_set(''xdebug.var_display_max_data'', 1024); class Node { var $loc; function __construct($value) { $this->value = $value; $next = null; } } class Boggle { var $root; var $locList = array (1, 2, 3, 4, 11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34); var $wordList = []; var $foundWords = []; function __construct($board) { // Takes in a board string and creates all the nodes $node = new Node($board[0]); $node->loc = $this->locList[0]; $this->root = $node; for ($i = 1; $i < strlen($board); $i++) { $node->next = new Node($board[$i]); $node->next->loc = $this->locList[$i]; $node = $node->next; } // Load in a dictionary file // Use regexp to elimate all the words that could never appear and load the // rest of the words into wordList $handle = fopen("dict.txt", "r"); if ($handle) { while (($line = fgets($handle)) !== false) { // process the line read. $line = trim($line); if (strlen($line) > 2) { $this->wordList[] = trim($line); } } fclose($handle); } else { // error opening the file. echo "Problem with the file."; } } function isConnected($node1, $node2) { // Determines if 2 nodes are connected on the boggle board return (($node1->loc == $node2->loc + 1) || ($node1->loc == $node2->loc - 1) || ($node1->loc == $node2->loc - 9) || ($node1->loc == $node2->loc - 10) || ($node1->loc == $node2->loc - 11) || ($node1->loc == $node2->loc + 9) || ($node1->loc == $node2->loc + 10) || ($node1->loc == $node2->loc + 11)) ? true : false; } function find($value, $notInLoc = []) { // Returns a node with the value that isn''t in a location $current = $this->root; while($current) { if ($current->value == $value && !in_array($current->loc, $notInLoc)) { return $current; } if (isset($current->next)) { $current = $current->next; } else { break; } } return false; } function findAll($value) { // Returns an array of nodes with a specific value $current = $this->root; $foundNodes = []; while ($current) { if ($current->value == $value) { $foundNodes[] = $current; } if (isset($current->next)) { $current = $current->next; } else { break; } } return (empty($foundNodes)) ? false : $foundNodes; } function findAllConnectedTo($node, $value, $notInLoc = []) { // Returns an array of nodes that are connected to a specific node and // contain a specific value and are not in a certain location $nodeList = $this->findAll($value); $newList = []; if ($nodeList) { foreach ($nodeList as $node2) { if (!in_array($node2->loc, $notInLoc) && $this->isConnected($node, $node2)) { $newList[] = $node2; } } } return (empty($newList)) ? false : $newList; } function inner($word, $list, $i = 0, $notInLoc = []) { $i++; foreach($list as $node) { $notInLoc[] = $node->loc; if ($list2 = $this->findAllConnectedTo($node, $word[$i], $notInLoc)) { if ($i == (strlen($word) - 1)) { return true; } else { return $this->inner($word, $list2, $i, $notInLoc); } } } return false; } function findWord($word) { if ($list = $this->findAll($word[0])) { return $this->inner($word, $list); } return false; } function findAllWords() { foreach($this->wordList as $word) { if ($this->findWord($word)) { $this->foundWords[] = $word; } } } function displayBoard() { $current = $this->root; for ($i=0; $i < 4; $i++) { echo $current->value . " " . $current->next->value . " " . $current->next->next->value . " " . $current->next->next->next->value . " "; if ($i < 3) { $current = $current->next->next->next->next; } } } } function randomBoardString() { return substr(str_shuffle(str_repeat("abcdefghijklmnopqrstuvwxyz", 16)), 0, 16); } $myBoggle = new Boggle(randomBoardString()); $myBoggle->displayBoard(); $x = microtime(true); $myBoggle->findAllWords(); $y = microtime(true); echo ($y-$x); var_dump($myBoggle->foundWords); ?>

Creo que probablemente pasará la mayor parte de su tiempo intentando hacer coincidir palabras que su cuadrícula de letras no puede construir. Entonces, lo primero que haría sería tratar de acelerar ese paso y eso debería hacer que llegues a la mayor parte del camino.

Para esto, reexpresaría la cuadrícula como una tabla de posibles "movimientos" que indexas por la letra de transición que estás viendo.

Comience asignando a cada letra un número de su alfabeto completo (A = 0, B = 1, C = 2, ... y así sucesivamente).

Tomemos este ejemplo:

h b c d e e g h l l k l m o f p

Y por ahora, usemos el alfabeto de las letras que tenemos (por lo general, probablemente querrá usar el mismo alfabeto completo cada vez):

b | c | d | e | f | g | h | k | l | m | o | p ---+---+---+---+---+---+---+---+---+---+----+---- 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11

Luego crea una matriz booleana 2D que le dice si tiene una cierta transición de letras disponible:

Ahora ve a través de tu lista de palabras y convierte las palabras a transiciones:

hello (6, 3, 8, 8, 10): 6 -> 3, 3 -> 8, 8 -> 8, 8 -> 10

Luego, verifique si se permiten estas transiciones buscándolas en su tabla:

[6][ 3] : T [3][ 8] : T [8][ 8] : T [8][10] : T

Si todos están permitidos, existe la posibilidad de que se encuentre esta palabra.

Por ejemplo, la palabra "casco" se puede descartar en la cuarta transición (m to e: helMEt), ya que esa entrada en su tabla es falsa.

Y la palabra hámster se puede descartar, ya que la primera transición (h a) no está permitida (ni siquiera existe en su tabla).

Ahora, para las probablemente muy pocas palabras restantes que no eliminó, trate de encontrarlas en la cuadrícula de la forma en que lo está haciendo ahora o como se sugiere en algunas de las otras respuestas aquí. Esto es para evitar falsos positivos que resulten de saltos entre letras idénticas en su cuadrícula. Por ejemplo, la palabra "ayuda" está permitida por la tabla, pero no por la cuadrícula.

Algunos consejos adicionales para mejorar el rendimiento de esta idea:

En lugar de usar una matriz 2D, use una matriz 1D y simplemente calcule usted mismo el índice de la segunda letra. Entonces, en lugar de una matriz de 12x12 como la anterior, haga una matriz 1D de longitud 144. Si siempre usa el mismo alfabeto (es decir, una matriz de 26x26 = 676x1 para el alfabeto inglés estándar), incluso si no aparecen todas las letras en su cuadrícula , puede pre-calcular los índices en esta matriz 1D que necesita probar para que coincida con sus palabras del diccionario. Por ejemplo, los índices para ''hola'' en el ejemplo anterior serían
hello (6, 3, 8, 8, 10): 42 (from 6 + 3x12), 99, 104, 128 -> "hello" will be stored as 42, 99, 104, 128 in the dictionary
Extienda la idea a una tabla 3D (expresada como una matriz 1D), es decir, todas las combinaciones permitidas de 3 letras. De esa manera, puede eliminar incluso más palabras inmediatamente y reducir el número de búsquedas de matrices para cada palabra en 1: Para "hola", solo necesita 3 búsquedas de matrices: hel, ell, llo. Por cierto, será muy rápido construir esta tabla, ya que solo hay 400 movimientos de 3 letras posibles en tu cuadrícula.
Calcule previamente los índices de los movimientos en su cuadrícula que necesita incluir en su tabla. Para el ejemplo anterior, necesita establecer las siguientes entradas en ''Verdadero'':
(0,0) (0,1) -> here: h, b : [6][0] (0,0) (1,0) -> here: h, e : [6][3] (0,0) (1,1) -> here: h, e : [6][3] (0,1) (0,0) -> here: b, h : [0][6] (0,1) (0,2) -> here: b, c : [0][1] . :
También represente su cuadrícula de juego en una matriz 1-D con 16 entradas y haga que la tabla esté precalculada en 3. contenga los índices en esta matriz.

Estoy seguro de que si utiliza este enfoque, puede hacer que su código se ejecute de forma increíblemente rápida, si tiene el diccionario previamente calculado y ya cargado en la memoria.

BTW: Otra cosa buena que hacer, si estás creando un juego, es ejecutar este tipo de cosas inmediatamente en segundo plano. Comience a generar y resolver el primer juego mientras el usuario sigue mirando la pantalla de título de su aplicación y coloque su dedo en posición para presionar "Jugar". Luego genera y resuelve el siguiente juego mientras el usuario juega el anterior. Eso debería darte mucho tiempo para ejecutar tu código.

(Me gusta este problema, así que probablemente me sienta tentado a implementar mi propuesta en Java en algún momento de los próximos días para ver cómo se desarrollaría realmente ... Publicaré el código aquí una vez que lo haga).

ACTUALIZAR:

Ok, tuve algo de tiempo hoy e implementé esta idea en Java:

class DictionaryEntry { public int[] letters; public int[] triplets; } class BoggleSolver { // Constants final int ALPHABET_SIZE = 5; // up to 2^5 = 32 letters final int BOARD_SIZE = 4; // 4x4 board final int[] moves = {-BOARD_SIZE-1, -BOARD_SIZE, -BOARD_SIZE+1, -1, +1, +BOARD_SIZE-1, +BOARD_SIZE, +BOARD_SIZE+1}; // Technically constant (calculated here for flexibility, but should be fixed) DictionaryEntry[] dictionary; // Processed word list int maxWordLength = 0; int[] boardTripletIndices; // List of all 3-letter moves in board coordinates DictionaryEntry[] buildDictionary(String fileName) throws IOException { BufferedReader fileReader = new BufferedReader(new FileReader(fileName)); String word = fileReader.readLine(); ArrayList<DictionaryEntry> result = new ArrayList<DictionaryEntry>(); while (word!=null) { if (word.length()>=3) { word = word.toUpperCase(); if (word.length()>maxWordLength) maxWordLength = word.length(); DictionaryEntry entry = new DictionaryEntry(); entry.letters = new int[word.length() ]; entry.triplets = new int[word.length()-2]; int i=0; for (char letter: word.toCharArray()) { entry.letters[i] = (byte) letter - 65; // Convert ASCII to 0..25 if (i>=2) entry.triplets[i-2] = (((entry.letters[i-2] << ALPHABET_SIZE) + entry.letters[i-1]) << ALPHABET_SIZE) + entry.letters[i]; i++; } result.add(entry); } word = fileReader.readLine(); } return result.toArray(new DictionaryEntry[result.size()]); } boolean isWrap(int a, int b) { // Checks if move a->b wraps board edge (like 3->4) return Math.abs(a%BOARD_SIZE-b%BOARD_SIZE)>1; } int[] buildTripletIndices() { ArrayList<Integer> result = new ArrayList<Integer>(); for (int a=0; a<BOARD_SIZE*BOARD_SIZE; a++) for (int bm: moves) { int b=a+bm; if ((b>=0) && (b<board.length) && !isWrap(a, b)) for (int cm: moves) { int c=b+cm; if ((c>=0) && (c<board.length) && (c!=a) && !isWrap(b, c)) { result.add(a); result.add(b); result.add(c); } } } int[] result2 = new int[result.size()]; int i=0; for (Integer r: result) result2[i++] = r; return result2; } // Variables that depend on the actual game layout int[] board = new int[BOARD_SIZE*BOARD_SIZE]; // Letters in board boolean[] possibleTriplets = new boolean[1 << (ALPHABET_SIZE*3)]; DictionaryEntry[] candidateWords; int candidateCount; int[] usedBoardPositions; DictionaryEntry[] foundWords; int foundCount; void initializeBoard(String[] letters) { for (int row=0; row<BOARD_SIZE; row++) for (int col=0; col<BOARD_SIZE; col++) board[row*BOARD_SIZE + col] = (byte) letters[row].charAt(col) - 65; } void setPossibleTriplets() { Arrays.fill(possibleTriplets, false); // Reset list int i=0; while (i<boardTripletIndices.length) { int triplet = (((board[boardTripletIndices[i++]] << ALPHABET_SIZE) + board[boardTripletIndices[i++]]) << ALPHABET_SIZE) + board[boardTripletIndices[i++]]; possibleTriplets[triplet] = true; } } void checkWordTriplets() { candidateCount = 0; for (DictionaryEntry entry: dictionary) { boolean ok = true; int len = entry.triplets.length; for (int t=0; (t<len) && ok; t++) ok = possibleTriplets[entry.triplets[t]]; if (ok) candidateWords[candidateCount++] = entry; } } void checkWords() { // Can probably be optimized a lot foundCount = 0; for (int i=0; i<candidateCount; i++) { DictionaryEntry candidate = candidateWords[i]; for (int j=0; j<board.length; j++) if (board[j]==candidate.letters[0]) { usedBoardPositions[0] = j; if (checkNextLetters(candidate, 1, j)) { foundWords[foundCount++] = candidate; break; } } } } boolean checkNextLetters(DictionaryEntry candidate, int letter, int pos) { if (letter==candidate.letters.length) return true; int match = candidate.letters[letter]; for (int move: moves) { int next=pos+move; if ((next>=0) && (next<board.length) && (board[next]==match) && !isWrap(pos, next)) { boolean ok = true; for (int i=0; (i<letter) && ok; i++) ok = usedBoardPositions[i]!=next; if (ok) { usedBoardPositions[letter] = next; if (checkNextLetters(candidate, letter+1, next)) return true; } } } return false; } // Just some helper functions String formatTime(long start, long end, long repetitions) { long time = (end-start)/repetitions; return time/1000000 + "." + (time/100000) % 10 + "" + (time/10000) % 10 + "ms"; } String getWord(DictionaryEntry entry) { char[] result = new char[entry.letters.length]; int i=0; for (int letter: entry.letters) result[i++] = (char) (letter+97); return new String(result); } void run() throws IOException { long start = System.nanoTime(); // The following can be pre-computed and should be replaced by constants dictionary = buildDictionary("C:/TWL06.txt"); boardTripletIndices = buildTripletIndices(); long precomputed = System.nanoTime(); // The following only needs to run once at the beginning of the program candidateWords = new DictionaryEntry[dictionary.length]; // WAAAY too generous foundWords = new DictionaryEntry[dictionary.length]; // WAAAY too generous usedBoardPositions = new int[maxWordLength]; long initialized = System.nanoTime(); for (int n=1; n<=100; n++) { // The following needs to run again for every new board initializeBoard(new String[] {"DGHI", "KLPS", "YEUT", "EORN"}); setPossibleTriplets(); checkWordTriplets(); checkWords(); } long solved = System.nanoTime(); // Print out result and statistics System.out.println("Precomputation finished in " + formatTime(start, precomputed, 1)+":"); System.out.println(" Words in the dictionary: "+dictionary.length); System.out.println(" Longest word: "+maxWordLength+" letters"); System.out.println(" Number of triplet-moves: "+boardTripletIndices.length/3); System.out.println(); System.out.println("Initialization finished in " + formatTime(precomputed, initialized, 1)); System.out.println(); System.out.println("Board solved in "+formatTime(initialized, solved, 100)+":"); System.out.println(" Number of candidates: "+candidateCount); System.out.println(" Number of actual words: "+foundCount); System.out.println(); System.out.println("Words found:"); int w=0; System.out.print(" "); for (int i=0; i<foundCount; i++) { System.out.print(getWord(foundWords[i])); w++; if (w==10) { w=0; System.out.println(); System.out.print(" "); } else if (i<foundCount-1) System.out.print(", "); } System.out.println(); } public static void main(String[] args) throws IOException { new BoggleSolver().run(); } }

Aquí hay algunos resultados:

Para la cuadrícula de la imagen publicada en la pregunta original (DGHI ...):

Precomputation finished in 239.59ms: Words in the dictionary: 178590 Longest word: 15 letters Number of triplet-moves: 408 Initialization finished in 0.22ms Board solved in 3.70ms: Number of candidates: 230 Number of actual words: 163 Words found: eek, eel, eely, eld, elhi, elk, ern, erupt, erupts, euro eye, eyer, ghi, ghis, glee, gley, glue, gluer, gluey, glut gluts, hip, hiply, hips, his, hist, kelp, kelps, kep, kepi kepis, keps, kept, kern, key, kye, lee, lek, lept, leu ley, lunt, lunts, lure, lush, lust, lustre, lye, nus, nut nuts, ore, ort, orts, ouph, ouphs, our, oust, out, outre outs, oyer, pee, per, pert, phi, phis, pis, pish, plus plush, ply, plyer, psi, pst, pul, pule, puler, pun, punt punts, pur, pure, puree, purely, pus, push, put, puts, ree rely, rep, reply, reps, roe, roue, roup, roups, roust, rout routs, rue, rule, ruly, run, runt, runts, rupee, rush, rust rut, ruts, ship, shlep, sip, sipe, spue, spun, spur, spurn spurt, strep, stroy, stun, stupe, sue, suer, sulk, sulker, sulky sun, sup, supe, super, sure, surely, tree, trek, trey, troupe troy, true, truly, tule, tun, tup, tups, turn, tush, ups urn, uts, yeld, yelk, yelp, yelps, yep, yeps, yore, you your, yourn, yous

Para las cartas publicadas como ejemplo en la pregunta original (FXIE ...)

Precomputation finished in 239.68ms: Words in the dictionary: 178590 Longest word: 15 letters Number of triplet-moves: 408 Initialization finished in 0.21ms Board solved in 3.69ms: Number of candidates: 87 Number of actual words: 76 Words found: amble, ambo, ami, amie, asea, awa, awe, awes, awl, axil axile, axle, boil, bole, box, but, buts, east, elm, emboli fame, fames, fax, lei, lie, lima, limb, limbo, limbs, lime limes, lob, lobs, lox, mae, maes, maw, maws, max, maxi mesa, mew, mewl, mews, mil, mile, milo, mix, oil, ole sae, saw, sea, seam, semi, sew, stub, swam, swami, tub tubs, tux, twa, twae, twaes, twas, uts, wae, waes, wamble wame, wames, was, wast, wax, west

Para la siguiente cuadrícula 5x5:

R P R I T A H H L N I E T E P Z R Y S G O G W E Y

da esto:

Precomputation finished in 240.39ms: Words in the dictionary: 178590 Longest word: 15 letters Number of triplet-moves: 768 Initialization finished in 0.23ms Board solved in 3.85ms: Number of candidates: 331 Number of actual words: 240 Words found: aero, aery, ahi, air, airt, airth, airts, airy, ear, egest elhi, elint, erg, ergo, ester, eth, ether, eye, eyen, eyer eyes, eyre, eyrie, gel, gelt, gelts, gen, gent, gentil, gest geste, get, gets, gey, gor, gore, gory, grey, greyest, greys gyre, gyri, gyro, hae, haet, haets, hair, hairy, hap, harp heap, hear, heh, heir, help, helps, hen, hent, hep, her hero, hes, hest, het, hetero, heth, hets, hey, hie, hilt hilts, hin, hint, hire, hit, inlet, inlets, ire, leg, leges legs, lehr, lent, les, lest, let, lethe, lets, ley, leys lin, line, lines, liney, lint, lit, neg, negs, nest, nester net, nether, nets, nil, nit, ogre, ore, orgy, ort, orts pah, pair, par, peg, pegs, peh, pelt, pelter, peltry, pelts pen, pent, pes, pest, pester, pesty, pet, peter, pets, phi philter, philtre, phiz, pht, print, pst, rah, rai, rap, raphe raphes, reap, rear, rei, ret, rete, rets, rhaphe, rhaphes, rhea ria, rile, riles, riley, rin, rye, ryes, seg, sel, sen sent, senti, set, sew, spelt, spelter, spent, splent, spline, splint split, stent, step, stey, stria, striae, sty, stye, tea, tear teg, tegs, tel, ten, tent, thae, the, their, then, these thesp, they, thin, thine, thir, thirl, til, tile, tiles, tilt tilter, tilth, tilts, tin, tine, tines, tirl, trey, treys, trog try, tye, tyer, tyes, tyre, tyro, west, wester, wry, wryest wye, wyes, wyte, wytes, yea, yeah, year, yeh, yelp, yelps yen, yep, yeps, yes, yester, yet, yew, yews, zero, zori

Para esto utilicé la Lista de palabras del Scrabble del Torneo TWL06 , ya que el enlace en la pregunta original ya no funciona. Este archivo es de 1.85MB, por lo que es un poco más corto. Y la buildDictionaryfunción arroja todas las palabras con menos de 3 letras.

Aquí hay un par de observaciones sobre el desempeño de esto:

Es aproximadamente 10 veces más lento que el rendimiento reportado de la implementación OCaml de Victor Nicollet. Si esto es causado por el algoritmo diferente, el diccionario más corto que utilizó, el hecho de que su código se compila y el mío se ejecuta en una máquina virtual Java, o el rendimiento de nuestras computadoras (el mío es un Intel Q6600 @ 2.4MHz con WinXP), No lo sé. Pero es mucho más rápido que los resultados para las otras implementaciones citadas al final de la pregunta original. Por lo tanto, si este algoritmo es superior al diccionario trie o no, no lo sé en este momento.
El método de tabla utilizado en checkWordTriplets()produce una muy buena aproximación a las respuestas reales. Solo 1 de cada 3-5 palabras aprobadas pasará la checkWords()prueba (vea el número de candidatos frente al número de palabras reales arriba).
Algo que no puede ver arriba: la checkWordTriplets()función toma aproximadamente 3,65 ms y, por lo tanto, es totalmente dominante en el proceso de búsqueda. La checkWords()función ocupa más o menos los 0.05-0.20 ms restantes.
El tiempo de ejecución de la checkWordTriplets()función depende linealmente del tamaño del diccionario y es virtualmente independiente del tamaño del tablero.
El tiempo de ejecución de checkWords()depende del tamaño del tablero y el número de palabras no descartadas por checkWordTriplets().
La checkWords()implementación anterior es la primera versión más tonta que se me ocurrió. Básicamente no está optimizado en absoluto. Pero en comparación con checkWordTriplets()esto, es irrelevante para el rendimiento total de la aplicación, así que no me preocupé por eso. Pero , si el tamaño de la placa aumenta, esta función se volverá más lenta y eventualmente empezará a importar. Entonces, tendría que ser optimizado también.
Una cosa buena de este código es su flexibilidad:
- Puede cambiar fácilmente el tamaño de la placa: Actualice la línea 10 y la matriz de cadenas a la que se pasó initializeBoard().
- Puede admitir alfabetos más grandes / diferentes y puede manejar cosas como tratar ''Qu'' como una sola letra sin sobrecarga de rendimiento. Para hacer esto, uno tendría que actualizar la línea 9 y el par de lugares donde los caracteres se convierten en números (actualmente, simplemente restando 65 del valor ASCII)

Ok, pero creo que a esta altura esta publicación es lo suficientemente larga. Definitivamente puedo responder cualquier pregunta que pueda tener, pero pasemos eso a los comentarios.

Divertidísimo. ¡Casi publiqué la misma pregunta hace unos días debido al mismo maldito juego! Sin embargo, no lo hice porque solo busqué en Google por Python solucionador de boggle y obtuve todas las respuestas que podía desear.

Escribí mi solucionador en C ++. Implementé una estructura de árbol personalizada. No estoy seguro de que pueda considerarse un trío, pero es similar. Cada nodo tiene 26 ramas, 1 por cada letra del alfabeto. Atravieso las ramas de la tabla de boggle en paralelo con las ramas de mi diccionario. Si la rama no existe en el diccionario, dejo de buscarla en el tablero de Boggle. Convierto todas las letras del tablero a ints. Entonces ''A'' = 0. Dado que solo son matrices, la búsqueda siempre es O (1). Cada nodo almacena si completa una palabra y cuántas palabras existen en sus hijos. El árbol se poda a medida que se encuentra que las palabras reducen la búsqueda repetida de las mismas palabras. Creo que la poda también es O (1).

CPU: Pentium SU2700 1.3GHz
RAM: 3gb

Carga el diccionario de 178,590 palabras en <1 segundo.
Resuelve 100x100 Boggle (boggle.txt) en 4 segundos. ~ 44,000 palabras encontradas.
Resolver un Boggle 4x4 es demasiado rápido para proporcionar un punto de referencia significativo. :)

Fast Boggle Solver GitHub Repo

He implementado una solución en OCaml . Pre-compila un diccionario como un trie, y utiliza frecuencias de secuencia de dos letras para eliminar los bordes que nunca podrían aparecer en una palabra para acelerar aún más el procesamiento.

Resuelve su ejemplo de placa en 0,35 ms (con un tiempo de inicio adicional de 6 ms que se relaciona principalmente con la carga del trie en la memoria).

Las soluciones encontradas:

["swami"; "emile"; "limbs"; "limbo"; "limes"; "amble"; "tubs"; "stub"; "swam"; "semi"; "seam"; "awes"; "buts"; "bole"; "boil"; "west"; "east"; "emil"; "lobs"; "limb"; "lime"; "lima"; "mesa"; "mews"; "mewl"; "maws"; "milo"; "mile"; "awes"; "amie"; "axle"; "elma"; "fame"; "ubs"; "tux"; "tub"; "twa"; "twa"; "stu"; "saw"; "sea"; "sew"; "sea"; "awe"; "awl"; "but"; "btu"; "box"; "bmw"; "was"; "wax"; "oil"; "lox"; "lob"; "leo"; "lei"; "lie"; "mes"; "mew"; "mae"; "maw"; "max"; "mil"; "mix"; "awe"; "awl"; "elm"; "eli"; "fax"]

He resuelto esto en C #, utilizando un algoritmo DFA. Puedes ver mi código en

https://github.com/attilabicsko/wordshuffler/

Además de encontrar palabras en una matriz, mi algoritmo guarda las rutas reales de las palabras, por lo que para diseñar un juego de búsqueda de palabras, puede verificar si hay una palabra en una ruta real.

Primero, lea cómo uno de los diseñadores de lenguaje C # resolvió un problema relacionado: http://blogs.msdn.com/ericlippert/archive/2009/02/04/a-nasality-talisman-for-the-sultana-analyst.aspx .

Al igual que él, puede comenzar con un diccionario y las palabras canonacalize creando un diccionario a partir de una serie de letras ordenadas alfabéticamente en una lista de palabras que se pueden deletrear a partir de esas letras.

A continuación, comience a crear las palabras posibles del tablero y búsquelos. Sospecho que eso te llevará bastante lejos, pero ciertamente hay más trucos que podrían acelerar las cosas.

Resolví esto en c. Tarda unos 48 ms en ejecutarse en mi máquina (con un 98% del tiempo dedicado a cargar el diccionario desde el disco y crear el trie) El diccionario es / usr / share / dict / american-english que tiene 62886 palabras.

Código fuente

Resolví esto perfectamente y muy rápido. Lo puse en una aplicación de Android. Vea el video en el enlace de Play Store para verlo en acción.

Word Cheats es una aplicación que "agrieta" cualquier juego de palabras de estilo matricial. Esta aplicación fue construida para ayudarme a hacer trampa en el codificador de palabras. ¡Se puede utilizar para búsquedas de palabras, ruzzle, palabras, buscador de palabras, crack de palabras, boggle, y más!

Se puede ver aquí https://play.google.com/store/apps/details?id=com.harris.wordcracker

Vea la aplicación en acción en el video https://www.youtube.com/watch?v=DL2974WmNAI

Resolví esto también, con Java. Mi implementación es de 269 líneas y es bastante fácil de usar. Primero debe crear una nueva instancia de la clase Boggler y luego llamar a la función de resolución con la cuadrícula como parámetro. Se requieren aproximadamente 100 ms para cargar el diccionario de 50 000 palabras en mi computadora y las encuentra en aproximadamente 10 a 20 ms. Las palabras encontradas se almacenan en un ArrayList, foundWords.

import java.io.BufferedReader; import java.io.File; import java.io.FileInputStream; import java.io.FileNotFoundException; import java.io.IOException; import java.io.InputStreamReader; import java.net.URISyntaxException; import java.net.URL; import java.util.ArrayList; import java.util.Arrays; import java.util.Comparator; public class Boggler { private ArrayList<String> words = new ArrayList<String>(); private ArrayList<String> roundWords = new ArrayList<String>(); private ArrayList<Word> foundWords = new ArrayList<Word>(); private char[][] letterGrid = new char[4][4]; private String letters; public Boggler() throws FileNotFoundException, IOException, URISyntaxException { long startTime = System.currentTimeMillis(); URL path = GUI.class.getResource("words.txt"); BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(new File(path.toURI()).getAbsolutePath()), "iso-8859-1")); String line; while((line = br.readLine()) != null) { if(line.length() < 3 || line.length() > 10) { continue; } this.words.add(line); } } public ArrayList<Word> getWords() { return this.foundWords; } public void solve(String letters) { this.letters = ""; this.foundWords = new ArrayList<Word>(); for(int i = 0; i < letters.length(); i++) { if(!this.letters.contains(letters.substring(i, i + 1))) { this.letters += letters.substring(i, i + 1); } } for(int i = 0; i < 4; i++) { for(int j = 0; j < 4; j++) { this.letterGrid[i][j] = letters.charAt(i * 4 + j); } } System.out.println(Arrays.deepToString(this.letterGrid)); this.roundWords = new ArrayList<String>(); String pattern = "[" + this.letters + "]+"; for(int i = 0; i < this.words.size(); i++) { if(this.words.get(i).matches(pattern)) { this.roundWords.add(this.words.get(i)); } } for(int i = 0; i < this.roundWords.size(); i++) { Word word = checkForWord(this.roundWords.get(i)); if(word != null) { System.out.println(word); this.foundWords.add(word); } } } private Word checkForWord(String word) { char initial = word.charAt(0); ArrayList<LetterCoord> startPoints = new ArrayList<LetterCoord>(); int x = 0; int y = 0; for(char[] row: this.letterGrid) { x = 0; for(char letter: row) { if(initial == letter) { startPoints.add(new LetterCoord(x, y)); } x++; } y++; } ArrayList<LetterCoord> letterCoords = null; for(int initialTry = 0; initialTry < startPoints.size(); initialTry++) { letterCoords = new ArrayList<LetterCoord>(); x = startPoints.get(initialTry).getX(); y = startPoints.get(initialTry).getY(); LetterCoord initialCoord = new LetterCoord(x, y); letterCoords.add(initialCoord); letterLoop: for(int letterIndex = 1; letterIndex < word.length(); letterIndex++) { LetterCoord lastCoord = letterCoords.get(letterCoords.size() - 1); char currentChar = word.charAt(letterIndex); ArrayList<LetterCoord> letterLocations = getNeighbours(currentChar, lastCoord.getX(), lastCoord.getY()); if(letterLocations == null) { return null; } for(int foundIndex = 0; foundIndex < letterLocations.size(); foundIndex++) { if(letterIndex != word.length() - 1 && true == false) { char nextChar = word.charAt(letterIndex + 1); int lastX = letterCoords.get(letterCoords.size() - 1).getX(); int lastY = letterCoords.get(letterCoords.size() - 1).getY(); ArrayList<LetterCoord> possibleIndex = getNeighbours(nextChar, lastX, lastY); if(possibleIndex != null) { if(!letterCoords.contains(letterLocations.get(foundIndex))) { letterCoords.add(letterLocations.get(foundIndex)); } continue letterLoop; } else { return null; } } else { if(!letterCoords.contains(letterLocations.get(foundIndex))) { letterCoords.add(letterLocations.get(foundIndex)); continue letterLoop; } } } } if(letterCoords != null) { if(letterCoords.size() == word.length()) { Word w = new Word(word); w.addList(letterCoords); return w; } else { return null; } } } if(letterCoords != null) { Word foundWord = new Word(word); foundWord.addList(letterCoords); return foundWord; } return null; } public ArrayList<LetterCoord> getNeighbours(char letterToSearch, int x, int y) { ArrayList<LetterCoord> neighbours = new ArrayList<LetterCoord>(); for(int _y = y - 1; _y <= y + 1; _y++) { for(int _x = x - 1; _x <= x + 1; _x++) { if(_x < 0 || _y < 0 || (_x == x && _y == y) || _y > 3 || _x > 3) { continue; } if(this.letterGrid[_y][_x] == letterToSearch && !neighbours.contains(new LetterCoord(_x, _y))) { neighbours.add(new LetterCoord(_x, _y)); } } } if(neighbours.isEmpty()) { return null; } else { return neighbours; } } } class Word { private String word; private ArrayList<LetterCoord> letterCoords = new ArrayList<LetterCoord>(); public Word(String word) { this.word = word; } public boolean addCoords(int x, int y) { LetterCoord lc = new LetterCoord(x, y); if(!this.letterCoords.contains(lc)) { this.letterCoords.add(lc); return true; } return false; } public void addList(ArrayList<LetterCoord> letterCoords) { this.letterCoords = letterCoords; } @Override public String toString() { String outputString = this.word + " "; for(int i = 0; i < letterCoords.size(); i++) { outputString += "(" + letterCoords.get(i).getX() + ", " + letterCoords.get(i).getY() + ") "; } return outputString; } public String getWord() { return this.word; } public ArrayList<LetterCoord> getList() { return this.letterCoords; } } class LetterCoord extends ArrayList { private int x; private int y; public LetterCoord(int x, int y) { this.x = x; this.y = y; } public int getX() { return this.x; } public int getY() { return this.y; } @Override public boolean equals(Object o) { if(!(o instanceof LetterCoord)) { return false; } LetterCoord lc = (LetterCoord) o; if(this.x == lc.getX() && this.y == lc.getY()) { return true; } return false; } @Override public int hashCode() { int hash = 7; hash = 29 * hash + this.x; hash = 24 * hash + this.y; return hash; } }

Sé que llego muy tarde pero hace un tiempo hice uno de estos en PHP , solo por diversión ...

http://www.lostsockdesign.com.au/sandbox/boggle/index.php?letters=fxieamloewbxastu Encontró 75 palabras (133 pts) en 0.90108 segundos

F.........X..I..............E............... A......................................M..............................L............................O............................... E....................W............................B..........................X A..................S..................................................T.................U....

Da una indicación de lo que realmente está haciendo el programa: cada letra es donde comienza a mirar a través de los patrones mientras que cada ''''. Muestra un camino que ha tratado de tomar. Cuanto más ''.'' Hay más lejos que ha buscado.

Déjame saber si quieres el código ... es una mezcla horrible de PHP y HTML que nunca tuvo la intención de ver la luz del día, así que no me atrevo a publicarlo aquí: P

Sólo por diversión, implementé uno en bash. No es super rápido, sino razonable.

http://dev.xkyle.com/bashboggle/

Sugiero hacer un árbol de letras basado en palabras. El árbol estaría compuesto por una estructura de letras, como esta:

letter: char isWord: boolean

Luego construyes el árbol, con cada profundidad agregando una nueva letra. En otras palabras, en el primer nivel estaría el alfabeto; luego, de cada uno de esos árboles, habría otras 26 entradas, y así sucesivamente, hasta que haya escrito todas las palabras. Aferrarse a este árbol analizado, hará que todas las respuestas posibles sean más rápidas para buscar.

Con este árbol analizado, puede encontrar soluciones muy rápidamente. Aquí está el pseudo-código:

BEGIN: For each letter: if the struct representing it on the current depth has isWord == true, enter it as an answer. Cycle through all its neighbors; if there is a child of the current node corresponding to the letter, recursively call BEGIN on it.

Esto podría acelerarse con un poco de programación dinámica. Por ejemplo, en su muestra, las dos ''A'' están junto a una ''E'' y una ''W'', que (desde el punto en que las golpean) serían idénticas. No tengo suficiente tiempo para deletrear realmente el código para esto, pero creo que puede reunir la idea.

Además, estoy seguro de que encontrará otras soluciones si busca en Google "Solver Boggle".

Tan pronto como vi la declaración del problema, pensé "Trie". Pero al ver que varios otros carteles utilizaban ese enfoque, busqué otro enfoque solo para ser diferente. Por desgracia, el enfoque Trie funciona mejor. Corrí la solución Perl de Kent en mi máquina y tardé 0.31 segundos en ejecutarse, después de adaptarla para usar mi archivo de diccionario. Mi propia implementación de perl requirió 0.54 segundos para ejecutarse.

Este fue mi enfoque:

Crea un hash de transición para modelar las transiciones legales.
Iterar a través de las 16 ^ 3 posibles combinaciones de tres letras.
- En el bucle, excluya las transiciones ilegales y repita las visitas al mismo cuadrado. Forme todas las secuencias legales de 3 letras y almacénelas en un hash.
Luego recorre todas las palabras en el diccionario.
- Excluye palabras que sean demasiado largas o cortas
- Deslice una ventana de 3 letras a través de cada palabra y vea si se encuentra entre los combos de 3 letras del paso 2. Excluya las palabras que fallan. Esto elimina la mayoría de los no partidos.
- Si aún no se elimina, use un algoritmo recursivo para ver si la palabra se puede formar haciendo caminos a través del rompecabezas. (Esta parte es lenta, pero se llama con poca frecuencia).
Imprime las palabras que encontré.
Intenté secuencias de 3 y 4 letras, pero las secuencias de 4 letras ralentizaron el programa.

En mi código, uso / usr / share / dict / words para mi diccionario. Viene estándar en MAC OS X y muchos sistemas Unix. Puedes usar otro archivo si quieres. Para romper un rompecabezas diferente, solo cambia la variable @puzzle. Esto sería fácil de adaptar para matrices más grandes. Solo necesitarías cambiar el hash% transitions y el hash% legalTransitions.

La fortaleza de esta solución es que el código es corto y las estructuras de datos simples.

Aquí está el código Perl (que usa demasiadas variables globales, lo sé):

#!/usr/bin/perl use Time::HiRes qw{ time }; sub readFile($); sub findAllPrefixes($); sub isWordTraceable($); sub findWordsInPuzzle(@); my $startTime = time; # Puzzle to solve my @puzzle = ( F, X, I, E, A, M, L, O, E, W, B, X, A, S, T, U ); my $minimumWordLength = 3; my $maximumPrefixLength = 3; # I tried four and it slowed down. # Slurp the word list. my $wordlistFile = "/usr/share/dict/words"; my @words = split(//n/, uc(readFile($wordlistFile))); print "Words loaded from word list: " . scalar @words . "/n"; print "Word file load time: " . (time - $startTime) . "/n"; my $postLoad = time; # Define the legal transitions from one letter position to another. # Positions are numbered 0-15. # 0 1 2 3 # 4 5 6 7 # 8 9 10 11 # 12 13 14 15 my %transitions = ( -1 => [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15], 0 => [1,4,5], 1 => [0,2,4,5,6], 2 => [1,3,5,6,7], 3 => [2,6,7], 4 => [0,1,5,8,9], 5 => [0,1,2,4,6,8,9,10], 6 => [1,2,3,5,7,9,10,11], 7 => [2,3,6,10,11], 8 => [4,5,9,12,13], 9 => [4,5,6,8,10,12,13,14], 10 => [5,6,7,9,11,13,14,15], 11 => [6,7,10,14,15], 12 => [8,9,13], 13 => [8,9,10,12,14], 14 => [9,10,11,13,15], 15 => [10,11,14] ); # Convert the transition matrix into a hash for easy access. my %legalTransitions = (); foreach my $start (keys %transitions) { my $legalRef = $transitions{$start}; foreach my $stop (@$legalRef) { my $index = ($start + 1) * (scalar @puzzle) + ($stop + 1); $legalTransitions{$index} = 1; } } my %prefixesInPuzzle = findAllPrefixes($maximumPrefixLength); print "Find prefixes time: " . (time - $postLoad) . "/n"; my $postPrefix = time; my @wordsFoundInPuzzle = findWordsInPuzzle(@words); print "Find words in puzzle time: " . (time - $postPrefix) . "/n"; print "Unique prefixes found: " . (scalar keys %prefixesInPuzzle) . "/n"; print "Words found (" . (scalar @wordsFoundInPuzzle) . ") :/n " . join("/n ", @wordsFoundInPuzzle) . "/n"; print "Total Elapsed time: " . (time - $startTime) . "/n"; ########################################### sub readFile($) { my ($filename) = @_; my $contents; if (-e $filename) { # This is magic: it opens and reads a file into a scalar in one line of code. # See http://www.perl.com/pub/a/2003/11/21/slurp.html $contents = do { local( @ARGV, $/ ) = $filename ; <> } ; } else { $contents = ''''; } return $contents; } # Is it legal to move from the first position to the second? They must be adjacent. sub isLegalTransition($$) { my ($pos1,$pos2) = @_; my $index = ($pos1 + 1) * (scalar @puzzle) + ($pos2 + 1); return $legalTransitions{$index}; } # Find all prefixes where $minimumWordLength <= length <= $maxPrefixLength # # $maxPrefixLength ... Maximum length of prefix we will store. Three gives best performance. sub findAllPrefixes($) { my ($maxPrefixLength) = @_; my %prefixes = (); my $puzzleSize = scalar @puzzle; # Every possible N-letter combination of the letters in the puzzle # can be represented as an integer, though many of those combinations # involve illegal transitions, duplicated letters, etc. # Iterate through all those possibilities and eliminate the illegal ones. my $maxIndex = $puzzleSize ** $maxPrefixLength; for (my $i = 0; $i < $maxIndex; $i++) { my @path; my $remainder = $i; my $prevPosition = -1; my $prefix = ''''; my %usedPositions = (); for (my $prefixLength = 1; $prefixLength <= $maxPrefixLength; $prefixLength++) { my $position = $remainder % $puzzleSize; # Is this a valid step? # a. Is the transition legal (to an adjacent square)? if (! isLegalTransition($prevPosition, $position)) { last; } # b. Have we repeated a square? if ($usedPositions{$position}) { last; } else { $usedPositions{$position} = 1; } # Record this prefix if length >= $minimumWordLength. $prefix .= $puzzle[$position]; if ($prefixLength >= $minimumWordLength) { $prefixes{$prefix} = 1; } push @path, $position; $remainder -= $position; $remainder /= $puzzleSize; $prevPosition = $position; } # end inner for } # end outer for return %prefixes; } # Loop through all words in dictionary, looking for ones that are in the puzzle. sub findWordsInPuzzle(@) { my @allWords = @_; my @wordsFound = (); my $puzzleSize = scalar @puzzle; WORD: foreach my $word (@allWords) { my $wordLength = length($word); if ($wordLength > $puzzleSize || $wordLength < $minimumWordLength) { # Reject word as too short or too long. } elsif ($wordLength <= $maximumPrefixLength ) { # Word should be in the prefix hash. if ($prefixesInPuzzle{$word}) { push @wordsFound, $word; } } else { # Scan through the word using a window of length $maximumPrefixLength, looking for any strings not in our prefix list. # If any are found that are not in the list, this word is not possible. # If no non-matches are found, we have more work to do. my $limit = $wordLength - $maximumPrefixLength + 1; for (my $startIndex = 0; $startIndex < $limit; $startIndex ++) { if (! $prefixesInPuzzle{substr($word, $startIndex, $maximumPrefixLength)}) { next WORD; } } if (isWordTraceable($word)) { # Additional test necessary: see if we can form this word by following legal transitions push @wordsFound, $word; } } } return @wordsFound; } # Is it possible to trace out the word using only legal transitions? sub isWordTraceable($) { my $word = shift; return traverse([split(//, $word)], [-1]); # Start at special square -1, which may transition to any square in the puzzle. } # Recursively look for a path through the puzzle that matches the word. sub traverse($$) { my ($lettersRef, $pathRef) = @_; my $index = scalar @$pathRef - 1; my $position = $pathRef->[$index]; my $letter = $lettersRef->[$index]; my $branchesRef = $transitions{$position}; BRANCH: foreach my $branch (@$branchesRef) { if ($puzzle[$branch] eq $letter) { # Have we used this position yet? foreach my $usedBranch (@$pathRef) { if ($usedBranch == $branch) { next BRANCH; } } if (scalar @$lettersRef == $index + 1) { return 1; # End of word and success. } push @$pathRef, $branch; if (traverse($lettersRef, $pathRef)) { return 1; # Recursive success. } else { pop @$pathRef; } } } return 0; # No path found. Failed. }

Tendría que pensar más en una solución completa, pero como una optimización práctica, me pregunto si valdría la pena pre-calcular una tabla de frecuencias de digramas y trigramas (combinaciones de 2 y 3 letras) basada en todos los palabras de su diccionario, y use esto para priorizar su búsqueda. Me gustaría ir con las letras iniciales de las palabras. Entonces, si su diccionario contiene las palabras "India", "Agua", "Extremo" y "Extraordinario", entonces su tabla precalculada podría ser:

''IN'': 1 ''WA'': 1 ''EX'': 2

Luego busque estos digrams en el orden de los puntos comunes (primero EX, luego WA / IN)

Una solución de JavaScript Node.JS. Calcula las 100 palabras únicas en menos de un segundo, lo que incluye leer el archivo del diccionario (MBA 2012).

Salida:
["FAM", "TUX", "TUB", "FAE", "ELI", "ELM", "ELB", "TWA", "TWA", "SAW", "AMI", "SWA", " SWA "," AME "," SEA "," SEW "," AES "," AWL "," AWE "," SEA "," AWA "," MIX "," MIL "," AST "," ASE " , "MAX", "MAE", "MAW", "MEW", "AWE", "MES", "AWL", "LIE", "LIM", "AWA", "AES", "PERO", " BLO "," WAS "," WAE "," WEA "," LEI "," LEO "," LOB "," LOX "," WEM "," ACEITE "," OLM "," WEA "," WAE " , "CERA", "WAF", "MILO", "ESTE "," WAME "," TWAS "," TWAE "," EMIL "," WEAM "," OIME "," AXIL "," WEST "," TWAE "," LIMB "," WASE "," WAST " , "BLEO", "STUB", "BOIL", "BOLE", "LIME", "SAWT", "LIMA", "MESA", "MEWL", "EJE", "FAME", "ASEM", " MILLA "," AMIL "," SEAX "," SEAM "," SEMI "," SWAM "," AMBO "," AMLI "," AXILE "," AMBLE "," SWAMI "," AWEST "," AWEST " , "LIMAX", "LIMES", "LIMBU", "LIMBO", "EMBOX", "SEMBLE", "EMBOLE", "WAMBLE", "FAMBLE"]TWAE "," EMIL "," WEAM "," OIME "," AXIL "," WEST "," TWAE "," LIMB "," WASE "," WAST "," BLEO "," STUB "," BOIL " , "BOLE", "LIME", "SAWT", "LIMA", "MESA", "MEWL", "EJE", "FAME", "ASEM", "MILE", "AMIL", "SEAX", " SEAM "," SEMI "," SWAM "," AMBO "," AMLI "," AXILE "," AMBLE "," SWAMI "," AWEST "," AWEST "," LIMAX "," LIMES "," LIMBU " , "LIMBO", "EMBOX", "SEMBLE", "EMBOLE", "WAMBLE", "FAMBLE"]TWAE "," EMIL "," WEAM "," OIME "," AXIL "," WEST "," TWAE "," LIMB "," WASE "," WAST "," BLEO "," STUB "," BOIL " , "BOLE", "LIME", "SAWT", "LIMA", "MESA", "MEWL", "EJE", "FAME", "ASEM", "MILE", "AMIL", "SEAX", " SEAM "," SEMI "," SWAM "," AMBO "," AMLI "," AXILE "," AMBLE "," SWAMI "," AWEST "," AWEST "," LIMAX "," LIMES "," LIMBU " , "LIMBO", "EMBOX", "SEMBLE", "EMBOLE", "WAMBLE", "FAMBLE"]"TWAE", "LIMB", "WASE", "WAST", "BLEO", "STUB", "BOIL", "BOLE", "LIME", "SAWT", "LIMA", "MESA", "MEWL "," EJE "," FAMA "," ASEM "," MILLA "," AMIL "," SEAX "," SEAM "," SEMI "," SWAM "," AMBO "," AMLI "," AXILE ", "AMBLE", "SWAMI", "AWEST", "AWEST", "LIMAX", "LIMES", "LIMBU", "LIMBO", "EMBOX", "SEMBLE", "EMBOLE", "WAMBLE", "FAMBLE "]"TWAE", "LIMB", "WASE", "WAST", "BLEO", "STUB", "BOIL", "BOLE", "LIME", "SAWT", "LIMA", "MESA", "MEWL "," EJE "," FAMA "," ASEM "," MILLA "," AMIL "," SEAX "," SEAM "," SEMI "," SWAM "," AMBO "," AMLI "," AXILE ", "AMBLE", "SWAMI", "AWEST", "AWEST", "LIMAX", "LIMES", "LIMBU", "LIMBO", "EMBOX", "SEMBLE", "EMBOLE", "WAMBLE", "FAMBLE "]MESA "," MEWL "," EJE "," FAMA "," ASEM "," MILLA "," AMIL "," SEAX "," SEAM "," SEMI "," SWAM "," AMBO "," AMLI " , "AXILE", "AMBLE", "SWAMI", "AWEST", "AWEST", "LIMAX", "LIMES", "LIMBU", "LIMBO", "EMBOX", "SEMBLE", "EMBOLE", " WAMBLE "," FAMBLE "]MESA "," MEWL "," EJE "," FAMA "," ASEM "," MILLA "," AMIL "," SEAX "," SEAM "," SEMI "," SWAM "," AMBO "," AMLI " , "AXILE", "AMBLE", "SWAMI", "AWEST", "AWEST", "LIMAX", "LIMES", "LIMBU", "LIMBO", "EMBOX", "SEMBLE", "EMBOLE", " WAMBLE "," FAMBLE "]EMBOX "," SEMBLE "," EMBOLE "," WAMBLE "," FAMBLE "]EMBOX "," SEMBLE "," EMBOLE "," WAMBLE "," FAMBLE "]

Código: