data - Implementación de conjuntos disjuntos(búsqueda de unión) en C++

union find c++ (10)

Estoy tratando de implementar Conjuntos Disjoint para uso en el algoritmo de Kruskal, pero tengo problemas para entender exactamente cómo se debe hacer y, en particular, cómo manejar el bosque de árboles. Después de leer la descripción de Wikipedia de Conjuntos disjuntos y después de leer la descripción en Introducción a los algoritmos (Cormen et al), se me ocurrió lo siguiente:

class DisjointSet { public: class Node { public: int data; int rank; Node* parent; Node() : data(0), rank(0), parent(this) { } // the constructor does MakeSet }; Node* find(Node*); Node* merge(Node*, Node*); // Union }; DisjointSet::Node* DisjointSet::find(DisjointSet::Node* n) { if (n != n->parent) { n->parent = find(n->parent); } return n->parent; } DisjointSet::Node* DisjointSet::merge(DisjointSet::Node* x, DisjointSet::Node* y) { x = find(x); y = find(y); if (x->rank > y->rank) { y->parent = x; } else { x->parent = y; if (x->rank == y->rank) { ++(y->rank); } } }

Estoy bastante seguro de que esto está incompleto y de que me estoy perdiendo algo.

Introducción a los algoritmos menciona que debería haber un bosque de árboles, pero no da ninguna explicación para una implementación práctica de este bosque. Vi CS 61B Lecture 31: Disjoint Sets ( http://www.youtube.com/watch?v=wSPAjGfDl7Q ) y aquí el profesor utiliza solo una matriz para almacenar tanto el bosque como todos sus árboles y valores. No hay un tipo de clase ''Nodo'' explícito como lo he mencionado. También he encontrado muchas otras fuentes (no puedo publicar más de un enlace), que también utilizan esta técnica. Me encantaría hacer esto, excepto que se basa en los índices de la matriz para la búsqueda y como quiero almacenar valores de tipo que no sean int, necesito usar otra cosa (std :: map me viene a la mente).

Otro problema del que no estoy seguro es sobre la asignación de memoria porque estoy usando C ++. Estoy almacenando árboles de punteros y mi operación MakeSet será: nuevo DisjointSet :: Node; . Ahora, estos nodos solo tienen punteros a sus padres, así que no estoy seguro de cómo encontrar el fondo de un árbol. ¿Cómo podré atravesar mis árboles para desasignarlos a todos?

Entiendo el concepto básico de esta estructura de datos, pero estoy un poco confundido acerca de la implementación. Cualquier consejo o sugerencia sería bienvenido, gracias.

Boost tiene una implementación de conjunto disjunto:

http://www.boost.org/doc/libs/1_54_0/libs/disjoint_sets/disjoint_sets.html

Echa un vistazo a este código

class Node { int id,rank,data; Node *parent; public : Node(int id,int data) { this->id = id; this->data = data; this->rank =0; this->parent = this; } friend class DisjointSet; }; class DisjointSet { unordered_map<int,Node*> forest; Node *find_set_helper(Node *aNode) { if( aNode->parent != aNode) aNode->parent = find_set_helper(aNode->parent); return aNode->parent; } void link(Node *xNode,Node *yNode) { if( xNode->rank > yNode->rank) yNode->parent = xNode; else if(xNode-> rank < yNode->rank) xNode->parent = yNode; else { xNode->parent = yNode; yNode->rank++; } } public: DisjointSet() { } void make_set(int id,int data) { Node *aNode = new Node(id,data); this->forest.insert(make_pair(id,aNode)); } void Union(int xId, int yId) { Node *xNode = find_set(xId); Node *yNode = find_set(yId); if(xNode && yNode) link(xNode,yNode); } Node* find_set(int id) { unordered_map<int,Node*> :: iterator itr = this->forest.find(id); if(itr == this->forest.end()) return NULL; return this->find_set_helper(itr->second); } void print() { unordered_map<int,Node*>::iterator itr; for(itr = forest.begin(); itr != forest.end(); itr++) { cout<<"/nid : "<<itr->second->id<<" parent :"<<itr->second->parent->id; } } ~DisjointSet(){ unordered_map<int,Node*>::iterator itr; for(itr = forest.begin(); itr != forest.end(); itr++) { delete (itr->second); } } };

El siguiente código parece fácil de entender para implementar conjuntos de separaciones de búsqueda de unión por compresión de ruta

int find(int i) { if(parent[i]==i) return i; else return parent[i]=find(parent[i]); } void union(int a,int b) { x=find(a);y=find(b); if(x!=y) { if(rank[x]>rank[y]) parent[y]=x; else { parent[x]=y; if(rank[x]==rank[y]) rank[y]+=1; } } }

Este artículo de blog muestra una implementación de C ++ utilizando la compresión de ruta: http://toughprogramming.blogspot.com/2013/04/implementing-disjoint-sets-in-c.html

No es una implementación perfecta de ninguna manera (¡la escribí después de todo!), ¿Pero esto ayuda?

/*** * millipede: DisjointSetForest.h * Copyright Stuart Golodetz, 2009. All rights reserved. ***/ #ifndef H_MILLIPEDE_DISJOINTSETFOREST #define H_MILLIPEDE_DISJOINTSETFOREST #include <map> #include <common/exceptions/Exception.h> #include <common/io/util/OSSWrapper.h> #include <common/util/NullType.h> namespace mp { /** @brief A disjoint set forest is a fairly standard data structure used to represent the partition of a set of elements into disjoint sets in such a way that common operations such as merging two sets together are computationally efficient. This implementation uses the well-known union-by-rank and path compression optimizations, which together yield an amortised complexity for key operations of O(a(n)), where a is the (extremely slow-growing) inverse of the Ackermann function. The implementation also allows clients to attach arbitrary data to each element, which can be useful for some algorithms. @tparam T The type of data to attach to each element (arbitrary) */ template <typename T = NullType> class DisjointSetForest { //#################### NESTED CLASSES #################### private: struct Element { T m_value; int m_parent; int m_rank; Element(const T& value, int parent) : m_value(value), m_parent(parent), m_rank(0) {} }; //#################### PRIVATE VARIABLES #################### private: mutable std::map<int,Element> m_elements; int m_setCount; //#################### CONSTRUCTORS #################### public: /** @brief Constructs an empty disjoint set forest. */ DisjointSetForest() : m_setCount(0) {} /** @brief Constructs a disjoint set forest from an initial set of elements and their associated values. @param[in] initialElements A map from the initial elements to their associated values */ explicit DisjointSetForest(const std::map<int,T>& initialElements) : m_setCount(0) { add_elements(initialElements); } //#################### PUBLIC METHODS #################### public: /** @brief Adds a single element x (and its associated value) to the disjoint set forest. @param[in] x The index of the element @param[in] value The value to initially associate with the element @pre - x must not already be in the disjoint set forest */ void add_element(int x, const T& value = T()) { m_elements.insert(std::make_pair(x, Element(value, x))); ++m_setCount; } /** @brief Adds multiple elements (and their associated values) to the disjoint set forest. @param[in] elements A map from the elements to add to their associated values @pre - None of the elements to be added must already be in the disjoint set forest */ void add_elements(const std::map<int,T>& elements) { for(typename std::map<int,T>::const_iterator it=elements.begin(), iend=elements.end(); it!=iend; ++it) { m_elements.insert(std::make_pair(it->first, Element(it->second, it->first))); } m_setCount += elements.size(); } /** @brief Returns the number of elements in the disjoint set forest. @return As described */ int element_count() const { return static_cast<int>(m_elements.size()); } /** @brief Finds the index of the root element of the tree containing x in the disjoint set forest. @param[in] x The element whose set to determine @pre - x must be an element in the disjoint set forest @throw Exception - If the precondition is violated @return As described */ int find_set(int x) const { Element& element = get_element(x); int& parent = element.m_parent; if(parent != x) { parent = find_set(parent); } return parent; } /** @brief Returns the current number of disjoint sets in the forest (i.e. the current number of trees). @return As described */ int set_count() const { return m_setCount; } /** @brief Merges the disjoint sets containing elements x and y. If both elements are already in the same disjoint set, this is a no-op. @param[in] x The first element @param[in] y The second element @pre - Both x and y must be elements in the disjoint set forest @throw Exception - If the precondition is violated */ void union_sets(int x, int y) { int setX = find_set(x); int setY = find_set(y); if(setX != setY) link(setX, setY); } /** @brief Returns the value associated with element x. @param[in] x The element whose value to return @pre - x must be an element in the disjoint set forest @throw Exception - If the precondition is violated @return As described */ T& value_of(int x) { return get_element(x).m_value; } /** @brief Returns the value associated with element x. @param[in] x The element whose value to return @pre - x must be an element in the disjoint set forest @throw Exception - If the precondition is violated @return As described */ const T& value_of(int x) const { return get_element(x).m_value; } //#################### PRIVATE METHODS #################### private: Element& get_element(int x) const { typename std::map<int,Element>::iterator it = m_elements.find(x); if(it != m_elements.end()) return it->second; else throw Exception(OSSWrapper() << "No such element: " << x); } void link(int x, int y) { Element& elementX = get_element(x); Element& elementY = get_element(y); int& rankX = elementX.m_rank; int& rankY = elementY.m_rank; if(rankX > rankY) { elementY.m_parent = x; } else { elementX.m_parent = y; if(rankX == rankY) ++rankY; } --m_setCount; } }; } #endif

No puedo hablar sobre el algoritmo, pero para la administración de la memoria, normalmente usará algo llamado puntero inteligente, que liberará lo que apunta. Puede obtener propiedad compartida y punteros inteligentes de propiedad única, y no propiedad también. El uso correcto de estos no garantizará problemas de memoria.

Para implementar Disjoint Sets desde cero, recomiendo leer el libro de Estructuras de datos y análisis de algoritmos en C ++ de Mark A. Weiss .

En el Capítulo 8, comienza desde la búsqueda / unión básica, luego agrega gradualmente unión por altura / profundidad / rango y encuentra la compresión. Al final, proporciona análisis Big-O.

Confíe en mí, tiene todo lo que desea saber acerca de los Conjuntos Disjoint y su implementación en C ++.

Si está intentando preguntar qué estilo es mejor para el conjunto de separación (vector o mapa (árbol rb)), es posible que tenga algo que agregar

make_set (int key , node info ) : esta es generalmente una función miembro para (1) agregar un nodo y (2) hacer que el nodo apunte a sí mismo (parent = key), esto crea inicialmente un conjunto disjoint. complejidad del tiempo de operación para el vector O (n), para el mapa O (n * logn).
find_set( int key ) : esto generalmente tiene dos funciones: (1) encontrar el nodo a través de la clave dada (2) la compresión de ruta. Realmente no podría calcularlo para la compresión de la ruta, pero simplemente buscando el nodo, la complejidad del tiempo para (1) el vector O (1) y para (2) el mapa O (registro (n)).

Para concluir, me gustaría decir que aunque al mirar aquí, la implementación del vector se ve mejor, las complejidades de tiempo de ambos son O (M * α (n)) ≈ O (M * 5) o eso he leído.

PD. verifico lo que he escrito, aunque estoy bastante seguro de que es correcto.

Su implementación está bien. Todo lo que necesita hacer ahora es mantener una serie de nodos de conjuntos separados para que pueda llamar a su unión / encontrar métodos sobre ellos.

Para el algoritmo de Kruskal, necesitará una matriz que contenga un Nodo de conjunto disjunto por vértice de gráfico. Luego, cuando busque el siguiente borde para agregar a su subgrafo, usaría el método de búsqueda para verificar si esos nodos ya están en su subgrafo. Si lo son, entonces puedes pasar al siguiente borde. De lo contrario, es hora de agregar ese borde a su subgrafo y realizar una operación de unión entre los dos vértices conectados por ese borde.

su implementación se ve bien (excepto en la función merge, debería declarar la devolución nula o poner una devolución allí, prefiero devolverla nula). La cosa es que necesitas rastrear los Nodes* . Puede hacerlo teniendo un vector<DisjointSet::Node*> en su clase DisjointSet o teniendo este vector en otro lugar y declarando los métodos de DisjointSet como static .

Este es un ejemplo de ejecución (tenga en cuenta que cambié la combinación para devolver el vacío y no modifiqué los métodos de DisjointSet para que sean static :

int main() { vector<DisjointSet::Node*> v(10); DisjointSet ds; for (int i = 0; i < 10; ++i) { v[i] = new DisjointSet::Node(); v[i]->data = i; } int x, y; while (cin >> x >> y) { ds.merge(v[x], v[y]); } for (int i = 0; i < 10; ++i) { cout << v[i]->data << '' '' << v[i]->parent->data << endl; } return 0; }

Con esta entrada:

3 1 1 2 2 4 0 7 8 9

Imprime lo esperado:

0 7 1 1 2 1 3 1 4 1 5 5 6 6 7 7 8 9 9 9

Tu bosque es la composición de los árboles:

7 1 5 6 9 / / | / | 0 2 3 4 8

Por lo tanto, su algoritmo es bueno, tenga la mejor complejidad para Union-find, que yo sepa, y usted realiza un seguimiento de sus Node en su vector . Así que simplemente podría desasignar:

for (int i = 0; i < int(v.size()); ++i) { delete v[i]; }