algorithm - listas - Fusionar Ordenar una lista enlazada

ordenar lista enlazada c++ (15)

Hace poco me refería a algunos aspectos fundamentales y encontré que fusionar una lista vinculada era un desafío bastante bueno. Si tiene una buena implementación, muéstrela aquí.

Aquí está mi implementación de Knuth''s "List merge sort" (Algoritmo 5.2.4L del Vol.3 de TAOCP, 2nd ed.). Agregaré algunos comentarios al final, pero aquí hay un resumen:

En la entrada aleatoria, se ejecuta un poco más rápido que el código de Simon Tatham (ver la respuesta no recursiva de Dave Gamble, con un enlace) pero un poco más lento que el código recursivo de Dave Gamble. Es más difícil de entender que cualquiera de los dos. Al menos en mi implementación, requiere que cada elemento tenga DOS punteros a los elementos. (Una alternativa sería un puntero y un indicador booleano). Por lo tanto, probablemente no sea un enfoque útil. Sin embargo, un punto distintivo es que funciona muy rápido si la entrada tiene tramos largos que ya están ordenados.

element *knuthsort(element *list) { /* This is my attempt at implementing Knuth''s Algorithm 5.2.4L "List merge sort" from Vol.3 of TAOCP, 2nd ed. */ element *p, *pnext, *q, *qnext, head1, head2, *s, *t; if(!list) return NULL; L1: /* This is the clever L1 from exercise 12, p.167, solution p.647. */ head1.next=list; t=&head2; for(p=list, pnext=p->next; pnext; p=pnext, pnext=p->next) { if( cmp(p,pnext) > 0 ) { t->next=NULL; t->spare=pnext; t=p; } } t->next=NULL; t->spare=NULL; p->spare=NULL; head2.next=head2.spare; L2: /* begin a new pass: */ t=&head2; q=t->next; if(!q) return head1.next; s=&head1; p=s->next; L3: /* compare: */ if( cmp(p,q) > 0 ) goto L6; L4: /* add p onto the current end, s: */ if(s->next) s->next=p; else s->spare=p; s=p; if(p->next) { p=p->next; goto L3; } else p=p->spare; L5: /* complete the sublist by adding q and all its successors: */ s->next=q; s=t; for(qnext=q->next; qnext; q=qnext, qnext=q->next); t=q; q=q->spare; goto L8; L6: /* add q onto the current end, s: */ if(s->next) s->next=q; else s->spare=q; s=q; if(q->next) { q=q->next; goto L3; } else q=q->spare; L7: /* complete the sublist by adding p and all its successors: */ s->next=p; s=t; for(pnext=p->next; pnext; p=pnext, pnext=p->next); t=p; p=p->spare; L8: /* is this end of the pass? */ if(q) goto L3; if(s->next) s->next=p; else s->spare=p; t->next=NULL; t->spare=NULL; goto L2; }

Aquí hay una versión recursiva alternativa. Esto no necesita avanzar a lo largo de la lista para dividirlo: suministramos un puntero a un elemento principal (que no forma parte del género) y una longitud, y la función recursiva devuelve un puntero al final de la lista ordenada.

element* mergesort(element *head,long lengtho) { long count1=(lengtho/2), count2=(lengtho-count1); element *next1,*next2,*tail1,*tail2,*tail; if (lengtho<=1) return head->next; /* Trivial case. */ tail1 = mergesort(head,count1); tail2 = mergesort(tail1,count2); tail=head; next1 = head->next; next2 = tail1->next; tail1->next = tail2->next; /* in case this ends up as the tail */ while (1) { if(cmp(next1,next2)<=0) { tail->next=next1; tail=next1; if(--count1==0) { tail->next=next2; return tail2; } next1=next1->next; } else { tail->next=next2; tail=next2; if(--count2==0) { tail->next=next1; return tail1; } next2=next2->next; } } }

Basada en el EXCELENTE código de: http://www.chiark.greenend.org.uk/~sgtatham/algorithms/listsort.html

Recortado ligeramente, y arreglado:

typedef struct _aList { struct _aList* next; struct _aList* prev; // Optional. // some data } aList; aList *merge_sort_list(aList *list,int (*compare)(aList *one,aList *two)) { int listSize=1,numMerges,leftSize,rightSize; aList *tail,*left,*right,*next; if (!list || !list->next) return list; // Trivial case do { // For each power of two<=list length numMerges=0,left=list;tail=list=0; // Start at the start while (left) { // Do this list_len/listSize times: numMerges++,right=left,leftSize=0,rightSize=listSize; // Cut list into two halves (but don''t overrun) while (right && leftSize<listSize) leftSize++,right=right->next; // Run through the lists appending onto what we have so far. while (leftSize>0 || (rightSize>0 && right)) { // Left empty, take right OR Right empty, take left, OR compare. if (!leftSize) {next=right;right=right->next;rightSize--;} else if (!rightSize || !right) {next=left;left=left->next;leftSize--;} else if (compare(left,right)<0) {next=left;left=left->next;leftSize--;} else {next=right;right=right->next;rightSize--;} // Update pointers to keep track of where we are: if (tail) tail->next=next; else list=next; // Sort prev pointer next->prev=tail; // Optional. tail=next; } // Right is now AFTER the list we just sorted, so start the next sort there. left=right; } // Terminate the list, double the list-sort size. tail->next=0,listSize<<=1; } while (numMerges>1); // If we only did one merge, then we just sorted the whole list. return list; }

NB: Esto es O (NLog (N)) garantizado, y utiliza O (1) recursos (sin recursión, sin pila, nada).

Decidí probar los ejemplos aquí, y también un enfoque más, escrito originalmente por Jonathan Cunningham en Pop-11. Codifiqué todos los enfoques en C # e hice una comparación con un rango de diferentes tamaños de lista. Comparé el enfoque Mono eglib de Raja R Harinath, el código C # de Shital Shah, el enfoque Java de Jayadev, las versiones recursivas y no recursivas de David Gamble, el primer código C de Ed Wynn (esto se estrelló con mi conjunto de datos de muestra, No depuré), y la versión de Cunningham. El código completo aquí: https://gist.github.com/314e572808f29adb0e41.git .

Mono eglib se basa en una idea similar a la de Cunningham y es de velocidad comparable, a menos que la lista ya esté ordenada, en cuyo caso el enfoque de Cunningham es mucho más rápido (si se clasifica parcialmente, el eglib es ligeramente más rápido). El código eglib utiliza una tabla fija para mantener la recursión de ordenación combinada, mientras que el enfoque de Cunningham funciona mediante el uso de niveles de recursión crecientes, por lo que comienza sin recurrencia, luego recursividad de 1 profundidad, luego recursión de 2 profundidades y así sucesivamente, de acuerdo con cuántos pasos son necesarios para hacer el tipo. Encuentro que el código de Cunningham es un poco más fácil de seguir y no hay que adivinar lo grande que es hacer la tabla de recursión, así que obtiene mi voto. Los otros enfoques que probé en esta página fueron dos o más veces más lentos.

Aquí está el puerto C # del tipo Pop-11:

/// <summary> /// Sort a linked list in place. Returns the sorted list. /// Originally by Jonathan Cunningham in Pop-11, May 1981. /// Ported to C# by Jon Meyer. /// </summary> public class ListSorter<T> where T : IComparable<T> { SingleListNode<T> workNode = new SingleListNode<T>(default(T)); SingleListNode<T> list; /// <summary> /// Sorts a linked list. Returns the sorted list. /// </summary> public SingleListNode<T> Sort(SingleListNode<T> head) { if (head == null) throw new NullReferenceException("head"); list = head; var run = GetRun(); // get first run // As we progress, we increase the recursion depth. var n = 1; while (list != null) { var run2 = GetSequence(n); run = Merge(run, run2); n++; } return run; } // Get the longest run of ordered elements from list. // The run is returned, and list is updated to point to the // first out-of-order element. SingleListNode<T> GetRun() { var run = list; // the return result is the original list var prevNode = list; var prevItem = list.Value; list = list.Next; // advance to the next item while (list != null) { var comp = prevItem.CompareTo(list.Value); if (comp > 0) { // reached end of sequence prevNode.Next = null; break; } prevItem = list.Value; prevNode = list; list = list.Next; } return run; } // Generates a sequence of Merge and GetRun() operations. // If n is 1, returns GetRun() // If n is 2, returns Merge(GetRun(), GetRun()) // If n is 3, returns Merge(Merge(GetRun(), GetRun()), // Merge(GetRun(), GetRun())) // and so on. SingleListNode<T> GetSequence(int n) { if (n < 2) { return GetRun(); } else { n--; var run1 = GetSequence(n); if (list == null) return run1; var run2 = GetSequence(n); return Merge(run1, run2); } } // Given two ordered lists this returns a list that is the // result of merging the two lists in-place (modifying the pairs // in list1 and list2). SingleListNode<T> Merge(SingleListNode<T> list1, SingleListNode<T> list2) { // we reuse a single work node to hold the result. // Simplifies the number of test cases in the code below. var prevNode = workNode; while (true) { if (list1.Value.CompareTo(list2.Value) <= 0) { // list1 goes first prevNode.Next = list1; prevNode = list1; if ((list1 = list1.Next) == null) { // reached end of list1 - join list2 to prevNode prevNode.Next = list2; break; } } else { // same but for list2 prevNode.Next = list2; prevNode = list2; if ((list2 = list2.Next) == null) { prevNode.Next = list1; break; } } } // the result is in the back of the workNode return workNode.Next; } }

Esta es la pieza entera de código que muestra cómo podemos crear una lista de enlaces en java y ordenarla usando la clasificación Merge. Estoy creando un nodo en la clase MergeNode y hay otra clase MergesortLinklist donde hay división y fusión de lógica.

class MergeNode { Object value; MergeNode next; MergeNode(Object val) { value = val; next = null; } MergeNode() { value = null; next = null; } public Object getValue() { return value; } public void setValue(Object value) { this.value = value; } public MergeNode getNext() { return next; } public void setNext(MergeNode next) { this.next = next; } @Override public String toString() { return "MergeNode [value=" + value + ", next=" + next + "]"; } } public class MergesortLinkList { MergeNode head; static int totalnode; public MergeNode getHead() { return head; } public void setHead(MergeNode head) { this.head = head; } MergeNode add(int i) { // TODO Auto-generated method stub if (head == null) { head = new MergeNode(i); // System.out.println("head value is "+head); return head; } MergeNode temp = head; while (temp.next != null) { temp = temp.next; } temp.next = new MergeNode(i); return head; } MergeNode mergesort(MergeNode nl1) { // TODO Auto-generated method stub if (nl1.next == null) { return nl1; } int counter = 0; MergeNode temp = nl1; while (temp != null) { counter++; temp = temp.next; } System.out.println("total nodes " + counter); int middle = (counter - 1) / 2; temp = nl1; MergeNode left = nl1, right = nl1; int leftindex = 0, rightindex = 0; if (middle == leftindex) { right = left.next; } while (leftindex < middle) { leftindex++; left = left.next; right = left.next; } left.next = null; left = nl1; System.out.println(left.toString()); System.out.println(right.toString()); MergeNode p1 = mergesort(left); MergeNode p2 = mergesort(right); MergeNode node = merge(p1, p2); return node; } MergeNode merge(MergeNode p1, MergeNode p2) { // TODO Auto-generated method stub MergeNode L = p1; MergeNode R = p2; int Lcount = 0, Rcount = 0; MergeNode tempnode = null; while (L != null && R != null) { int val1 = (int) L.value; int val2 = (int) R.value; if (val1 > val2) { if (tempnode == null) { tempnode = new MergeNode(val2); R = R.next; } else { MergeNode store = tempnode; while (store.next != null) { store = store.next; } store.next = new MergeNode(val2); R = R.next; } } else { if (tempnode == null) { tempnode = new MergeNode(val1); L = L.next; } else { MergeNode store = tempnode; while (store.next != null) { store = store.next; } store.next = new MergeNode(val1); L = L.next; } } } MergeNode handle = tempnode; while (L != null) { while (handle.next != null) { handle = handle.next; } handle.next = L; L = null; } // Copy remaining elements of L[] if any while (R != null) { while (handle.next != null) { handle = handle.next; } handle.next = R; R = null; } System.out.println("----------------sorted value-----------"); System.out.println(tempnode.toString()); return tempnode; } public static void main(String[] args) { MergesortLinkList objsort = new MergesortLinkList(); MergeNode n1 = objsort.add(9); MergeNode n2 = objsort.add(7); MergeNode n3 = objsort.add(6); MergeNode n4 = objsort.add(87); MergeNode n5 = objsort.add(16); MergeNode n6 = objsort.add(81); MergeNode n7 = objsort.add(21); MergeNode n8 = objsort.add(16); MergeNode n9 = objsort.add(99); MergeNode n10 = objsort.add(31); MergeNode val = objsort.mergesort(n1); System.out.println("===============sorted values====================="); while (val != null) { System.out.println(" value is " + val.value); val = val.next; } } }

Hay un mergesort de lista enlazada no recursivo en mono eglib .

La idea básica es que el lazo de control para las diversas fusiones es paralelo al incremento en bits de un entero binario. Hay O (n) fusiones para "insertar" n nodos en el árbol de fusión, y el rango de esas fusiones corresponde al dígito binario que se incrementa. Usando esta analogía, solo los nodos O (log n) del árbol de fusión deben materializarse en una matriz de espera temporal.

La más simple es de Gonnet + Baeza Yates Handbook of Algorithms . Usted lo llama con la cantidad de elementos ordenados que desee, que recursivamente se bisecta hasta que llega a una solicitud de una lista de tamaño uno que luego se quita al principio de la lista original. Todos estos se fusionan en una lista clasificada de tamaño completo.

[Tenga en cuenta que el genial basado en la pila en la primera publicación se llama Online Mergesort y recibe la menor mención en un ejercicio en Knuth Vol 3]

Me obsesioné por optimizar el desorden para este algoritmo y debajo es a lo que finalmente llegué. Muchos otros códigos en Internet y son horriblemente malos. Hay personas tratando de obtener el punto medio de la lista, haciendo recursión, teniendo múltiples bucles para los nodos sobrantes, manteniendo conteos de toneladas de cosas, TODO lo cual es innecesario. MergeSort se adapta naturalmente a la lista enlazada y el algoritmo puede ser hermoso y compacto, pero no es trivial llegar a ese estado.

El código siguiente mantiene un número mínimo de variables y tiene un número mínimo de pasos lógicos necesarios para el algoritmo (es decir, sin hacer que el código no se pueda mantener / ilegible) hasta donde yo sé. Sin embargo, no he intentado minimizar el LOC y guardé tanto espacio en blanco como sea necesario para mantener las cosas legibles. He probado este código a través de pruebas unitarias bastante rigurosas.

Tenga en cuenta que esta respuesta combina pocas técnicas de otra respuesta https://.com/a/3032462/207661 . Mientras que el código está en C #, debería ser trivial convertirlo a C ++, Java, etc.

SingleListNode<T> SortLinkedList<T>(SingleListNode<T> head) where T : IComparable<T> { int blockSize = 1, blockCount; do { //Maintain two lists pointing to two blocks, left and right SingleListNode<T> left = head, right = head, tail = null; head = null; //Start a new list blockCount = 0; //Walk through entire list in blocks of size blockCount while (left != null) { blockCount++; //Advance right to start of next block, measure size of left list while doing so int leftSize = 0, rightSize = blockSize; for (;leftSize < blockSize && right != null; ++leftSize) right = right.Next; //Merge two list until their individual ends bool leftEmpty = leftSize == 0, rightEmpty = rightSize == 0 || right == null; while (!leftEmpty || !rightEmpty) { SingleListNode<T> smaller; //Using <= instead of < gives us sort stability if (rightEmpty || (!leftEmpty && left.Value.CompareTo(right.Value) <= 0)) { smaller = left; left = left.Next; --leftSize; leftEmpty = leftSize == 0; } else { smaller = right; right = right.Next; --rightSize; rightEmpty = rightSize == 0 || right == null; } //Update new list if (tail != null) tail.Next = smaller; else head = smaller; tail = smaller; } //right now points to next block for left left = right; } //terminate new list, take care of case when input list is null if (tail != null) tail.Next = null; //Lg n iterations blockSize <<= 1; } while (blockCount > 1); return head; }

Puntos de interés

No hay un manejo especial para casos como la lista nula de la lista de 1, etc. requerida. Estos casos "simplemente funciona".
Muchos textos de algoritmos "estándar" tienen dos bucles para repasar los elementos sobrantes para manejar el caso cuando una lista es más corta que otra. El código anterior elimina la necesidad de hacerlo.
Nos aseguramos de que el género sea estable
El ciclo while interno que es un punto caliente evalúa 3 expresiones por iteración en promedio, lo que creo que es lo mínimo que se puede hacer.

Actualización: @ ideasman42 ha traducido el código anterior a C / C ++ junto con sugerencias para corregir comentarios y mejorar un poco más. El código anterior está actualizado con estos.

Me pregunto por qué debería ser un gran desafío, como se afirma aquí, aquí hay una implementación directa en Java sin ningún "truco inteligente".

//The main function public Node merge_sort(Node head) { if(head == null || head.next == null) { return head; } Node middle = getMiddle(head); //get the middle of the list Node sHalf = middle.next; middle.next = null; //split the list into two halfs return merge(merge_sort(head),merge_sort(sHalf)); //recurse on that } //Merge subroutine to merge two sorted lists public Node merge(Node a, Node b) { Node dummyHead, curr; dummyHead = new Node(); curr = dummyHead; while(a !=null && b!= null) { if(a.info <= b.info) { curr.next = a; a = a.next; } else { curr.next = b; b = b.next; } curr = curr.next; } curr.next = (a == null) ? b : a; return dummyHead.next; } //Finding the middle element of the list for splitting public Node getMiddle(Node head) { if(head == null) { return head; } Node slow, fast; slow = fast = head; while(fast.next != null && fast.next.next != null) { slow = slow.next; fast = fast.next.next; } return slow; }

Otro ejemplo de ordenación de fusión no recursiva para listas vinculadas, donde las funciones no son parte de una clase. Este código de ejemplo y HP / Microsoft std :: list :: sort usan el mismo algoritmo básico. Una ordenación ascendente, no recursiva, de fusión que utiliza una pequeña serie (26 a 32) de punteros a los primeros nodos de una lista, donde la matriz [i] es 0 o apunta a una lista de tamaño 2 a la potencia i . En mi sistema, Intel 2600K 3.4ghz, puede ordenar 4 millones de nodos con enteros sin signo de 32 bits como datos en aproximadamente 1 segundo.

NODE * MergeLists(NODE *, NODE *); /* prototype */ /* sort a list using array of pointers to list */ /* aList[i] == NULL or ptr to list with 2^i nodes */ #define NUMLISTS 32 /* number of lists */ NODE * SortList(NODE *pList) { NODE * aList[NUMLISTS]; /* array of lists */ NODE * pNode; NODE * pNext; int i; if(pList == NULL) /* check for empty list */ return NULL; for(i = 0; i < NUMLISTS; i++) /* init array */ aList[i] = NULL; pNode = pList; /* merge nodes into array */ while(pNode != NULL){ pNext = pNode->next; pNode->next = NULL; for(i = 0; (i < NUMLISTS) && (aList[i] != NULL); i++){ pNode = MergeLists(aList[i], pNode); aList[i] = NULL; } if(i == NUMLISTS) /* don''t go beyond end of array */ i--; aList[i] = pNode; pNode = pNext; } pNode = NULL; /* merge array into one list */ for(i = 0; i < NUMLISTS; i++) pNode = MergeLists(aList[i], pNode); return pNode; } /* merge two already sorted lists */ /* compare uses pSrc2 < pSrc1 to follow the STL rule */ /* of only using < and not <= */ NODE * MergeLists(NODE *pSrc1, NODE *pSrc2) { NODE *pDst = NULL; /* destination head ptr */ NODE **ppDst = &pDst; /* ptr to head or prev->next */ if(pSrc1 == NULL) return pSrc2; if(pSrc2 == NULL) return pSrc1; while(1){ if(pSrc2->data < pSrc1->data){ /* if src2 < src1 */ *ppDst = pSrc2; pSrc2 = *(ppDst = &(pSrc2->next)); if(pSrc2 == NULL){ *ppDst = pSrc1; break; } } else { /* src1 <= src2 */ *ppDst = pSrc1; pSrc1 = *(ppDst = &(pSrc1->next)); if(pSrc1 == NULL){ *ppDst = pSrc2; break; } } } return pDst; }

Puede usar esa implementación de ordenar por fusión y escribir sus propias funciones para interactuar con la lista vinculada como si fuera una matriz.

Una forma interesante es mantener una pila, y solo fusionar si la lista en la pila tiene el mismo número de elementos, y de lo contrario empujar la lista, hasta que se quede sin elementos en la lista entrante, y luego combinar la pila.

Una implementación más simple / clara podría ser la implementación recursiva, a partir de la cual el tiempo de ejecución de NLog (N) es más claro.

typedef struct _aList { struct _aList* next; struct _aList* prev; // Optional. // some data } aList; aList* merge_sort_list_recursive(aList *list,int (*compare)(aList *one,aList *two)) { // Trivial case. if (!list || !list->next) return list; aList *right = list, *temp = list, *last = list, *result = 0, *next = 0, *tail = 0; // Find halfway through the list (by running two pointers, one at twice the speed of the other). while (temp && temp->next) { last = right; right = right->next; temp = temp->next->next; } // Break the list in two. (prev pointers are broken here, but we fix later) last->next = 0; // Recurse on the two smaller lists: list = merge_sort_list_recursive(list, compare); right = merge_sort_list_recursive(right, compare); // Merge: while (list || right) { // Take from empty lists, or compare: if (!right) { next = list; list = list->next; } else if (!list) { next = right; right = right->next; } else if (compare(list, right) < 0) { next = list; list = list->next; } else { next = right; right = right->next; } if (!result) { result=next; } else { tail->next=next; } next->prev = tail; // Optional. tail = next; } return result; }

NB: Esto tiene un requisito de almacenamiento de Log (N) para la recursión. El rendimiento debería ser aproximadamente comparable con la otra estrategia que publiqué. Aquí hay una optimización potencial ejecutando el bucle de fusión while (list && right) y agregando simplemente la lista restante (ya que realmente no nos importa el final de las listas, saber que están fusionadas es suficiente).

Uno de los inconvenientes de la clase de combinación es que utiliza un espacio O (n) para almacenar los datos. es decir, cuando combina las dos sublistas para la lista vinculada, esto se puede evitar cambiando continuamente el siguiente puntero en el nodo de la lista. La última implementación parece clara pero no la considera.

public int[] msort(int[] a) { if (a.Length > 1) { int min = a.Length / 2; int max = min; int[] b = new int[min]; int[] c = new int[max]; // dividing main array into two half arrays for (int i = 0; i < min; i++) { b[i] = a[i]; } for (int i = min; i < min + max; i++) { c[i - min] = a[i]; } b = msort(b); c = msort(c); int x = 0; int y = 0; int z = 0; while (b.Length != y && c.Length != z) { if (b[y] < c[z]) { a[x] = b[y]; //r-- x++; y++; } else { a[x] = c[z]; x++; z++; } } while (b.Length != y) { a[x] = b[y]; x++; y++; } while (c.Length != z) { a[x] = c[z]; x++; z++; } } return a; }