From: "Dann Corbit" <a-cnadc@microsoft.com>
Newsgroups: sci.math
Subject: Re: Searchin for good sorting alrorithm
Date: Wed, 7 Aug 1996 09:44:55 -0700

Heap Sort (average & worst case behavior O(nlogn))
	generally slower than quick sort, on average.

Quick Sort (average case behavior O(nlogn) )
	Worst case behavior O(n^2)

Singleton's Sort - ACM Algorithm 347 
	(average case behavior O(nlogn) ) 
	generally faster than quick sort.

Counting Sort (average and worst case behavior O(n))

Considerations:
If you have a long key or a small data set,
ACM 347 will probably be fastest.
If you have a short key or a huge data set, 
counting sort will probably be fastest.

For enormous data sets, Counting sort will
always be faster for some large n.
If you have only one or two bits in your key,
you might use Radix sort instead of
counting sort.

> Timir Faizullov <artel@netvision.net.il> wrote in article
<3208F4AD.6862@netvision.net.il>...
> In my application I need to sort ASCII (or may be UNICODE)
> strings in some order, lexicographical for example.
> 
> Can anybody jelp me to find reference to high-performance
> algorithm for doing this?
> 
> ===========================================================
> Timir Faizullov
> 
> artel@netvision.net.il
> 
==============================================================================

From: dak@neuroinformatik.ruhr-uni-bochum.de (David Kastrup)
Newsgroups: sci.math
Subject: Re: Searchin for good sorting alrorithm
Date: 08 Aug 1996 12:08:24 +0200

In article <01bb8480.e4693100$fbe3369d@v-cnadc1> "Dann Corbit" <a-cnadc@microsoft.com> writes:

   Heap Sort (average & worst case behavior O(nlogn))
	   generally slower than quick sort, on average.

   Quick Sort (average case behavior O(nlogn) )
	   Worst case behavior O(n^2)

   Singleton's Sort - ACM Algorithm 347 
	   (average case behavior O(nlogn) ) 
	   generally faster than quick sort.

   Counting Sort (average and worst case behavior O(n))

You forgot heap sort: average *and* worst case behaviour O(n log n),
perfectly suited for linked list sorting (good for variable sized
data).

A pity most implementations are suboptimal, but you could try this
one:

--- Cut here ---
This sort routine of mine sorts linked lists in place (only pointers
are copied, not elements itself) and has the following advantages over
quicksort:
sort is stable, order of elements comparing as equal is preserved.
worst case comparisons are merely ceil(lg n) n - 2^(ceil(lg n)) + 1,
which for powers of 2 becomes 1-n+n lg n, which beats quicksort's
average cases for bigger n.
average case is only slightly better than worst case, however, already
sorted lists (reverse or normal) are done with about half the comparisons.
recursion depth is absolutely limited to lg n.
no additional space or data structures are needed as for quicksort.

In addition, the routine has very little organizational overhead in
addition to the comparisons and the link adjustments, and so is
*really* fast. Besides, it is so simple that it will fit most
insctruction caches.

Technique is basically a mergesort, where traversal to the second
sublist is done on the fly while sorting the first sublist. Details
are in the code.

---- Follows listsort.h
/* Include only if you use listsort.c as standalone sort binary.
   Otherwise include listsort.c itself. See description in listsort.c
*/
extern int listlink;
extern int (*listcompare)(void *p1, void *p2);
extern void **listsort(void **head, unsigned n);
---- Follows listsort.c
/* listsort.c
 * Copyright (c) 1992 David Kastrup, Goethestra"se~20/22, D-52064~Aachen,
 *       Germany
 * You are allowed to use this software in any form, even
 * in commercial software, as long as you do not restrain the right of
 * those using your software to obtain this code. That is, you must inform
 * your customer that this piece of code is in your program, and must provide
 * the unmodified source to him at request, at not more than a moderate
 * copying charge. You can save yourself this work if you include this in
 * source in your distribution. It is small enough.
 * 
 * Other than that, you are free to use this software at will.
 */

/* Usage:
As an include file:
Define listptr via typedef as a pointer to the data structure you want to sort.
Define getlink(adr) as a mechanism to fetch the link of the pointer
adr, for example
define getlink(adr) ((adr)->next)
Define listleq(p1,p2) as a comparison operator returning true if the data
to p1 compares as less than or equal to the data to p2. Something like
define listleq(p1,p2) ((p1)->data <= (p2)->data)
THEN include listsort.c
As standalone:
compile listsort.c as a standalone routine.
include listsort.h in your application.
Define a comparison routine with two void pointers as arguments, returning a
value <= 0 for less than or equal comparison. Put a pointer to it in
listcompare.

Put the offset of the link-field of your data-structure into listlink.
Both:
Call listsort with a pointer to your head-pointer of the list. Second argument
is the number of items in the list you want to habe sorted.  The rest of the
list is left intact. listsort returns a pointer to the new head-pointer of the
rest of the list.
*/


/* listptr must be defined to be a pointer to an element in the list.
   If it is not defined by the user, it is defined here as void *, thus
   making possible a generic listsort routine.

   getlink(adr) must return a pointer to the element following the
   one pointed to by adr. getlink(adr) must return an lvalue. If
   the user does not define such a mechanism, this file uses a
   global integer listlink in order to determine the byte offset
   of the linkfield in the structure pointed to by adr.

   You must either define both listptr and getadr, or none.
*/

#ifndef getlink
typedef void *listptr;
int listlink;
#define getlink(adr) (*(void**)((char*)(adr)+listlink))
#endif

/* listleq is a comparison operator checking for the first argument
   of type listptr comparing as less or equal to the second argument.
   If you do not define such a boolean macro, this file defines it as
   a call via the global pointer to such a comparison routine
   listcompare
*/

#ifndef listleq
int (*listcompare)(listptr p1, listptr p2);
#define listleq(p1,p2) ((*listcompare)(p1,p2)<=0)
#endif

/* The sort routine. Arguments are a pointer to the head pointer of
   a list to be sorted, as well as the number of elements to sort.
   Only n elements will be sorted, the rest of the list will not be
   disturbed. listsort returns a pointer to the head pointer of the
   rest of the list, located in the last element of the sorted part
   of the list. Thus if listsort calls itself recursively to sort
   the first half of a list, this call returns the head pointer of
   the second half to be sorted, list traversal thus being done on
   the fly.
*/

listptr *listsort(listptr *head, unsigned n)
{
	register listptr p1, p2;
	listptr *h2, *t2;
	unsigned m;

	switch (n) {
	case 0:
		return head;
/* The trivial case of 0 was included, so that you may say for any
   accumulated list of n elements that is not yet NULL-ended something
   like: *listsort(&head, n) = NULL;
   even if the list is yet empty.
*/
	case 1:
		return &getlink(*head);
/* Sorting one element must be provided, or recursion will fail. This
   is still sort of trivial
*/
	case 2:
		p2 = getlink(p1 = *head);
/* p1 points now to first element, p2 to second */
		if (listleq(p1, p2))
			return &getlink(p2);
/* if they were in order, return the tail link of the second */
		getlink(p1) = getlink(*head=p2);
/* let head point to the second, and the first to the tail of the
   second
*/
		return &getlink(getlink(p2) = p1);
/* and let the second point to the first, returning the taillink of the
   first as tail
*/
/* Sorting two elements is provided for efficiency reasons. You could
   provide more cases fixed-coded as well, but test them out completely:
   they should preserve order of equal elements! AND they should work
   cleanly. And if you provide too much cases, chances are that you
   LOSE efficiency because the gains do not outweigh the disadvantage
   that the code does no longer fit in the processors code cache.
*/
	}
/* Sorry that the default case appears outside of the switch. */
	n -= m = n / 2;
/* n now has length of first sublist, m of second one */
	t2 = listsort(h2 = listsort(head, n), m);
/* first n elements are sorted in *head, remaining m elements
   in *h2, rest of list hangs at *t2
*/
	if (listleq(p1 = *head, p2 = *h2)) {
		do {
			if (!--n)
				return *h2 = p2, t2;
		} while (listleq(p1=*(head=&getlink(p1)), p2));
	}
/* The above caters efficiently for the condition that some or
   all of the first sublist may be smaller than the second sublist
*/

/* The rest does a straight merge on the rest, starting with the
   inclusion of the first element of the second sublist which has
   tested as being smaller than the rest of the first sublist.
*/
	for (;;) {
		*head = p2;
		do {
			if (!--m)
				return *h2 = *t2, *t2 = p1, h2;
		} while (!listleq(p1, p2=*(head=&getlink(p2))));
		*head = p1;
		do {
			if (!--n)
				return *h2 = p2, t2;
		} while (listleq(p1=*(head=&getlink(p1)), p2));
	}
}

-- 
David Kastrup     Institut fuer Neuroinformatik, Ruhr-Universitaet Bochum
Email: dak@neuroinformatik.ruhr-uni-bochum.de  Telephon: +49-234-700-5570

==============================================================================

From: "Daniel Giaimo" <rgiaimo@nospam.ix.netcom.com>
Newsgroups: sci.math
Subject: Re: Quicker!?sort
Date: Fri, 13 Mar 1998 20:30:00 -0800

tric@yyy.or.jp wrote in message <6e552t$nnj$1@nnrp1.dejanews.com>...
>If we sort the data clumsily, we must compare two data n(n+1)/2 times to
>check their order (where n is the total of the data). Taking some smarter
>methods for sort, we can reduce the order of conparisons into n log n. Now
>this method can reduce the order of comparisons into almost n, namely this
>sort is quicker than any other sorts which have the order of comparisons n
>log n such as bubble sort or quick sort! ---really?

    There is a well-known set of sorting algorithms called radix sorts that have
worst-case performance O(n).  The problem with these sorts is that it is
extremely difficult to write an efficient, general-purpose radix sort, and even
when you have, it still isn't faster than quicksort unless you have enormous
arrays to sort.  Are you saying that you've found a way to make a
*general-purpose* sort that has worst-case performance O(n) that is faster than
quicksort?  If so, I'd like to see your algorithm.

    Also, bubble sort and quicksort both have worst-case performance O(n^2), and
bubble sort has average case performance O(n^2).  The only sorting algorithms
that I know of that have O(n*log(n)) worst-case performance are treesort and
heapsort.

--
--Daniel Giaimo
Remove nospam. from my address to e-mail me. | rgiaimo@(nospam.)ix.netcom.com
                                                       ^^^^^^^^^<-(Remove)
--------------------------------------------------------------------------------
"In a race between a rock and a pig, don't varnish your clams."
                                                            --A Wise Elbonian

==============================================================================

From: David Kastrup <dak@mailhost.neuroinformatik.ruhr-uni-bochum.de>
Newsgroups: sci.math
Subject: Re: Quicker!?sort
Date: 16 Mar 1998 10:14:43 +0100

"Daniel Giaimo" <rgiaimo@nospam.ix.netcom.com> writes:

>     Also, bubble sort and quicksort both have worst-case performance
> O(n^2), and bubble sort has average case performance O(n^2).  The
> only sorting algorithms that I know of that have O(n*log(n))
> worst-case performance are treesort and heapsort.

You forgot mergesort, the natural choice of a sort program for linked
lists.  It's O(n log n) in comparisons as well (worst case), best case
of about half of that for basically presorted data, no data needs to
be copied (only pointer changes) and you can combine sorting one
sublist with scanning for the head of the next sublist, making the
stuff pretty efficient.  And stack depth is strictly limited to O(log
n).  With a bit of housekeeping logic you can even sort without
explicit recursion.

I have a dead-simple implementation of it that I use that beats most
library qsorts.


-- 
David Kastrup                                     Phone: +49-234-700-5570
Email: dak@neuroinformatik.ruhr-uni-bochum.de       Fax: +49-234-709-4209
Institut für Neuroinformatik, Universitätsstr. 150, 44780 Bochum, Germany