« Semper Fidelis - USPS Distinguished Marines Stamps | Main| Merry Christmas »

Sorting Names

QuickImage Category Technical

Ben demonstrates some interesting sorting routines and asks for suggestions in this post.


I was originally going to respond there, but I realized that I had already done something similar, and wanted to expound a bit on the topic, so here we go.


The problem: Sort a list of Names, by last name, but display the list as by full name.
Some assumptions have to be made here.

  1. A "last name" is defined as the block of text following the final space character contained within any name entry.
  2. In the event that there is no space characters within an entry (Ex: "Enya"), then the entire name will be sorted as if it were the last name.

I tested Ben's solutions, and none of them seemed to work properly for condition #2. Now, to be fair I added condition #2 myself, and Ben hasn't had the opportunity to code for this solution. I'm pretty certain that had he considered this condition, he would have no problem handling it.


So, how do we solve for this? It really isn't that difficult a problem at all, once we break it down. Given a list of names, we need to do the following:

  1. Restructure each element of the list such that the last name becomes the first part of the element
  2. Sort the list alphanumerically
  3. Restructure each element of the list such that it follows the original format

The list of names we are going to work with is:

  • Joe Litton
  • Cindy Lou Who
  • Mary Jane van der Welten
  • Tom Duff
  • Ben Langhinrichs
  • Harry Belafonte
  • Enya
  • Antonio

Once sorted, the list should appear as:

  • Antonio
  • Harry Belafonte
  • Tom Duff
  • Enya
  • Ben Langhinrichs
  • Joe Litton
  • Mary Jane van der Welten
  • Cindy Lou Who

Once we break it down into individual "sub-problems", it doesn't seem to be that big of a deal. I do a lot of my development this way. Take a big problem, break it down into bite-sized chunks, solve the chunks individually, and then put it all together in the end. Call it what you want (I call it "Pirannah development"), the simple fact is that it works.


So let's look at the solution:
#1 (1st restructure of list) -This shouldn't be too hard. All we need to do is grab the end "block" of each element and move it to the beginning. This is coded as:
vRestructure1 := @Transform(Speaker; "s"; @If(@Contains(s; " "); @RightBack(s; " ") + " " + @LeftBack(s; " "); s));


#2 (sort the list) -This isn't that difficult at all, just a simple call:
vSorted := @Sort(vRestructure1);


Ok, we've got the first two taken care of, now all that is left is
#3 (restructure sorted list) - All we need to do here is essentially reverse the code from step 1:
vRestructure2 := @Transform(vSorted; "s"; @If(@Contains(s; " "); @Right(s; " ") + " " + @Left(s; " "); s));


Well, that was pretty easy. Let's have some fun and put it all into a single line (which gives us nested @Transforms with an @Sort thrown in for good measure -can you feel the GEEK POWER??
vNewList := @Transform(@Sort(@Transform(Speaker; "s"; @If(@Contains(s; " "); @RightBack(s; " ") + " " + @LeftBack(s; " "); s))); "s"; @If(@Contains(s; " "); @Right(s; " ") + " " + @Left(s; " "); s));


For those of you who are interested, I've put this into my SortDemo database, which is available in the downloads area. Hope this helps!


-Devin

Comments

Gravatar Image1 - Devin -

Change "Mary Jane van der Welten" to "Mary Jane van der Zelten" and see what happens. (Hint: in Ben's example, it should sort on the "v", not the "Z")

- Julian

Gravatar Image2 - Julian is right. Your first definition is incorrect, since the last name is either the last word or starts with the first word that has a lowercase letter. If it were not for that, I could use

last := @RightBack(Speaker; " ")+", "+@LeftBack(Speaker; " ");
@RightBack((@Sort(last)+"|")+Speaker); "|");

Gravatar Image3 - @Julian -Ahhh, I see. Changing her name to "Mary Jane van der Zelten" causes her to sort to the last position on the list. Which is correct according to the conditions set for the sort, but incorrect according to our "collective agreement" regarding the definition of a last name. Mary's last name is not "Zelten", but is in fact "van der Zelten". Which of course means that she should sort between Joe Litton and Cindy Lou Who.

@Ben - I think you are right. The "collective agreement" regarding the last name should be defined as: For any given entry; if there are no words beginning with a lowercase character, then the last word, otherwise, the first word beginning with a lowercase character and all words that follow it. Guess I'm going to have to work out a new formula.

Thanks,
-Devin.


Gravatar Image4 - I'm looking forward to it. Sorry I fuzzed on the first set of names and didn't notice that the "van der Welton" would sort where it did anyway. Duh! I must admit, this is turning into a fun exercise.

Gravatar Image5 -

Ok, it looks like I've figured it out.

First thing we need to do is to redefine our conditions.   For any given entry:

  1. "Words" are defined as blocks of text, delimited by a blank space character.
  2. The "last name" is defined as the last word, except in cases where one or more words begin with a lowercase character. In this case, the last name will be defined as the first word beginning with a lowercase character, along with all subsequent text (words and delimiters) for that entry.

Whew, creating an explicit definition of something we all "just get" can be a bit tiresome. Anyway, now that we have our rules, we can go ahead and release our Pirannahs (Parannahie?). The tasks we need to here are identical to before.

  1. Restructure each element of the list such that the last name becomes the first part of the element.
  2. Sort the list alphanumerically.
  3. Restructure each element of the list such that it follows the original format.

What's that you say? How can the tasks be the same? Well, remember here that we have changed our definition of last name. This means that while our tasks are the same, the implementation of those tasks is what has changed. All right, let us examine the code:
#1 (1st restructure of list) - All right, I admit it. This code is a bit tricky, but not too convoluted. The thing here is to take it slow and think carefully about what needs to be done. For any given entry, we need to move the last name to the beginning of the entry. But in this case, the definition of "last name" is what makes this seem hard, when it really isn't that difficult.
TIP When working out a formula that needs to work on a list, I often find that it is easier to "work the code" for a single element, and then apply what I have done to the entire list. So let's break this down a bit. For any given element of the list, we need to

  1. Determine if any word within the name begins with a lowercase letter
  2. If yes, then find that word, and set it along with the rest of the text as the last name
  3. If no, then grab the last word of the name, and set it as the last name
  4. Reorder the text such that the last name (plus a blank space) is now at the beginning. Remember to trim it up so that extra blank spaces are removed.

OK, let us work the code for these steps. Assume that the variable s represents an individual name element from the names list.
#1 (Does a lowercase word exist?) -This is simple. There is a built in @Function designed just for this: @If(@Matches(s; "* {a-z}*"); A word exists; No lowercase-beginning words found);


#2 (A word exists) -Ok, this is the trickiest code chunk in this whole thing. But even this isn't too bad. What we're going to do here is explode the name into a list, and then loop through this list until we find the first word that begins with a lowercase character. Once we find that, we will build our last name string consisting of that word and all subsequent words in the list. The words prior to this word will be concatenated together to form the first name part of the name. Then, we will return the last name followed by the first name (with a space in between).
Don't panic -it really isn't that bad. Here is the code:

vLastName := "";
vFirstName := "";
s := @Explode(s; " ");
@For(i := 1; i <= @Elements(s); i := i + 1;

vWord := s[i];
@If(vLastName = "";

@If(@Matches(vWord; "{a-z}*");vLastName := vLastName + vWord + " "; vFirstName := vFirstName + vWord + " ");
vLastName := vLastName + vWord + " ")

);

@Trim(vLastName) + " " + @Trim(vFirstName)


There, that wasn't that bad, now was it? Ok, let's move on to the next condition.


#3 (No lowercase-beginning words found) -Ok, this is simple. We just use the same code chunk from the original solution: @If(@Contains(s; " "); @RightBack(s; " ") + " " + @LeftBack(s; " "); s));


OK, now we just combine our individual code chunks into a single statement. [HINT] We need to wrap the code for #2 inside an @Do() construct.
@If(@Matches(s; "* {a-z}*"); @Do( vLastName := ""; vFirstName := ""; s := @Explode(s; " "); @For(i := 1; i <= @Elements(s); i := i + 1; vWord := s[i]; @If(vLastName = ""; @If(@Matches(vWord; "{a-z}*"); vLastName := vLastName + vWord + " "; vFirstName := vFirstName + vWord + " "); vLastName := vLastName + vWord + " ")); @Trim(vLastName) + " " + @Trim(vFirstName)); @If(@Contains(s; " "); @RightBack(s; " ") + " " + @LeftBack(s; " "); s));


Ok, we have the code for a single element. Now all we need to do is throw it into an @Transform function, and we have solved Task #1. Here it is:
vRestructure1 := @Transform(Speaker; "s"; @If(@Matches(s; "* {a-z}*"); @Do( vLastName := ""; vFirstName := ""; s := @Explode(s; " "); @For(i := 1; i <= @Elements(s); i := i + 1; vWord := s[i]; @If(vLastName = ""; @If(@Matches(vWord; "{a-z}*"); vLastName := vLastName + vWord + " "; vFirstName := vFirstName + vWord + " "); vLastName := vLastName + vWord + " ")); @Trim(vLastName) + " " + @Trim(vFirstName));



   FEEL THE GEEK POWER!!!   


All right, let's finish this bad boy up.
#2 (sort the list) -This isn't that difficult at all, just a simple call: vSorted := @Sort(vRestructure1);

Ok, we've got the first two tasks taken care of, now all that is left is to:
#3 (restructure sorted list) - I realized that using an @Transform here isn't necessary. At this point in the code, we have 3 lists: Speaker, vRestructure1 and vSorted. All we need to do is run a simple @Replace here to get our names back to the original format.
vRestructure2 := @Replace(vSorted; vRestructure1; Speaker);


Well, that's pretty much it. I've put this code (as well as Ben's and Julian's solutions) into the SortDemo database (available in the downloads area) for y'all to play with.


Hope this helps!
-Devin


Gravatar Image6 -

Whoops! I forgot to mention that in the SortDemo database, there is also a demonstration of using the [CustomSort] option with a call to @Sort.


While this works, it is (as Ben had already suspected) very ineffecient when compared to the transform-sort-reconstruct methodology. (Download the demo to test it out)


-Devin.

Gravatar Image7 - Very nice, Devin. Very nice.

- Julian

Gravatar Image8 - muchos gracias, mi amigo.
-Devin.

Gravatar Image9 - I'm afraid I've unleashed a monster.

Search

Wowsers! A Tag Cloud!

Techie Stuff

Links

MiscLinks