By Peter F. Whyte, 5 Feb 2009
Original article: REALbasic University Lesson 16: Count Emails (30 May 2001)
Indented quoted sections are taken from the original article.
The second program I wanted to demonstrate is one I wrote one day while studying my Stone Table Software customer list. I was looking at all the emails — many from foreign countries — and I started wondering what percentage of my customers come from which countries. Email addresses and domains aren’t a surefire way to determine a person’s origin, but they can tell you a little.
So I sat down and wrote this little program which accepts a text file of emails (one email per line) and sorts and counts them by extension (.com, .net, etc.).
I’ve updated the program for the version of REALbasic noted above, and altered it to open a file and display a count of the total emails in the list. Drag and drop doesn’t seem to work on Windows.
To start, create a new project in REALbasic. Using the default Window1 that’s created for you, drag a listbox onto it and name it emaillist, since the code refers to it by that name. Make the listbox big so it fills all of Window1.
For emaillist’s properties, check the HasHeading option and put “3” for the number of columns. For columnWidths, put “50%, 25%, 25%”. (That will divide emaillist into three columns, the first at a width of 50% and the others at 25%.) Your properties window should look like this:
Double-click emaillist to open the Code Editor.
Go to emaillist’s Open event. Here we’re going to put in the following:
Source code for Window1.emaillist.Open:

    me.Heading(-1) = "Domain" + chr(9) + "Quantity" + chr(9) + "%"
This names the listbox’s headers.
Now go to the DropObject event. This is a major part of the program. What happens is the user drops a text file on the control, so first we check to make sure they dropped a file and then we open it. We load up theList array with each line from the text file, then close it. We call a few other routines (stripList, countDuplicates, and updateList) and then we’re done.
Source code for Window1.emaillist.DropObject:

    dim in as textInputStream

    if obj.folderItemAvailable then
      if obj.folderItem <> nil then
        in = obj.folderItem.openAsTextFile
        if in <> nil then
          // Erase the arrays
          redim theList(0)
          redim itemCount(0)

          do
            theList.append lowercase(in.readLine)
          loop until in.EOF
          in.close

          stripList
          countDuplicates
          updateList
          me.headingIndex = 1 // sorts by quantity
        end if
      end if
    end if
Since Drag and Drop does not seem to work on Windows, I’ve added a PushButton to enable a file to be opened. The code for DropObject has been moved to its Action event and modified as follows.
Source code for Window1.openfile.Action:

    dim f as FolderItem
    dim s as TextInputStream

    f = GetOpenFolderItem("*.txt")
    if f <> nil then
      // erase the arrays
      ReDim theList(0)
      ReDim itemCount(0)

      s = f.OpenAsTextFile
      while not s.EOF
        theList.Append Lowercase(s.ReadLine)
      wend
      s.Close

      stripList
      countDuplicates
      updateList
      emaillist.HeadingIndex = 1 // sorts by quantity
    end if
We need a file (FolderItem) and a text stream (TextInputStream).
GetOpenFolderItem("*.txt") shows an open-file dialog and filters the file list to text files with the “.txt” extension.
The loop works the same way as in the original, but on the file just opened.
The HeadingIndex for sorting just needs to reference the correct control now that it has been moved from the listbox.
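For readers without REALbasic at hand, the loading step can be sketched in Python. This is an illustration only and not part of the original project: the function name load_addresses is made up here, and the unused placeholder at index 0 is an assumption that imitates the article's arrays, whose data starts at index 1.

```python
from pathlib import Path


def load_addresses(path):
    """Read one address per line, lowercased, into a 1-based list."""
    # Index 0 is an unused placeholder, imitating `ReDim theList(0)`
    # followed by Append in the REALbasic code (an assumption of this sketch).
    the_list = [""]
    with Path(path).open(encoding="utf-8") as stream:
        for line in stream:
            # Strip the line ending, then lowercase, as Lowercase(s.ReadLine) does.
            the_list.append(line.rstrip("\r\n").lower())
    return the_list
```

Lowercasing at load time means that “COM” and “com” are later counted as the same extension.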
We’re going to need to create all those methods now, so let’s get started. None of the methods have any parameters or return any values, so they’re simple. I find it easier to create them all first, then go back and fill in the details later. So go to the Edit menu and choose “New Method” a bunch of times and use the following names for the methods:

    stripList
    countDuplicates
    updateList
When you’ve got all three methods ready, let’s put in the code, starting with stripList. It’s a routine that deletes everything but the email extension from the email addresses. The algorithm used is a simple one: it looks for the last field delimited by a period. Since all email addresses end in a dot-something, the text to the right of the last dot is saved and everything else is thrown away (stripped). This way our list of emails is pared down to simple “com”, “net”, and other extensions.
In other words, this:

    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org

becomes this:

    com
    org
    com
    org
Source code for Window1.stripList:

    dim n, i as integer

    n = uBound(theList)
    for i = 1 to n
      theList(i) = nthField(theList(i), ".", countFields(theList(i), "."))
    next
How does the above work? Well, the key line is the code in the middle of the for-next loop. To understand it better, let’s mentally run it the way the computer would, using a sample address.
Let’s say we have an address line “moron@earthlink.net” (theList(i) is equal to “moron@earthlink.net”). That means the key line in the above is really saying this:

    theList(i) = nthField("moron@earthlink.net", ".", countFields("moron@earthlink.net", "."))

Process the end of it first. The code countFields("moron@earthlink.net", ".") will return a number, the number of fields in the phrase delimited by a period. Since there is only one period in the phrase, there are therefore two fields (“moron@earthlink” and “net”, everything to the left and right of the period). So replace the countFields code with the number 2 (the result).

    theList(i) = nthField("moron@earthlink.net", ".", 2)

So now our code is effectively saying, “Put field number 2 (the last) of ‘moron@earthlink.net’ into theList(i).” Since the second field is “net” we have effectively changed “moron@earthlink.net” to “net”!
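The same last-field trick can be tried outside REALbasic. The Python sketch below is an illustration only: count_fields and nth_field are hypothetical stand-ins written for this example, and they are not guaranteed to match REALbasic's countFields and nthField in every edge case.

```python
def count_fields(text, delimiter):
    """Number of fields in `text` when it is split on `delimiter`."""
    return len(text.split(delimiter))


def nth_field(text, delimiter, n):
    """Return field number `n`, counting from 1; empty string if out of range."""
    fields = text.split(delimiter)
    return fields[n - 1] if 1 <= n <= len(fields) else ""


address = "moron@earthlink.net"
last = count_fields(address, ".")          # one period, so two fields
extension = nth_field(address, ".", last)  # the last field: "net"
```

Because the field number comes from the count, an address with several periods still yields only its final field.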
CountDuplicates is a little more complicated. It loops through the entire list and counts how many of each kind exist. If it finds a duplicate, it does two things: it deletes the duplicate from theList and increments the number stored in that item’s count. (The item’s count is stored in the itemCount array.)
Note how we don’t use a for loop for the outermost loop; I used a while loop instead. I did this because the size of the loop changes: when we find a duplicate item we delete it, reducing the number of items in the list. Using a while loop means that the computer will keep repeating through the list until it reaches the last unique item.
Source code for Window1.countDuplicates:

    dim total, n, i, j, found as integer

    i = 1
    while i < uBound(theList)
      redim itemCount(uBound(theList))
      found = 0
      n = uBound(theList)
      for j = 1 to n
        // Check through
        if theList(i) = theList(j) and i <> j then
          found = j
        end if
      next // j
      if found > 0 then
        itemCount(i) = itemCount(i) + 1
        theList.remove found
      else
        i = i + 1
      end if
    wend

    // Resize once more so itemCount matches the shortened list,
    // even when the loop ends right after a removal
    redim itemCount(uBound(theList))

    // Figure out percentages
    n = uBound(itemCount)
    total = 0
    for i = 1 to n
      total = total + itemCount(i) + 1
    next // i

    redim percentList(n)
    for i = 1 to n
      percentList(i) = ((itemCount(i) + 1) / total)
    next // i

    // Report total
    totalCount.Text = "Total: " + str(total)
The final part of CountDuplicates is where we calculate the percentage of each kind of email. To do that we first need a total: not the number of elements in the list, but the total number of emails. So we count through the itemCount array, adding up each item. Since an itemCount element contains a zero if there’s only one of that item, we add a one to it to get the actual count.
We finish by initializing the percentList array to the size of our list, then set each element in the array to the appropriate percentage. Note that we again add one to the itemCount value.
I’ve added a line to report the total items in the list to a StaticText field called totalCount.
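To see the counting and percentage arithmetic at work on a small list, here is a Python sketch of the same idea. It is an illustration under stated assumptions, not the article's code: the function name count_duplicates, the returned tuple, and the placeholder at index 0 (to imitate the 1-based arrays) are all choices made for this example.

```python
def count_duplicates(the_list):
    """Collapse duplicates in place; return (extra_counts, percents, total).

    `the_list` is treated as 1-based: index 0 is an unused placeholder.
    extra_counts[i] holds the number of *additional* copies of the_list[i],
    so the real count of an item is extra_counts[i] + 1.
    """
    extra_counts = [0] * len(the_list)
    i = 1
    # A while loop, because the list shrinks as duplicates are removed.
    while i < len(the_list) - 1:  # len - 1 plays the role of uBound
        found = 0
        for j in range(1, len(the_list)):
            if the_list[i] == the_list[j] and i != j:
                found = j  # remember the last duplicate seen
        if found > 0:
            extra_counts[i] += 1
            del the_list[found]
        else:
            i += 1
    # Trim the counts so they line up with the shortened list.
    del extra_counts[len(the_list):]
    total = sum(extra + 1 for extra in extra_counts[1:])
    percents = [0.0] + [(extra + 1) / total for extra in extra_counts[1:]]
    return extra_counts, percents, total


items = ["", "com", "net", "com", "uk", "com"]
extra, pct, total = count_duplicates(items)
# items is now ["", "com", "net", "uk"]; "com" has 2 extra copies, total is 5
```

The sketch trims the counts after the loop so that a removal on the final pass cannot leave a stray slot that would inflate the total.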
Our final method for the program is the updateList method. This is used to display the information in emaillist. It first deletes any existing rows in emaillist (if you didn’t do that, dropping a second file on the existing listbox would display the wrong content), then counts through the list of items.
For each item in the list, it puts the content in the appropriate column of the listbox. The first item is in the addRow command: that’s the “.com” or whatever was part of the email address. Once we’ve added a row, we don’t want to add another, we want to work with the row we just added. So we use the lastIndex property which contains the index number of the last row we added. We use the cell method to put the actual item count (again, we add one to it) and the percentage into the correct columns.
Source code for Window1.updateList:

    dim n, i as integer

    emaillist.deleteAllRows

    n = uBound(theList)
    for i = 1 to n
      emaillist.addRow theList(i)
      emaillist.cell(emaillist.lastIndex, 1) = format(itemCount(i) + 1, "000")
      emaillist.cell(emaillist.lastIndex, 2) = format(percentList(i), "00.#%")
    next
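The row-building step can be approximated in Python as well. This sketch is only an illustration: build_rows is a hypothetical name, the percent formatting is a rough stand-in for REALbasic's "00.#%" rather than an exact match, and the idea that zero padding helps a text-based column sort order the counts numerically is an assumption of this example.

```python
def build_rows(domains, extra_counts, percents):
    """Return (domain, quantity, percent) text rows for 1-based inputs."""
    rows = []
    for i in range(1, len(domains)):
        # Add one, because the stored value counts only the extra copies.
        quantity = f"{extra_counts[i] + 1:03d}"   # zero-padded, like "000"
        percent = f"{percents[i] * 100:04.1f}%"   # approximation of "00.#%"
        rows.append((domains[i], quantity, percent))
    # Sort by the quantity column as text; padding keeps "010" after "002".
    rows.sort(key=lambda row: row[1])
    return rows


rows = build_rows(["", "com", "net", "uk"], [0, 2, 0, 0], [0.0, 0.6, 0.2, 0.2])
```

Each row carries the same three columns the listbox shows: domain, quantity, and percentage.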
Whew! Our program’s almost done: we just need to add our arrays as properties to Window1. So with the Window1 code editor window visible, go to the Edit menu and choose “New Property”. You’ll do this three times, adding the following properties:

    itemCount(0) as integer
    percentList(0) as double
    theList(0) as string
Save your program and run it. It should display a simple window with a listbox in it.
Oh no! You need some email addresses to test this, don’t you? Okay, here’s a list of hopefully fictitious addresses I made up. (I tried to use a variety of extensions.) Save them to a text file and you’ll be able to drag the text file to the listbox. (Option-click on this link to save the text file to your hard drive.)
    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org
    email@example.com
    firstname.lastname@example.org
After you drag the file to the listbox, your window should look similar to this:
The revised Windows program will look like this after the test file has been loaded:
Another great interface tool of modern computing is the ContextualMenu. These are hidden menus that pop up when a user clicks the mouse while holding down the Control key. Ideally they live up to their name and are actually contextual: that is, they change depending on the user’s current situation.