Stretched Out Software, Inc.
ServicesPortfolioProducts
NewsHistoryContactTech Articles

Using Prime Numbers for Category Exclusivity
by Greg O'Lone, Stretched Out Software Inc.
Updated March 2005

Background:

We had a customer not too long ago who wanted to create a vendor directory on his web site. He wanted the standard stuff, of course (Name, Address, Tel, URL, etc) and he also wanted to be able to choose a category (as with a telephone directory, for example) for each of the vendors. The catch was this: he wanted the ability to file any vendor in multiple categories, and he wanted the ability to add/edit/remove categories at will, without needing to call someone to do it for him.

The Problem:

  • How many categories will the customer want to add?
  • How many simultaneous categories will the customer want to select?
  • Keep the storage requirements small

Our solution:

After looking at several options (including just having a field which listed the categories), we settled on a system using prime numbers.

A prime number is a positive integer which is divisible only by one and itself. e.g. 2, 3, 5, 7 and 11 are all prime. 6 is not prime because it is also divisible by 2 and 3 (which, by the way ARE prime).
There is a certain amount of magic behind the prime numbers, because when multiplying two primes together, you will never get another prime. Like the theory behind the autonumber fields, you will never get two alike...even by multiplying them together.

Here's approximately how it goes...

Make a table to hold the categories with three fields:

1 CategoryID Autonumber
2 CatPrime Integer
3 CatName String

Each item in this table would be assigned the next consecutive available prime number starting with 2.

CategoryID CatPrime CatName
1 2 Architects
2 3 Banquets
3 5 Carpets
4 7 Dentists
5 11 Employment
etc...    

When refering to a category in this list, use the CatPrime instead of the autonumber and multiply them together. If you wanted to file something under Carpets and Employment, you would use a value of 55 (5 * 11). Architects, Banquets and Dentists would be 42 (2*3*7).

Make a table to hold the vendor information:

  Field Type   Example
1 VendorID Autonumber   1
2 Name String   Joe's Carpets
3 Address String   123 Main Street
. ... ...   ...
n CatValue Long Integer or Double Float   55

Now here's the real magic of this system...to find out if one of the vendors is in a specific category, you would compare the vendor's CatValue (55) against an individual category (lets say Carpets, which has a value of 5) using the modulo operator. (The modulo operator detects whether a number is evenly divisible by a specified value, and provides what is known in mathematics as a "remainder".) If you get a result of zero, the vendor would be in this category. If you get anything else, they're not. 55 modulo 5 equals zero. If we try this same check with another category like Dentists (7), 55 modulo 7 equals 6.

The modulo operator returns the remainder when the first operand is divided by the second. e.g. 5 modulo 2 = 1
In Javascript and MSSQL the operator is %.

In VBScript you would use mod.

Note that the modulo operator is available in an SQL statement. You can use it to filter data just like any other operator.

There ARE some limitations to this theory:

If you decide to work with an unsigned long integer (max 4294967295) for the CatValue, and the customer wishes to select all of the categories, you are limited to 9 categories because 2*3*5*7*11*13*17*19*23*25 overflows the long integer. You can however use a double floating point number (max ????) to extend your range a bit...to a maximum of 131 categories. It is important to remember that the overflow values vary between operating systems, and you should check this limit before using this theory. Another thing to remember is that if you set the CatValue to 1, the modulo operation will always return 1 and CatValue of 0 will always return 0.

How do you find the next prime number in a sequence? Glad you asked. To speed things up a bit, we decided to pass the last maximum prime number in use to the function, but you'll get the idea:

Javascript

VBScript

Conclusion:

We've used this technique in several capacities including resource directories (like the example above), custom computer system builders (matching motherboards with CPU's & Hard Drives), Mailing list categories, etc.

March 2005 - UPDATE: Another project came up which required us to fine-tune the routines to find the next prime number. I made an assummption above which I shouldn't have, but the fix makes the routine even more efficient!

If you look at the lines where the for-next loops are defined, the upper limit should be numberToCheck/i, and we should only bother checking prime divisors! Here's the theory:

Lets say we are checking to see if 65 is prime (which it is not, but any number will do), lets see how the new algorithm would check the number...
     
i
numberToCheck/i
numberToCheck mod i
2
32
1
3
21.6667
2
5
13
0

This number is not prime because 65 / 5 = 13, but our old system would have told us this in 32 steps instead of 3, which for the most part is not a really big deal... now lets check a bigger number that is prime... 997
Old System New System
i
997 / 2
997 mod i
2
498.5
1
3
498.5
1
4
498.5
1
5
498.5
2
6
498.5
1
7
498.5
3
8
498.5
5
9
498.5
7
.
.
.
.
.
.
496
498.5
5
497
498.5
3
498
498.5
1
i
997 / i
997 mod i
2 498.5 1
3 332.333 1
5 199.4 2
7 142.429 3
11 90.636 7
13 76.692 9
17 58.647 11
19 52.474 9
23 43.348 8
29 34.379 11
31 32.161 5
-------- --------------- ----------------
33 30.212 7
The old system would have gone through 498 steps just to find out that the number is prime!

The new one only required 11 steps†! You'll notice that I've continued one step further. What I wanted you to see is the 997 / i column. Once the results of the division start dipping below the divisor, we can stop because those numbers have already been checked!

†Ok, granted, the first time through DOES require 32 steps, but once we've figured out which numbers are prime, we should store them for later use and hey, 31 is still better than 498, right?

So how do we impliment this?

First, I suggest defining a global array of integers to store numbers that you've already checked. I've even been known to write the results to a file so I can reuse the resuts later and then loading them when I need them.

So lets look at the new code (as of today, I've only coded this in Basic, but you'll get the idea):

I've defined primes as a global array of integers

dim primes() as integer

-------------------------------------------------------------------

function isPrime(n as integer) as boolean
'Checks to see if n is prime

dim p,j as integer
dim tf as boolean

'Check the global array first
p = primes.IndexOf(n) 'Check to see if n is already in primes (do not use indexOf in Javascript!)
if p>-1 then
'If we found it, return true
return true
else
'Otherwise we've got to check

tf = true 'Assume that it is prime
for j = 2 to n/j
if isPrime(j) then 'Check to see if the divisor is prime
if n mod j = 0 then 'Check to see if the number is evenly divisible by the current prime
tf = false 'The number is not prime
exit
end if
end if
next j
'If the number is prime, add it to the global array so we can use it later
if tf then primes.append(n)
end if
return tf
end function

-------------------------------------------------------------------



Home - Services - Portfolio - History - Contact - Tech Articles --- Legal Notices - E-Pay Invoices