Newsletter

First published: Oct. 7, 2008, 4:42 p.m. MDT
Last edited: Oct. 7, 2008, 9:18 p.m. MDT

Primes

Update: oh my goodness. Had I just checked the Wikipedia's entry on Prime numbers, I would have been reminded of fundamental things like the Sieve of Eratosthenes (which I of course forgot the name of, but the idea is featured prominently in my parallel algorithm, mentioned at the end of this post) and newer improvements, along with a ton of other incredibly interesting facts, and even a prime distribution graph! Wikipedia: +1, JT: 0

In my static analysis meeting, my professor noted that indexing objects with primes led to an easy set representation scheme, in which a set is simply the product of the prime indices of the objects in that set. I thought this was pretty fascinating. I started wondering about the constraints involved in implementing such a scheme, and inevitably this led to rather worthless homework procrastination that I thought I'd share here.

First, I was curious about prime number distribution over the integers. I made a graph.
prime number distribution
So, how to read the graph. At a given point X on the X-axis, there are Y-many prime numbers within a diameter of 100,000 around X. In other words, it's a histogram of all primes below 1 billion, displayed with 10,000 buckets.

Second (but really a prerequisite to the first), I wrote a program to quickly generate as many prime numbers as possible.
  1. #include <iostream>
  2. #include <list>
  3. #include <math.h>
  4.  
  5. using namespace std;
  6.  
  7. // this program finds all primes <= 1 billion and notes if they are twin primes
  8. // or mersenne primes.
  9. // compile with (on JT's machine):
  10. // g++ -O3 -march=nocona -funroll-loops -fomit-frame-pointer -o primes primes.cpp
  11.  
  12. inline bool is_mersenne(int val) {
  13.     // so, ((x - 1) & x) is 0 iff x is a power of 2. See
  14.     // http://www.cprogramming.com/tutorial/powtwosol.html for a good
  15.     // explanation of why.
  16.     // since a mersenne number should be of the form 2^n - 1,
  17.     // (x & (x + 1)) is 0 iff x is of form 2^n - 1.
  18.     return !(val & (val + 1));
  19. }
  20.  
  21. int main(int argc, char** argv) {
  22.  
  23.     // strategy: as we calculate primes, we store previous primes so we only
  24.     // have to check divisibility by known primes to know if a new number is
  25.     // prime.
  26.     list<int> primes;
  27.  
  28.     // largest_number is the largest number we will check for primeness
  29.     int largest_number = 1000000000; // 1,000,000,000
  30.  
  31.     // we use last_prime to keep track of if the last number we checked was
  32.     // prime. this way determining twin primes is easy.
  33.     bool last_prime = true;
  34.  
  35.     // book keeping
  36.     bool factor_found;
  37.     int sqrt_i;
  38.  
  39.     // we start with 3, because 3 and every prime after is odd, so we can
  40.     // increment by 2.
  41.     primes.push_back(2);
  42.  
  43.     // we don't check divisibility by 1, since that's always true, but we output
  44.     // it and 2 as primes.
  45.     cout << 1 << endl << 2 << endl;
  46.  
  47.     // check all odd numbers starting with 3 until the largest number
  48.     for(int i = 3; i <= largest_number; i += 2) {
  49.  
  50.         // no factor found yet.
  51.         factor_found = false;
  52.  
  53.         // store the square root of the number we're checking. we only need to
  54.         // check up to the square root.
  55.         sqrt_i = sqrt(i);
  56.  
  57.         // iterate over previously calculated primes up to the square root of
  58.         // the current number. note that we don't have to check if the iterator
  59.         // is primes.end() because there is empirically always a prime stored
  60.         // greater than the square root of the number we're checking. i wonder
  61.         // if there's a proof there. it would probably be of the form of proving
  62.         // that for a given integer i, a prime p exists such that i <= p < i^2
  63.         for(list<int>::iterator it(primes.begin()); *it <= sqrt_i; ++it) {
  64.  
  65.             // if the current number mod the current prime is 0, we found a
  66.             // factor. not prime, break out.
  67.             if(i % *it == 0) {
  68.                 factor_found = true;
  69.                 break;
  70.             }
  71.  
  72.         }
  73.  
  74.         if(!factor_found) {
  75.             // no factor found! this is a prime. store it.
  76.             primes.push_back(i);
  77.  
  78.             // output the prime
  79.             cout << i;
  80.  
  81.             // is it a twin prime? (p, p+2)
  82.             if(last_prime) cout << " *";
  83.  
  84.             // is it a mersenne prime? (2^n-1)
  85.             if(is_mersenne(i)) cout << " +";
  86.  
  87.             cout << endl;
  88.  
  89.         // keep track of if the last number was prime.
  90.             last_prime = true;
  91.         } else {
  92.             last_prime = false;
  93.         }
  94.  
  95.     }
  96.  
  97.     return 0;
  98. }

Third, I think I've figured out a pretty clever algorithm for parallelizing the computation of primes. I was inspired by software pipelining. I hope to actually write up the algorithm to check its effectiveness. I'll post it here if/when I do. Stay tuned.