Newsletter
First published: Oct. 7, 2008, 4:42 p.m. MDT
Last edited: Oct. 7, 2008, 9:18 p.m. MDT
Last edited: Oct. 7, 2008, 9:18 p.m. MDT
Primes
Update: oh my goodness. Had I just checked the Wikipedia's entry on Prime numbers, I would have been reminded of fundamental things like the Sieve of Eratosthenes (which I of course forgot the name of, but the idea is featured prominently in my parallel algorithm, mentioned at the end of this post) and newer improvements, along with a ton of other incredibly interesting facts, and even a prime distribution graph! Wikipedia: +1, JT: 0In my static analysis meeting, my professor noted that indexing objects with primes led to an easy set representation scheme, in which a set is simply the product of the prime indices of the objects in that set. I thought this was pretty fascinating. I started wondering about the constraints involved in implementing such a scheme, and inevitably this led to rather worthless homework procrastination that I thought I'd share here.
First, I was curious about prime number distribution over the integers. I made a graph.
So, how to read the graph. At a given point X on the X-axis, there are Y-many prime numbers within a diameter of 100,000 around X. In other words, it's a histogram of all primes below 1 billion, displayed with 10,000 buckets.
Second (but really a prerequisite to the first), I wrote a program to quickly generate as many prime numbers as possible.
- #include <iostream>
- #include <list>
- #include <math.h>
- using namespace std;
- // this program finds all primes <= 1 billion and notes if they are twin primes
- // or mersenne primes.
- // compile with (on JT's machine):
- // g++ -O3 -march=nocona -funroll-loops -fomit-frame-pointer -o primes primes.cpp
- inline bool is_mersenne(int val) {
- // so, ((x - 1) & x) is 0 iff x is a power of 2. See
- // http://www.cprogramming.com/tutorial/powtwosol.html for a good
- // explanation of why.
- // since a mersenne number should be of the form 2^n - 1,
- // (x & (x + 1)) is 0 iff x is of form 2^n - 1.
- return !(val & (val + 1));
- }
- int main(int argc, char** argv) {
- // strategy: as we calculate primes, we store previous primes so we only
- // have to check divisibility by known primes to know if a new number is
- // prime.
- list<int> primes;
- // largest_number is the largest number we will check for primeness
- int largest_number = 1000000000; // 1,000,000,000
- // we use last_prime to keep track of if the last number we checked was
- // prime. this way determining twin primes is easy.
- bool last_prime = true;
- // book keeping
- bool factor_found;
- int sqrt_i;
- // we start with 3, because 3 and every prime after is odd, so we can
- // increment by 2.
- primes.push_back(2);
- // we don't check divisibility by 1, since that's always true, but we output
- // it and 2 as primes.
- cout << 1 << endl << 2 << endl;
- // check all odd numbers starting with 3 until the largest number
- for(int i = 3; i <= largest_number; i += 2) {
- // no factor found yet.
- factor_found = false;
- // store the square root of the number we're checking. we only need to
- // check up to the square root.
- sqrt_i = sqrt(i);
- // iterate over previously calculated primes up to the square root of
- // the current number. note that we don't have to check if the iterator
- // is primes.end() because there is empirically always a prime stored
- // greater than the square root of the number we're checking. i wonder
- // if there's a proof there. it would probably be of the form of proving
- // that for a given integer i, a prime p exists such that i <= p < i^2
- for(list<int>::iterator it(primes.begin()); *it <= sqrt_i; ++it) {
- // if the current number mod the current prime is 0, we found a
- // factor. not prime, break out.
- if(i % *it == 0) {
- factor_found = true;
- break;
- }
- }
- if(!factor_found) {
- // no factor found! this is a prime. store it.
- primes.push_back(i);
- // output the prime
- cout << i;
- // is it a twin prime? (p, p+2)
- if(last_prime) cout << " *";
- // is it a mersenne prime? (2^n-1)
- if(is_mersenne(i)) cout << " +";
- cout << endl;
- // keep track of if the last number was prime.
- last_prime = true;
- } else {
- last_prime = false;
- }
- }
- return 0;
- }
Third, I think I've figured out a pretty clever algorithm for parallelizing the computation of primes. I was inspired by software pipelining. I hope to actually write up the algorithm to check its effectiveness. I'll post it here if/when I do. Stay tuned.