Monday, September 14, 2009

Segmented Sieve


Memory and time efficient :)



Problem Statement:


Your are given two integers a and b. You have to find all the primes within range a and b. Here, 1 ≤ a ≤ b ≤ 231-1 and b - a ≤ 105.

Note: You have to handle 1, 2 and even numbers for appropriate case of your own.

Solution:



#include <string.h>

#define MAX 46656
#define LMT 216
#define LEN 4830
#define RNG 100032

unsigned base[MAX/64], segment[RNG/64], primes[LEN];

#define sq(x) ((x)*(x))
#define mset(x,v) memset(x,v,sizeof(x))
#define chkC(x,n) (x[n>>6]&(1<<((n>>1)&31)))
#define setC(x,n) (x[n>>6]|=(1<<((n>>1)&31)))

/* Generates all the necessary prime numbers and marks them in base[]*/
void sieve()
{
unsigned i, j, k;
for(i=3; i<LMT; i+=2)
if(!chkC(base, i))
for(j=i*i, k=i<<1; j<MAX; j+=k)
setC(base, j);
for(i=3, j=0; i<MAX; i+=2)
if(!chkC(base, i))
primes[j++] = i;
}

/* Returns the prime-count within range [a,b] and marks them in segment[] */
int segmented_sieve(int a, int b)
{
unsigned i, j, k, cnt = (a<=2 && 2<=b)? 1 : 0;
if(b<2) return 0;
if(a<3) a = 3;
if(a%2==0) a++;
mset(segment,0);
for(i=0; sq(primes[i])<=b; i++)
{
j = primes[i] * ( (a+primes[i]-1) / primes[i] );
if(j%2==0) j += primes[i];
for(k=primes[i]<<1; j<=b; j+=k)
if(j!=primes[i])
setC(segment, (j-a));
}
for(i=0; i<=b-a; i+=2)
if(!chkC(segment, i))
cnt++;
return cnt;
}

This is a sample program which demonstrates segmented sieve. Very fast and memory efficient version. 'base' is the array which holds the flags for all the primes upto √(231-1), i.e. the square-root of the max limit, and all the primes are stored in the 'primes' array. Later, these primes are used to determine whether a number is a composite or not within a certain range in the segmented sieve. To avoid overflow and sign bit problems, unsigned type is used.

A little explanation:


First of what what these macros mean?
#define MAX 46656
#define LMT 216
#define LEN 4830
#define RNG 100032

MAX is the sqrt of maximum possible input, in case of here, the maximum is integer range sqrt of which is almost MAX used here. So, MAX is not maximum allowed input, it is just sqrt of maximum input which is pretty big as 2147483647 i.e. 32 bit signed integer maximum.
LMT is sqrt of MAX. We all know, we run sieve upto sqrt MAX
LEN is the maximum possible different primes that can be stored using this algorithm with specific range defined as RNG, on which the segmented sieve will run and collect the primes out of it.

Now the next two vital macros:
#define chkC(x,n) (x[n>>6]&(1<<((n>>1)&31)))
#define setC(x,n) (x[n>>6]|=(1<<((n>>1)&31)))
And why we divide by 64:

Ok, yes, it is clearly bit shifting. But you must know what we do in bitwise sieve first in order to get this. Instead of using a whole array position to store just one flag, we can use its each 32 bits to store one flag, which saves memory by a factor 1/32. So its very logical to capture memory upto MAX/32, but why MAX/64 and RNG/64 here?
Because, we really have no point of handling the even number as 2 is the only even prime and we can handle it manually, without any stress. So, if we do not consider the even numbers at all, the total numbers are again reduced by a factor 1/2, isn't it? So what we get total is MAX/32/2 = MAX/64, same for RNG.

Now, the two macros chkC and setC is pretty straight forward. chkC checks if a specific bitflag is 1 or 0, and setC sets a specific bitflag 1 to mark it as a composite. They work similarly, so I will explain only the chkC part.

In bitwise sieve, where is a specific value n located? Obviously (n/32)th index, and on that index, (n%32)th bit from LSB (right hand side). But we just said before, we are interested with only odd numbers, so we map n with n/2 as follows:

Actual numbers
1 2 3 4 5 6 7 8 9 ........... n ;[n is odd]
| | | | | | | | | ........... |
0 x 1 x 2 x 3 x 4 ........... (n/2)
Position on which they are represented in the bitstrings.

So, n is actually n/2, which implies the previous statement: n's (actually n/2 's) position is in index [n/2/32] = [n/64] = [n>>6] and on the bit position (n/2)%32 = (n>>1)&31 ;[ We know, modding with a power of 2 is same as ANDing with (same power of 2)-1 ]. The rest is, how we check / set this specific bit. I have another post explaining these operations: Bitwise operations in C: Part 3, and obviously you can google it :)

36 comments:

  1. Thanks! After reading this article I managed to get AC on SPOJ's PRIME1.

    ReplyDelete
  2. great code dude...
    but i have some problems understanding it....
    if you don't mind can you explain me.... some of the following things....
    i want help in understanding the
    #define chkC(x,n) (x[n>>6]&(1<<((n>>1)&31)))
    #define setC(x,n) (x[n>>6]|=(1<<((n>>1)&31)))

    i know they are using bitshift and operators..... but what exactly they are trying to do can
    somebody make me understand.... why is it needed to divide n by 64 and why are they ANDing (n/2) with 31..... and why MAX=46656 and RNG=100032 and LEN=4830....
    LMT is ok b'cos it's sqrt(MAX)

    thanx in advance....

    ReplyDelete
  3. @Arun, I have added the description for the macros. I think you already got it as 1 and a half months passed and I was really busy. :(

    ReplyDelete
  4. For anyone who is wondering what the macros are doing, shifting to right (c>>n) divides c by 2^n. So 32>>2 divides 32 by 2^2(=4), which results in 8. So, 32>>2 = 8. the bit shifting in those macros divides by 64 to make the address aligned with the word size of the machine(64-bits). The AND operator is a fast modulus operation for power of two numbers. As long as Y is some power of two(2^n, 2,4,8,16,32,64,128, etc...) X%Y can be written as X&Y-1. So, for example N%32 can be written as N&31. Wikipedia is a golden source of information, and google helps too.

    ReplyDelete
  5. How do you print out the primes in the segment? Can you also post the main function of the program pls?

    ReplyDelete
  6. @Anonymous, you see the last loop in segmented sieve function? That is counting number of primes in the interval [a, b] mapping a -> 0 and b -> b-a.
    So, when the if is true, (you were counting i) just print a + i.
    That should do, and play around it a bit more. As you can do it in a more compact way.

    ReplyDelete
  7. Yes, I saw (print a+i), but as you said I needed to put some extra checks. Thank you, after some more work I made it on PRIME1.

    ReplyDelete
  8. can u please explain the for loop in the segmented_sieve function??

    ReplyDelete
  9. Can u please explain the first for loop in the segmented_sieve function??

    ReplyDelete
  10. @Kumar, how do we check if a number is prime or not? isn't it sufficient to try to divide it primes below its square root?

    ReplyDelete
  11. Yes, got ACC on SPOJ: PRIME1, after minor tweaks on your solution posted above. ta!

    ReplyDelete
  12. Hey Zobayer,Thanks for such nice postings, I ve learned a lot from your blogs. but i stuck in the part of segmented_sieve()(line 36-46), I debugged your code but its very confusing to me and i am unable to understand it.Can you please explain that part of code.

    ReplyDelete
  13. Have you tried reading previous comments?

    ReplyDelete
  14. Yes,I ve read all previous comments,I ve understood all your code lines except lines from 36-46. :(

    ReplyDelete
  15. Well, I am telling you shortly, hope you will get that.

    Here, we are assuming that we will need to find primes within a range [a, b]. Where a and/or b can be pretty large to store in memory, but b-a is a sufficient number, say 10^5, so, we can store them in memory, now, we can map a to 0 and b to b - a, am I right? That's why, when we are marking j, we are marking j - a, which represent j actually.

    Now, when a number is a prime? Answer: when no primes under square root of that number divides it. So, I have all the primes below the square root of max, here say b. So, we will cut out the multiples of primes which lie in range of a to b. Now, what is the first number which is a multiple of some prime p greater than a? That should be our starting point. Now just keep cutting until you reach b.

    Hope it is clear.

    ReplyDelete
  16. Awesome explanation with code,nice work!!

    ReplyDelete
  17. I have two questions.

    1. How do you unmark a marked position in the sieve?

    2. How do you flip (xor with 1) a position in the sieve?

    I'm trying to implement sieve of atkin with bit array.

    ReplyDelete
    Replies
    1. 1. if you mean unmarking as clearing a bit, i.e. set 0 to that bit, then create a mask with a 1 on corresponding position, the flip the bit pattern, then AND it with the flag value;
      say, to set 4th lsb (0 based index) to 0:
      m = ~(1 << 4); b = b & m

      2. shift a 1 to corresponding position and then xor the mask and your flag value;
      m = 1 << 4;
      b = b ^ m;

      Delete
  18. Hi, men. You did really great job with post! specially with all about efficient bitwise operations.

    ReplyDelete
  19. How this code will be modified for b-a<=10^6

    ReplyDelete
    Replies
    1. calculate appropriate values for
      #define MAX 46656
      #define LMT 216
      #define LEN 4830
      #define RNG 100032
      these macros

      Delete
  20. plz tell how to make this work for m,n<=10^12 and m-n=10^6..plz.

    ReplyDelete
  21. How did you write this algorithm? lol.

    ReplyDelete
    Replies
    1. With my keyboard ofc... I don't know why would you think of anything else.

      Delete
  22. I have some doubt in #define setC(x,n) (x[n>>6]|=(1<<((n>>1)&31))) . I have understood how chkC(x,n) is working.
    In case of setC :
    Suppose we have to set 4 rth bit as 1.
    Then we do 1<<4 .It means we are changing 1 to 1000 and assigning it to x[n>>6]. But if third bit of this index was 1 previously ;won't it now change to 0?& Then we should not get the desired result..Please explain me where I am wrong.
    Thanks in advance.

    ReplyDelete
    Replies
    1. Hello there,
      First of all, 1 << 4 will make 5th bit 1. Secondly, if you notice, we are doing an bitwise OR operation. x | 0 = x , x | 1 = 1. So, other bits will remain the same, only the one position that has a 1 in the mask will be set in x[n>>6].
      chk and set macros are same, one checks if a specific bit is set using & (but it does not modify the actual value) where set macro uses |= which is the short form of OR and then assign, modifies the actual value.
      Let me know if this is clear enough.

      Delete
  23. will this same code work for larger b and b-a, if I change the value of MAX,LMT,LEN and RNG as required ?

    ReplyDelete
  24. Hi Zobayer :)
    You blog is just awesome.

    But here I suspect a bug inside your code. Consider the following-
    1. Suppose MAX is 63 then, 63/64 = 0 is the array size, which is not acceptable i think
    2. if MAX is 64 then, 64/64 = 1 is the array size. But to mark the primality of 64 we need to access index base[1], but its not possible with array size 1

    It should be base[MAX / 64 + 1] i think?
    Am I wrong?

    Thanks for your nice writing. Keep up good work (y)

    ReplyDelete
    Replies
    1. You are absolutely correct. And it is always safe to have some more free indices in arrays.
      And thanks for pointing it out. :)

      Delete
  25. This code worked for me..

    #include
    #include
    using namespace std;

    int main() {
    // your code goes here

    int t,a,b,flag,j,i;
    cin>>t;
    while(t!=0)
    {
    cin>>a>>b;
    for(i=a;i<=b;i++)
    {
    flag = 0;
    for(j=2;j<=sqrt(i);j++) // code to test, if i is a prime number or not
    {
    if(i%j == 0) {
    flag =1;
    break;
    }
    }
    if(flag == 0 && i != 1) cout<<i<<endl;
    flag = 0;
    }
    t--;
    cout<<endl;
    }


    return 0;
    }

    ReplyDelete
    Replies
    1. Maybe, but its a bad code from so many aspects. For example you wont call sqrt in a loop....

      Delete
  26. Hi! I was wondering how you calculated value of LEN.

    ReplyDelete
    Replies
    1. Set it to a larger value and then see how many primes actually are there. And then modify the length after that.

      Delete
  27. segmented_sieve(1,2)
    gives runtime error

    ReplyDelete