I, ME AND MYSELF !!!: graph

Showing posts with label graph. Show all posts

Wednesday, January 1, 2014

SPOJ PRIMIT

Problem link: SPOJ Problem Set (classical): 211. Primitivus recurencis

Given a list of directed edges or features, we have to find the smallest possible sequence of nodes where all the given edges are present without violating the direction.

Although at first this may seem to be a bipartite matching problem or problem related to strongly connected component. But I am not sure whether this particular problem can be solved using any of those techniques. Here we have two things to observe.

1 ⇒ Ordering for each weakly connected component is independent. So, we can add up the results for each weakly connected component as we can arrange them in any order, because they don't have any node in common.

2 ⇒ The minimum length will be the length of eulerian circuit for each component. We can break the circuit at suitable places to make it a chain. Considering [1], we can break the chain anywhere we want. Because other components have nothing to do with it.

So, for each weakly connected component, it boils down to finding minimum number of edges need to be added to make the component an euler circuit. For each component G_i, we can find the minimum number of additional edges by the following formula (Note, the formula will also handle the components which are already an euler circuit).

Number of additional edges = ∑max( cnt(G_i) / 2, 1 ) for all i

Note: if you already have an euler circuit, i.e. cnt(G_i) = 0, then you still need 1 additional term in the sequence. The simplest example is two edges, 1-2 and 2-1. You will need either 1-2-1, or 2-1-2 sequence.

And to find the value of cnt() for a component G_i, we just add the absolute difference of in and out degree count for each node in that component.

cnt(G_i) = ∑|d⁺(u) - d^-(u)| for each u ∈ G_i
Here, d⁺(u) ⇒ indegree count for u, and d^-(u) ⇒ outdegree count for u.

So, the overall solution outline is something like this, first use disjoint set data structure, or commonly known as union-find to mark the components. You can also use simple DFS / BFS as well. Ofcourse we are ignoring direction in this stage as we want to generate weakly connected components. Then for each node, calculate total indegree and outdegree counts. Now for each component, use the above formula to find additional edges x_add, thus the final answer is N + x_add, here x is the given number of edges.

Friday, December 21, 2012

USACO 3.3 fence

Section 3.3: Riding the Fences

This is just another simple Euler path problem. Given a undirected unweighted graph, which contains multiple edges, you have to find the lexicographically smallest Euler path.

I don't know what the problem setter was thinking, but he made a big mistake. Problem statement says "Your program must output the path of intersections that, if interpreted as a base 500 number, would have the smallest magnitude." where nodes are numbered from 1 to 500. Clearly, 500 is not a base 500 digit. I made several wrong submissions for trying to do some foolish tricks to get this done, then I just did a lexicographic ordering, and it passed. Pretty funny!

Problem statement ensures that there is no input for which Euler path does not exist. This makes the solution easier, just check if all the degrees are even, if so, start from the smallest numbered node, otherwise start from smaller numbered odd degree node.

algorithm for finding euler path:
graph[u][v] contains number of edges between u and v

find( node u ):
    for each adjacent v of u in lexicographical order:
        while( graph[u][v] > 0 ):
            graph[u][v]--;
            graph[v][u]--;
            find(v);
    stack.push(u);

Now the stack contains the euler path in reverse order, so just keep popping!

Friday, June 4, 2010

Euler Tour

A famous drawing problem for kids, "You are given a picture, can you draw it without lifting up your pen?"... Well, in graph theory, we can determine this by checking the Euler Tour in the graph.

Actually this is about Euler Circuits and Eluer Paths. We know the conditions for an undirected graph, and we can extend it for directed graphs as well.

Undirected Graph

An undirected graph will have Eulerian tour ( either path or circuit ) if the following conditions hold:

The graph is connected.
Each node has even degree, or exactly two nodes have an odd degree.

Directed Graph

A directed graph will have Eulerian tour ( either path or circuit ) if the following conditions hold:

The undirected representation of the graph is connected.
The difference between indegree and outdegree of each node is at most 1.
Each node has equal indegree and outdegree, or, there is exactly two nodes which has different indegree and outdegree, and exactly one of them has indegree - outdegree = 1 and the other has outdegree - indegree = 1.

So, we can do this easily with the help of a bfs subroutine.
You might also like to read this and this.

Friday, May 28, 2010

MaxFlow :: Dinitz Algorithm

Here is a nice implementation of Dinitz blocking flow algorithm in C++ (with special thanks to Fahim vai). Works in undirected large graph containing multiple edges and self loops as well. No STL used. This implementation is pretty fast.

Here, input is, number of nodes 2≤n≤5000, number of input edges 0≤e≤30000, then e undirected edges in the form (u, v, cap) (1≤u,v≤n and 1≤cap≤10^9). Source and Sink are assumed 1 and n accordingly, can be changed in the init() function call.


#define SET(p) memset(p, -1, sizeof(p))
#define CLR(p) memset(p, 0, sizeof(p))
#define i64 long long

const int INF = 0x7fffffff;
const int MAXN = 5005, MAXE = 60006;

int src, snk, nNode, nEdge;
int Q[MAXN], fin[MAXN], pro[MAXN], dist[MAXN];
int flow[MAXE], cap[MAXE], next[MAXE], to[MAXE];

inline void init(int _src, int _snk, int _n) {
    src = _src, snk = _snk, nNode = _n, nEdge = 0;
    SET(fin);
}

inline void add(int u, int v, int c) {
    to[nEdge] = v, cap[nEdge] = c, flow[nEdge] = 0, next[nEdge] = fin[u], fin[u] = nEdge++;
    to[nEdge] = u, cap[nEdge] = c, flow[nEdge] = 0, next[nEdge] = fin[v], fin[v] = nEdge++;
}

bool bfs() {
    int st, en, i, u, v;
    SET(dist);
    dist[src] = st = en = 0;
    Q[en++] = src;
    while(st < en) {
        u = Q[st++];
        for(i=fin[u]; i>=0; i=next[i]) {
            v = to[i];
            if(flow[i] < cap[i] && dist[v]==-1) {
                dist[v] = dist[u]+1;
                Q[en++] = v;
            }
        }
    }
    return dist[snk]!=-1;
}

int dfs(int u, int fl) {
    if(u==snk) return fl;
    for(int &e=pro[u], v, df; e>=0; e=next[e]) {
        v = to[e];
        if(flow[e] < cap[e] && dist[v]==dist[u]+1) {
            df = dfs(v, min(cap[e]-flow[e], fl));
            if(df>0) {
                flow[e] += df;
                flow[e^1] -= df;
                return df;
            }
        }
    }
    return 0;
}

i64 dinitz() {
    i64 ret = 0;
    int df;
    while(bfs()) {
        for(int i=1; i<=nNode; i++) pro[i] = fin[i];
        while(true) {
            df = dfs(src, INF);
            if(df) ret += (i64)df;
            else break;
        }
    }
    return ret;
}

int main() {
    int n, e, u, v, c, i;
    scanf("%d%d", &n, &e);
    init(1, n, n);
    for(i=0; i<e; i++) {
        scanf("%d%d%d", &u, &v, &c);
        if(u!=v) add(u, v, c);
    }
    printf("%lld\n", dinitz());
    return 0;
}

Using adjacency matrix and/or STL makes it 10 to 4 times slower.

Saturday, May 22, 2010

Maximum Matching (DFS)

Maximum bipartite matching in a bipartite graph

Although Hopcroft Karp is faster and smarter, this one is pretty simple to code specially in contest time and when the graph is relatively smaller. It uses a DFS subroutine to cut and establish matching and thus produces a maximum matching. This version of BPM takes the adjacency list of left side of the bipartite graph and updates the Left[] and Right[] arrays with their respective matches.

Here, in the DFS subroutine, there are two for loops, where, the first one checks for yet unestablished connections, and the second one is the recursive DFS step. These two steps could be written in a single loop and the condition modified as "( Right[v]==-1 || dfs(Right[v]) )", actually separating them increases performance by some factors, because, first time it checks all the unmatched nodes before going into DFS.

A sample C++ implementation:


#define SET(x) memset(x, -1, sizeof(x))
#define CLR(x) memset(x, 0, sizeof(x))
#define MAX 100

vector < int > edges[MAX];
bool visited[MAX];
int Left[MAX], Right[MAX];

bool dfs(int u) {
    if(visited[u]) return false;
    visited[u] = true;
    int len = edges[u].size(), i, v;
    for(i=0; i<len; i++) {
        v = edges[u][i];
        if(Right[v]==-1) {
            Right[v] = u, Left[u] = v;
            return true;
        }
    }
    for(i=0; i<len; i++) {
        v = edges[u][i];
        if(dfs(Right[v])) {
            Right[v] = u, Left[u] = v;
            return true;
        }
    }
    return false;
}

int match() {
    SET(Left);
    SET(Right);
    int i, ret = 0;
    bool done;
    do {
        done = true;
        CLR(visited);
        for(i=0; i<MAX; i++) {
            if(Left[i]==-1 && dfs(i)) {
                done = false;
            }
        }
    } while(!done);
    for(i=0; i<MAX; i++) ret += (Left[i]!=-1);
    return ret;
}

Notes: Here, edges[MAX] is the left side adjacency list, implemented with vector, Left[MAX] and Right[MAX] holds the matching and the procedure match() returns maximum matching. Pretty straight forward.

Friday, May 7, 2010

Maximum Matching (Hopcroft)

Hopcroft-Karp is one of the fastest algorithm that finds the maximum cardinality matching on a bipartite graph. It has the best known worst case time complexity. More details can be found here [courtesy of Wikipedia].

C++ Source Code:


#define MAX 100001
#define NIL 0
#define INF (1<<28)

vector< int > G[MAX];
int n, m, match[MAX], dist[MAX];
// n: number of nodes on left side, nodes are numbered 1 to n
// m: number of nodes on right side, nodes are numbered n+1 to n+m
// G = NIL[0] ∪ G1[G[1---n]] ∪ G2[G[n+1---n+m]]

bool bfs() {
    int i, u, v, len;
    queue< int > Q;
    for(i=1; i<=n; i++) {
        if(match[i]==NIL) {
            dist[i] = 0;
            Q.push(i);
        }
        else dist[i] = INF;
    }
    dist[NIL] = INF;
    while(!Q.empty()) {
        u = Q.front(); Q.pop();
        if(u!=NIL) {
            len = G[u].size();
            for(i=0; i<len; i++) {
                v = G[u][i];
                if(dist[match[v]]==INF) {
                    dist[match[v]] = dist[u] + 1;
                    Q.push(match[v]);
                }
            }
        }
    }
    return (dist[NIL]!=INF);
}

bool dfs(int u) {
    int i, v, len;
    if(u!=NIL) {
        len = G[u].size();
        for(i=0; i<len; i++) {
            v = G[u][i];
            if(dist[match[v]]==dist[u]+1) {
                if(dfs(match[v])) {
                    match[v] = u;
                    match[u] = v;
                    return true;
                }
            }
        }
        dist[u] = INF;
        return false;
    }
    return true;
}

int hopcroft_karp() {
    int matching = 0, i;
    // match[] is assumed NIL for all vertex in G
    while(bfs())
        for(i=1; i<=n; i++)
            if(match[i]==NIL && dfs(i))
                matching++;
    return matching;
}

The implementation is quite straight forward as the algorithm on Wikipedia page. I am looking for some optimizations.

Monday, March 15, 2010

Testing Bipartite Graph

We can use BFS to check whether an undirected graph is bipartite or not, with only a little modification...

We define bipartite graph as follows: A bipartite graph is an undirected graph G = (V, E) in which V can be partitioned into two sets V1 and V2 such that (u, v) ∈ E implies either u ∈ V1 and v ∈ V2 or u ∈ V2 and v ∈ V1. That is, all edges go between the two sets V1 and V2.

In other to determine if a graph G = (V, E) is bipartite, we perform a BFS on it with a little modification such that whenever the BFS is at a vertex u and encounters a vertex v that is already 'gray', i.e. visited, our modified BSF should check to see if the depth of both u and v are even, or if they are both odd. If either of these conditions holds which implies d[u] and d[v] have the same parity, then the graph is not bipartite. Note that this modification does not change the running time of BFS and remains Θ(V + E).

Here goes a simple implementation of the above idea. We'll just run a BFS, while tracking the partition in which every node falls. If, when visiting node v from node u, and they are members of same partition, then this graph is not bipartite. Well, this time, I think, we have nothing to do with the edge weights or costs, so it will also work on a weighted graph.

Formally, to check if the given graph is bipartite, the algorithm traverse the graph labelling the vertices 0 or 1 / 2 , corresponding to unvisited or visited, and partition 1 or partition 2 depending on which set the nodes are in. If an edge is detected between two vertices in the same partition, the algorithm returns false immediately.

Note: In this implementation, only one test at a time is possible, to run it for multiple cases, definitely cares must be taken about resetting the arrays and other values as necessary.

#include <cstdio>
#include <vector>
#include <queue>
using namespace std;

#define MAX 1001

int n, e;
int partition[MAX], visited[MAX];
vector< int > G[MAX];

bool is_bipartite() {
    int i, u, v, start;
    queue< int > Q;
    start = 1; // nodes labeled from 1
    Q.push(start);
    partition[start] = 1; // 1 left, 2 right
    visited[start] = 1; // set gray
    while(!Q.empty()) {
        u = Q.front(); Q.pop();
        for(i=0; i < G[u].size(); i++) {
            v = G[u][i];
            if(partition[u] == partition[v]) return false;
            if(visited[v] == 0) {
                visited[v] = 1;
                partition[v] = 3 - partition[u]; // alter 1 and 2
                Q.push(v);
            }
        }
    }
    return true;
}

int main() {
    int i, u, v;
    scanf("%d %d", &n, &e);
    for(i = 0; i < e; i++) {
        scanf("%d %d", &u, &v);
        G[u].push_back(v);
        G[v].push_back(u);
    }
    if(is_bipartite()) printf("Yes\n");
    else printf("No\n");
    return 0;
}

As this algorithm traverses the graph it labels the vertices with a partition number consisted with the graph being bipartite. If at any vertex, algorithm detects an inconsistency, it shows with an invalid return value. Partition value of u will always be a valid number as it was en-queued at some point and its partition was assigned at that point. Partition of v will be unchanged if it is already set, otherwise it will be set to a value opposite to that of vertex u. So, if the algorithm returns true, it means, the graph is bipartite, i.e. bi-color-able, and partition[] holds the colors for each vertex.

Have fun, and let me know about errors, if any. [Note: The definition of bipartite graph is taken from this wonderful site]

Wednesday, January 6, 2010

Kruskal's Algorithm in C++

Minimum Spanning Tree, Kruskal's Algorithm

Kruskal's algorithm is an algorithm in graph theory that finds a minimum spanning tree for a connected weighted undirected graph. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized. If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component). This algorithm is based on greedy approach.

Performance

This algorithm needs to sort the edges and uses a disjoint set data structure to keep track which vertex is in which component. We know the best comparison sorting is O(e*lg(e)), i.e. the merge sort or quick sort, where e is the number of edges, and the set operations can be implemented such a way that they are almost constant. The algorithm itself is linear to the number of edges e. So the total complexity can be achieved is O(e*lg(e)). Also note that, e can be max v*v (when it is a complete graph).

Algorithm

Do some study and paper-work before you proceed:
1. the algorithm and analysis
2. another good pseudo-code
3. read in wikipedia
4. text: Introduction to Algorithms (CLRS, MIT press), Chapter 23.

Implementation

Here is a short and in general implementation in C++ using the STL library. This solves for one graph, if you need it for multiple graphs inputs, don't forget to reset the vectors and arrays appropriately.

Input

N, E // number of nodes and edges.
E edges containing u, v, w; where, the edge is (u, v) and edge weight is w.

C++ code


#include <cstdio>
#include <vector>
#include <algorithm>
using namespace std;

#define edge pair< int, int >
#define MAX 1001

// ( w (u, v) ) format
vector< pair< int, edge > > GRAPH, MST;
int parent[MAX], total, N, E;

int findset(int x, int *parent)
{
    if(x != parent[x])
        parent[x] = findset(parent[x], parent);
    return parent[x];
}

void kruskal()
{
    int i, pu, pv;
    sort(GRAPH.begin(), GRAPH.end()); // increasing weight
    for(i=total=0; i<E; i++)
    {
        pu = findset(GRAPH[i].second.first, parent);
        pv = findset(GRAPH[i].second.second, parent);
        if(pu != pv)
        {
            MST.push_back(GRAPH[i]); // add to tree
            total += GRAPH[i].first; // add edge cost
            parent[pu] = parent[pv]; // link
        }
    }
}

void reset()
{
    // reset appropriate variables here
    // like MST.clear(), GRAPH.clear(); etc etc.
    for(int i=1; i<=N; i++) parent[i] = i;
}

void print()
{
    int i, sz;
    // this is just style...
    sz = MST.size();
    for(i=0; i<sz; i++)
    {
        printf("( %d", MST[i].second.first);
        printf(", %d )", MST[i].second.second);
        printf(": %d\n", MST[i].first);
    }
    printf("Minimum cost: %d\n", total);
}

int main()
{
    int i, u, v, w;

    scanf("%d %d", &N, &E);
    reset();
    for(i=0; i<E; i++)
    {
        scanf("%d %d %d", &u, &v, &w);
        GRAPH.push_back(pair< int, edge >(w, edge(u, v)));
    }
    kruskal(); // runs kruskal and construct MST vector
    print(); // prints MST edges and weights

    return 0;
}

Have fun and please notify for any bug...

Thursday, December 31, 2009

Dijkstra's Algorithm in C++

Dijkstra's Algorithm

Dijkstra's algorithm is a single source shortest path (sssp) algorithm. Like BFS, this famous graph searching algorithm is widely used in programming and problem solving, generally used to determine shortest tour in a weighted graph. This algorithm is almost similar to standard BFS, but instead of using a Queue data structure, it uses a heap like data structure or a priority queue to maintain the weight order of nodes. Here is the algorithm and pseudo-code. You can also look into Introduction to Algorithms (by C.L.R.S).

Here is a short implementation of this algorithm in C++, I assumed that, all the edge-weights are positive, and the max possible distance is less than 2²⁰ which is set as INF. The nodes are marked from 1 to N inclusive where N is the number of nodes.

Input format:

N E // number of nodes and edges
E lines containing an edge (u, v, w) on each line // edge(u, v, w) means weight of edge u-v is w. Nodes u, v are within range 1 to N
S // starting node

C++ code:

#include <cstdio>
#include <queue>
#include <vector>
using namespace std;

#define MAX 100001
#define INF (1<<20)
#define pii pair< int, int >
#define pb(x) push_back(x)

struct comp {
    bool operator() (const pii &a, const pii &b) {
        return a.second > b.second;
    }
};

priority_queue< pii, vector< pii >, comp > Q;
vector< pii > G[MAX];
int D[MAX];
bool F[MAX];

int main() {
    int i, u, v, w, sz, nodes, edges, starting;

    // create graph
    scanf("%d %d", &nodes, &edges);
    for(i=0; i<edges; i++) {
        scanf("%d %d %d", &u, &v, &w);
        G[u].pb(pii(v, w));
        G[v].pb(pii(u, w)); // for undirected
    }
    scanf("%d", &starting);

    // initialize distance vector
    for(i=1; i<=nodes; i++) D[i] = INF;
    D[starting] = 0;
    Q.push(pii(starting, 0));

    // dijkstra
    while(!Q.empty()) {
        u = Q.top().first;
        Q.pop();
        if(F[u]) continue;
        sz = G[u].size();
        for(i=0; i<sz; i++) {
            v = G[u][i].first;
            w = G[u][i].second;
            if(!F[v] && D[u]+w < D[v]) {
                D[v] = D[u] + w;
                Q.push(pii(v, D[v]));
            }
        }
        F[u] = 1; // done with u
    }

    // result
    for(i=1; i<=nodes; i++) printf("Node %d, min weight = %d\n", i, D[i]);
    return 0;
}

This is a straight forward implementation, according to the problem solving purpose, changes are to be made here and there.

Here is another implementation that does not use a custom comparison structure and the code is a bit more re-usable too. Note, that, nodes are numbered from 1 to n, pair for priority queue elements are assumed to be (weight, node) where pair for graph edges are assumed to be (node, weight). Also I have added comments to make the code more readable. Visit this codepad url for uncommented version of the code.

#include <cstring>
#include <cstdio>
#include <vector>
#include <queue>
using namespace std;

typedef pair< int, int > pii;

/*
Set MAX according to the number of nodes in the graph. Remember,
nodes are numbered from 1 to N. Set INF according to what is the
maximum possible shortest path length going to be in the graph.
This value should match with the default values for d[] array.
*/
const int MAX = 1024;
const int INF = 0x3f3f3f3f;

/*
pair object for graph is assumed to be (node, weight). d[] array
holds the shortest path from the source. It contains INF if not
reachable from the source.
*/
vector< pii > G[MAX];
int d[MAX];

/*
The dijkstra routine. You can send a target node too along with
the start node.
*/
void dijkstra(int start) {
    int u, v, i, c, w;

    /*
    Instead of a custom comparator struct or class, we can use
    the default comparator class greater<T> defined in quque.h
    */
    priority_queue< pii, vector< pii >, greater< pii > > Q;

    /*
    Reset the distance array and set INF as initial value. The
    source node will have weight 0. We push (0, start) in the
    priority queue as well that denotes start node has 0 weight.
    */
    memset(d, 0x3f, sizeof d);
    Q.push(pii(0, start));
    d[start] = 0;

    /*
    As long as queue is not empty, check each adjacent node of u
    */
    while(!Q.empty()) {
        u = Q.top().second; // node
        c = Q.top().first; // node cost so far
        Q.pop(); // remove the top item.

        /*
        We have discarded the visit array as we do not need it.
        If d[u] has already a better value than the currently
        popped node from queue, discard the operation on this node.
        */
        if(d[u] < c) continue;

        /*
        In case you have a target node, check if u == target node.
        If yes you can early return d[u] at this point.
        */

        /*
        Traverse the adjacent nodes of u. Remember, for the graph,,
        the pair is assumed to be (node, weight). Can be done as
        you like of course.
        */
        for(i = 0; i < G[u].size(); i++) {
            v = G[u][i].first; // node
            w = G[u][i].second; // edge weight

            /*
            Relax only if it improves the already computed shortest
            path weight.
            */
            if(d[v] > d[u] + w) {
                d[v] = d[u] + w;
                Q.push(pii(d[v], v));
            }
        }
    }
}

int main() {
    int n, e, i, u, v, w, start;
    /*
    Read a graph with n nodes and e edges.
    */
    while(scanf("%d %d", &n, &e) == 2) {

        /*
        Reset the graph
        */
        for(i = 1; i <= n; i++) G[i].clear();

        /*
        Read all the edges. u to v with cost w
        */
        for(i = 0; i < e; i++) {
            scanf("%d %d %d", &u, &v, &w);
            G[u].push_back(pii(v, w));
            G[v].push_back(pii(u, w)); // only if bi-directional
        }

        /*
        For a start node call dijkstra.
        */
        scanf("%d", &start);

        dijkstra(start);

        /*
        Output the shortest paths from the start node.
        */
        printf("Shortest path from node %d:\n", start);
        for(i = 1; i <= n; i++) {
            if(i == start) continue;
            if(d[i] >= INF) printf("\t to node %d: unreachable\n", i);
            else printf("\t to node %d: %d\n", i, d[i]);
        }
    }
    return 0;
}

Dijkstra is a single source shortest path problem. So if you want to see the shortest path from every node, you have to call dijkstra once for each source or starting node.