Don’t do srand(time(0)) while testing!

This is my smallest post, but extremely important. In C or C++, before generating random numbers we often end up doing

srand(time(0))

So every time you run the program, you get new set of random numbers. Do you see the problem?

The problem is that the output is not reproducible. This is a big issue if you want to debug a particular test case.

But for some purposes, you just don’t need to reproduce output, in that case go ahead and use it. In some of my (blog) programs I am guilty of using srand(time(0)), please disregard it.

Posted in Programming | Tagged | 2 Comments

Some important GCC flags

GCC is a well known compiler for a number of languages. Here I will list and describe some important flags used with GCC, relevant for C and C++.

-o

This is used to set the output file. Notice that it is small case ‘o’, not zero or upper case ‘O’. If the flag is absent, the output file is a.out 

gcc -o myprog mycode.c
g++ -o myprog mycode.cpp

-g

This is used to produce debugging information with output. This information can be used by debuggers like gdb, for debugging the program. I will advice against using this flag with optimization flag (below).

gcc -g mycode.c
g++ -g mycode.cpp

-Wall

This is a highly recommended flag for all programs. It turns on all optional warnings. Such as “unused variable” warning. Needless to say you can ignore warnings, but a good programmer would want to know all warnings raised and fix them.

gcc -Wall mycode.c
g++ -Wall mycode.cpp

-O

This is the most important flag. It controls optimization. While it can’t convert O(n2) solution to O(n) ;) it certainly can reduce running time drastically. See time comparison below. Notice that it is capital ‘O’, not zero. There are 3 levels (or rather 4) of optimization-

  1. -O1 : This tries to reduce execution time and code size. Thus it takes more time than usual. One example of optimization – transform conditional jump into branch-less equivalent.
  2. -O2 : This optimizes even more than -O1. Hence even better performance and more compilation time. For most purposes I use this flag for optimization.
  3. -O3 : Optimizes even more than -O2.
  4. -Os : This is little different, in that it optimizes for code size instead of execution time. But smaller code size also helps to reduce execution time.
gcc -O1 mycode.c
g++ -O1 mycode.cpp

I compiled the program to find prime numbers (using Sieve of Eratosthenes) with different optimization levels, here are various execution times and executable size-
Without optimization : 1.835 sec, 25.2 KB
-O1 : 0.489 sec, 13.7 KB
-O2 : 0.472 sec, 13.7 KB
-O3 : 0.460 sec, 17.5 KB
-Os : 0.573 sec, 13.9 KB

Now I warn you, don’t jump to any conclusion from these stats, they are just for showing that these flags are not useless :P

-std=

Used to set the language standard. Both C and C++ have several standards, like C++11, C++98, C99, C89. This flag is absolutely necessary if you want to use C++11 features, as its features are experimental currently in gcc and not available by default.

gcc -std=c99 mycode.c
g++ -std=c++11 mycode.cpp

Below flags are less important, but still useful to know.

-D

Sometimes one needs to add #ifdef/#ifndef gaurds in code. For this I find -D flag extremely useful. It pre-defines a macro with definition 1. See example-

gcc -D DEBUG mycode.c
g++ -D DEBUG mycode.cpp

Above flag is equivalent to adding following line in C/C++ code

#define DEBUG 1

-pg

If you want to profile your code using gprof, you need to pass this flag.

gcc -pg mycode.c
g++ -pg mycode.cpp

These are not all flags, there are many many more. But I have covered most important ones. And I will add, if I find some other important flag. Enjoy!

Further reading : man page of gcc, optimize options

Posted in Programming | Tagged , | Leave a comment

Edit Distance using Dynamic Programming

Edit Distance is quite a interesting and popular problem. Here I present a very efficient bottom up C++ program to solve it.

Problem – We are given 2 strings. We have to find the “edit distance” or the cost of converting one string to other. We are allowed 3 operations – Insert, Delete, Replace. All 3 have equal cost that is 1 unit.

Example – Edit distance between “abc” and “abd” is 1. We replaced “c” with “d”. You can also see this as “Delete c”, then “Insert d” which would give a edit distance of 2. But since we try to minimize edit distance, we consider it as single operation that is Replace. Few more examples-
EditDistance(“ab”, “bc”) = 2
EditDistance(“man”, “woman”) = 2
EditDistance(“c”, “java”) = 4
EditDistance(“abcd”, “acd”) = 2

One of the above 4 examples has wrong answer! Tell me in comments which one is wrong,  and what is the correct answer ;)

We will use bottom up version of dynamic programming, as it is much cleaner and efficient (than top down). If the strings have length N and M, we will need a table of size (N+1)*(M+1). First we need to initialize the top row i.e TABLE[0][i] = i and first column i.e TABLE[i][0] = i. These are the base cases in which one of the string is empty. Now filling the rest of the TABLE column wise is simple-

TABLE[i][j] = min(
TABLE[i][j-1]+1,
TABLE[i-1][j]+1,
TABLE[i-1][j-1]+cost,
)
Here cost = 0 if STR1[i] = STR2[j], 1 if unequal

And the C++ function-

int EditDistance(string word1, string word2)
{
    int i, j, l1, l2, m;
    l1 = word1.length();
    l2 = word2.length();
    vector< vector<int> > t(l1 + 1, vector<int>(l2 + 1));

    for (i = 0; i <= l1; i++)
        t[i][0] = i;
    for (i = 1; i <= l2; i++)
        t[0][i] = i;

    for (i = 1; i <= l1; i++)
    {
        for (j = 1; j <= l2; j++)
        {
            m = min(t[i-1][j], t[i][j-1]) + 1;
            t[i][j] = min(m, t[i-1][j-1] + (word1[i-1] == word2[j-1] ? 0 : 1));
        }
    }
    return t[l1][l2];
}

The TABLE for “money” and “monkey” looks like->

TABLE

1 is the final answer.

Posted in Algorithm, Programming | Tagged , | 3 Comments

Merge Sort program in C

It is rather amazing, that many programmers are unable to write ‘Merge Sort’ correctly. With its guarantee of O(n log n) time complexity, it is a dependable sorting algorithm. Also it can be used to count number of inversions in a array of integers. Most importantly it is very easy to code! :)

So here is my implementation, of merge sort in C language -

/* 
 *We need 2 arrays 'a' and 'b', of equal size
 *Here 'a' is the primary array (i.e which contains initial 
     input, and final output)
 *And 'b' is the temporary array,
     used for merging 2 sorted half's in 'a' 
 */
 
void merge(int a[], int low, int mid, int high)
{
    int b[10000];
    int i = low, j = mid + 1, k = 0;
 
    while (i <= mid && j <= high) {
        if (a[i] <= a[j])
            b[k++] = a[i++];
        else
            b[k++] = a[j++];
    }
    while (i <= mid)
        b[k++] = a[i++];
 
    while (j <= high)
        b[k++] = a[j++];
 
    k--;
    while (k >= 0) {
        a[low + k] = b[k];
        k--;
    }
}
 
void mergesort(int a[], int low, int high)
{
    if (low < high) {
        int m = (high + low)/2;
        mergesort(a, low, m);
        mergesort(a, m + 1, high);
        merge(a, low, m, high);
    }
}
Posted in Programming | Tagged , , | 3 Comments

Fastest way to compute Fibonacci Number

While we all know that matrix exponentiation is asymptotically fastest way to compute NthFibonacci number, there is a still faster way-

If we know F(k) and F(k+1), then we can find

F(2k) = F(k)[2F(k+1) - F(k)]
F(2k+1) = F(k+1)2 + F(k)2

This is fast doubling. Here is my python implementation-

#Fast Fibonnaci
def fib(n):
    if n <= 2:
        return 1
    k = n/2
    a = fib(k + 1)
    b = fib(k)
    if n % 2 == 1:
        return a*a + b*b
    else:
        return b*(2*a - b)

if __name__ == "__main__":
    n = int(raw_input("Enter n(0 to quit) : "))
    while n:
        print fib(n)
        n = int(raw_input("Enter n(0 to quit) : "))

Note: I have taken the sequence as 1, 1, 2, 3… However some people consider the sequence as 0, 1, 1, 2, 3…

Posted in Programming | Tagged | 1 Comment

Difference between Debug build and Release build

Cool, after days and days of debugging, I’m ready to release my work!

Yeah nice, but what if your ‘release build’ crashes? Well read on, to see what might have changed.

Before knowing the differences between Debug and Release build, you should know that they are really different. I learned it the hard way. Just today, I wanted to show someone my half cooked software, so I quickly told my Visual Studio 2010 to make a release build for a demo. And I was really going to upload it on sourceforge, for anyone and everyone to download. Sweet, my system hanged.

Imagine what would have happened had it not crashed on my system (crashes are random, at the worst they don’t happen during testing), but on users computer. Uh oh, disaster.

So lets see what changes when you change from Debug configuration of build to Release build.

ImageAlso note that, there can be many more configurations, like “Debug DLL”, “Universal Release” etc. Its just how you configure it, and what name you give to your configuration.

1. Optimization

This is the main reason you package your final software with “Release” build. While debugging there are debugging symbols, extra lines of code, and the IDE’s book keeping attached with your software. So naturally your debug build is slower and heavier by several orders. For instance my debug executable was 6.15 MB while release executable is only 2.74 MB. See how it has doubled.
In fact there is an option in IDE to turn off optimization, as optimization by compiler, modifies code in unpredictable manner, something which can make debugging hard or even impossible.

Debug configuration has Optimization turned off

2. The protective cover is gone

If you are facing the trouble, “software crashes in release, while in debug it was running fine”, your trouble spot is most probably this one.
In debug mode, the compiler is gentle on you and provides a protection against your um.. erroneous code, by

  • allocating extra memory before and after, each block of storage. And this extra space is initialized with values like 0 for integers. Thus your attempts to access beyond array size won’t let you corrupt some other data structure.
  • it also initializes variables to some safe value. Like the pointers are initialized with NULL value, or some ridiculous value that will cause segfault, if you try to access it. Thus the problem of wild pointers would not appear in debug, but cause troubles in release build.

So if your release build is crashing, you should first check for out of bound access and uninitialized variables.

3. Use Release Library for Release build and Debug Library for Debug build

Well, this is where I faulted. I was using debug library, of framework my program was dependent upon for Linking. While configuring project settings, I made these settings default, and thus they were used for both Debug and Release build. As long as I was debugging, everything went fine, but my release build, well crashed…

Most third party libraries support different builds, like Release or Debug, Static or Dynamic and so on. So while, you are developing code, and will be debugging actively, it is appropriate to have debug libraries linked to your project. This will aid debugging as, the library will provide much more detailed messages for warnings and errors.
But in case of release, only use Release version of library for linking. Release build of libraries are optimized, and fail silently in case of error (you don’t want to show your users a scary message, right?!).

4. Test your program on various systems

Another major difference in debug and release, is that while debug build was playing in small backyard (your computer), the release build will be playing in open fields (users computer, or even smartphones, tablets etc.). You spend extra time, configuring your environment to ensure, your program runs. But on users system, you can’t do it. Can you?

You should test your program on different types of computers, and different operating systems, if your program is multi-platform. On some computers you find that there is some DLL missing. You may discover that your Swing application, looking beautiful on Ubuntu, looks messed up on Windows. You may notice that your software takes ages to open up, or that your game is running to fast, on some computers.

All this just increases your work. You may have to go back, add code or files to ensure that everything goes fine. But this extra effort just increases the usability of your software, which is very important.

That’s all for the differences. Best of Luck for your release build!

Posted in Programming | Tagged , , , , | 4 Comments

Starting Apache Derby Embedded in JAR Programmaticaly

In my last post Packaging Java application with Apache Derby as JAR Executable using Eclipse, the code that I used does not start the server at port 1527, by itself, when the JAR is launched. So I have added a few lines to accomplish this.

server = new NetworkServerControl(InetAddress.getByName("localhost"), 1527);
server.start(null);

(Entire code below)

One more thing that I missed out was, that on every users computer the Database is supposed to be created initially. Also the table, has to be created, if it does not exists. So the first time, a user launches JAR (the simple app we made) the database has to created as well as all the required table should be created and populated with content. (Do, read important points below)

So here is the entire code, which

  1. Starts the server
  2. Creates database
  3. Creates table
  4. Inserts values
  5. Queries the table
  6. Closes connection, and shuts down server
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.swing.JOptionPane;
import org.apache.derby.drda.NetworkServerControl;


public class Test {
	
	public static void main(String[] args) {

	    NetworkServerControl server = null;
            Connection conn = null;
	    PreparedStatement prestat = null;
	    Statement stmt = null;
	    ResultSet result = null;

	    try {
			server = new NetworkServerControl(InetAddress.getByName("localhost"), 1527);
			server.start(null);
		} catch (UnknownHostException e1) {
			e1.printStackTrace();
		} catch (Exception e1) {
			e1.printStackTrace();
		}
	    
	    try {
	        Class.forName("org.apache.derby.jdbc.EmbeddedDriver").newInstance();
	    } catch (InstantiationException e) {
	        e.printStackTrace();
	    } catch (IllegalAccessException e) {
	        e.printStackTrace();
	    } catch (ClassNotFoundException e) {
	        e.printStackTrace();
	    }
	    try {
	        conn = DriverManager.getConnection("jdbc:derby://localhost:1527/MyDbTest;create=true");
	        
	        stmt = conn.createStatement();
	        try {
	        	stmt.execute("CREATE TABLE newTestTable(id int primary key, name varchar(20))");
	        } catch (Exception e) {
	        	//e.printStackTrace();
	        }
	        try {
	        	stmt.execute("INSERT INTO newTestTable VALUES(10, 'Hey,'), (20, 'Look I changed'), (30, 'The code!')");
	        } catch (Exception e) {
	        	
	        }
	        	        
	        prestat = conn.prepareStatement("SELECT * FROM newTestTable");
	        result = prestat.executeQuery();
	        
	        StringBuilder builder = new StringBuilder();
	        while (result.next())
	        {
	        	builder.append(result.getInt(1) + ", " + result.getString(2));
	        	builder.append('\n');
	        }
	        
	        JOptionPane.showMessageDialog(null, builder.toString());
	        
	        result.close();
	        result = null;
	        prestat.close();
	        prestat = null;
	        conn.close();
	        conn = null;
	        
	        server.shutdown();
	        
	    } catch (SQLException e) {
	        e.printStackTrace();
	    } catch (Exception e) {
			e.printStackTrace();
		}
	    finally{
	        if (result != null){
	            try { result.close();} catch (SQLException e){;}
	            result = null;
	        }
	        if (prestat != null){
	            try { prestat.close();} catch (SQLException e){;}
	            prestat = null;
	        }
	        if (conn != null){
	            try {conn.close();} catch(SQLException e) {;}
	        conn = null;
	        }
	    }
	
	}
	
}

IMPORTANT POINTS to note about above program

  1. The code is made small deliberately, hence not very neat. You should break down your logic into methods. Like for inserting values, there should be a separate method.
  2. SQL queries should be handled for exceptions. For eg, if you do not handle exception for query to “CREATE TABLE”, your program would work for first launch. But subsequent launches will terminate the application. Because the table is already present. Also “INSERT INTO” needs to be handled for exception, where duplicate rows might be getting inserted.
  3. Lastly your program, is not going to directly insert data, or search for it. You would be processing user data, and then doing query. I presented a ultra simple case.

And if you make your own great JAR app, don’t forget to share with all of us, here in comments :)

Posted in Programming | Tagged , , , , , , | 19 Comments