Monday, October 1, 2007

12. WebSpider review


Reviewed Package Author: JianFei Liao

1. Installation Review:
I downloaded JianFei's code and extracted it with ease. The installation also went well.

Test results:
JUnit: the test runs with 2 failures. The failures are found in testFindTotalLinks and testFIndMostPopular.

Checkstyle: the test runs successfully with no errors.

PMD: the test runs successfully with no errors.

FindBugs: the test runs successfully with no errors.

Jar File Construction and execution: a jar file was built successfully. However, it was named webspider.jar instead of webspider-jianfei.jar.

I was able to invoke "java -jar webspider.jar -totallinks http://www.hackystat.org 100" and it returned 3257.

2. Code format and conventions review

File                        Lines             Violation  Comments
WebSpiderExample.java       117, 118, 254, *  EJS-7      Unnecessary blank lines within a method
TestWebSpiderExample.java   16, 49, 100       EJS-7      Unnecessary blank lines within a method
TestWebSpiderExample.java   19, 20, 21, *     EFS-39     Document all private members


3. Test case review
Black box perspective:

All methods and classes in WebSpiderExample were tested in TestWebSpiderExample.java. However, two of the tests failed, as indicated by JUnit.

For the class WebItem, only increaseCounter() is tested directly, while getCounter() and getLink() are only tested indirectly by calling the main method.
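Direct checks for the two getters could look roughly like the sketch below. WebItemSketch is a hypothetical stand-in written only for this illustration; the real WebItem constructor and its initial counter value are not shown in this review, so the "starts at 0" behavior is an assumption. In the actual test class these checks would be JUnit assertEquals calls against WebItem itself.

```java
/** Hypothetical stand-in for WebItem; only the three method names come from the review. */
final class WebItemSketch {
  private final String link;
  private int counter; // assumption: a new item starts with a count of 0

  WebItemSketch(String link) {
    this.link = link;
  }

  void increaseCounter() {
    counter++;
  }

  int getCounter() {
    return counter;
  }

  String getLink() {
    return link;
  }
}

/** Exercises each accessor directly instead of going through main(). */
final class WebItemDirectChecks {
  private WebItemDirectChecks() {
  }

  static void check(boolean condition, String message) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  public static void main(String[] args) {
    WebItemSketch item = new WebItemSketch("http://www.hackystat.org");
    check(item.getCounter() == 0, "counter should start at 0");
    item.increaseCounter();
    check(item.getCounter() == 1, "counter should be 1 after one increase");
    check("http://www.hackystat.org".equals(item.getLink()), "link should be unchanged");
    System.out.println("All direct checks passed");
  }
}
```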

No tests for boundary cases are included. The test suite should consider checking the following cases:
- crawling through 0 pages
- crawling through -1 pages
- crawling through 9999999999999 pages
- entering an invalid URL such as http://hackystat
- entering no URL
- crawling through a page with no links

The best way to test this program is to create a test site whose link counts are known in advance, so that the program's results can be compared against the expected values.
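The idea of a test site with known answers can be sketched without any network access. The FakeSite class below is hypothetical and is not part of the reviewed code: it stores a tiny three-page site in memory and counts links breadth-first, so the expected totals can be worked out by hand. How the real WebSpider orders pages and counts duplicate links is not described here, so the traversal and counting rules in this sketch are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory test site: each page maps to the links it contains. */
final class FakeSite {
  private final Map<String, List<String>> linksByPage = new HashMap<>();

  /** Registers a page and the links found on it. */
  FakeSite page(String url, String... links) {
    linksByPage.put(url, Arrays.asList(links));
    return this;
  }

  /** Counts the links on the first maxPages pages reached breadth-first from start. */
  int totalLinks(String start, int maxPages) {
    Set<String> seen = new HashSet<>();
    Deque<String> queue = new ArrayDeque<>();
    seen.add(start);
    queue.add(start);
    int visited = 0;
    int total = 0;
    while (!queue.isEmpty() && visited < maxPages) {
      String current = queue.remove();
      visited++;
      List<String> links = linksByPage.getOrDefault(current, Collections.<String>emptyList());
      total += links.size();
      for (String link : links) {
        if (seen.add(link)) { // queue each page only once
          queue.add(link);
        }
      }
    }
    return total;
  }

  public static void main(String[] args) {
    FakeSite site = new FakeSite()
        .page("http://test/a", "http://test/b", "http://test/c")
        .page("http://test/b", "http://test/c")
        .page("http://test/c"); // a page with no links
    System.out.println(site.totalLinks("http://test/a", 1)); // 2
    System.out.println(site.totalLinks("http://test/a", 2)); // 3
    System.out.println(site.totalLinks("http://test/a", 0)); // 0
  }
}
```

With a real test site, the same hand-computed totals would become the expected values in the JUnit tests for -totallinks.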

White box perspective:
Running Emma produces the following coverage summary:
class: 100% (2/2)
method: 100% (9/9)
block: 79% (476/602)
line: 85% (99/116)

This indicates that some blocks and lines are not covered. Coverage can be improved by adding the following test, which crawls through more pages:
String[] testArgs7 = { totallinks, testStartUrl, "10", logging };
WebSpiderExample.main(testArgs7);
With this test added, block and line coverage improve to 81% and 90% respectively.

Break da buggah:
I tried to invoke "java -jar webspider.jar -totallinks http://httpunit.org 9999999999999" and it returned the warning: "Argument for number of pages to crawl is not an integer." It then went on to print the invalid result "The total number of links discovered while crawling the first 0 pages accessible from http://httpunit.org is: 0".
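The warning is consistent with how Java parses integers: 9999999999999 is larger than Integer.MAX_VALUE (2147483647), so Integer.parseInt throws a NumberFormatException for it just as it does for "1.5". The sketch below shows one way to reject such arguments before any crawling starts. PageCountArgument is a hypothetical helper, not JianFei's code, and treating 0 and negative counts as invalid is my assumption.

```java
import java.util.OptionalInt;

/** Hypothetical helper for validating the "number of pages" argument. */
final class PageCountArgument {
  private PageCountArgument() {
  }

  /** Returns the page count, or empty when the text is not a positive int. */
  static OptionalInt parse(String raw) {
    if (raw == null) {
      return OptionalInt.empty();
    }
    try {
      int pages = Integer.parseInt(raw.trim());
      // Assumption: 0 and negative counts are treated as invalid input.
      return pages > 0 ? OptionalInt.of(pages) : OptionalInt.empty();
    } catch (NumberFormatException e) {
      // Reached for "1.5", "abc", "" and for values beyond the int range
      // such as "9999999999999".
      return OptionalInt.empty();
    }
  }

  public static void main(String[] args) {
    String[] samples = {"100", "0", "-1", "1.5", "9999999999999"};
    for (String raw : samples) {
      OptionalInt pages = parse(raw);
      if (pages.isPresent()) {
        System.out.println(raw + ": crawl " + pages.getAsInt() + " pages");
      } else {
        // Stop here instead of reporting a result for 0 pages.
        System.out.println(raw + ": rejected, nothing crawled");
      }
    }
  }
}
```

A caller that gets an empty result can print a single error message and exit, which avoids the misleading "first 0 pages" summary.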

4. Summary and lessons learned
Overall, this is a well-written program with thorough exception handling. Although it is hard to determine whether the program returns the correct number of links for an arbitrary site, I ran it against Randy Cox's test site and it returned the correct results.

Exceptions were handled well. I tried to break the code with many test cases, such as a negative or decimal argument for the number of pages to crawl, an invalid URL, an empty URL, and a URL with no links. The program managed to catch all of the exceptions. However, no appropriate error messages were displayed, which is something that can be improved. I also ran the program against my friends' myspace and xanga accounts, google.com and a few other websites. I was actually surprised that it did not crash at all, which indicates that he did a good job with exception handling. On the whole, this is a well-done program.

From this exercise, I have learned that testing is important. When I did my own assignment, I had no intention of applying the black box and white box perspectives in my tests. After applying them while reviewing someone else's code, I realize that they keep me from testing randomly and help ensure maximum coverage from my tests.

I also gained more experience working with JUnit, PMD, Emma and Checkstyle. Even though I am still puzzled at times by how to reach maximum coverage in Emma, I am at least more familiar with the other QA tools now.

Furthermore, by comparing JianFei's code with mine, I realize that my code could be improved by removing some redundant code. He employed an algorithm similar to mine, but his was more organized and concise. This is what I have to work on in my code.
