Archive for March, 2007

Search Engine Friendly URLs for Java Web Application

Wednesday, March 28th, 2007

Static and Dynamic URLs

Any web application has static and dynamic resources. As the names imply, static resources are those whose content never changes.

Examples are HTML pages with no dynamic data, static download files, etc.

Dynamic resources are those that can change their content from time to time: html pages that contain dynamic data (such as results of search queries), dynamic reports, download files that may change their content depending on the visitor’s preference and so on.

Usually it is easy to tell static and dynamic URLs apart:

http://www.mysite.com/pictureOfMyDog.jsp - static URL.

http://www.mysite.com/picture.jsp?id=234&operation=show - dynamic URL.
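The distinction above can be sketched as a rule-of-thumb check in Java. This only looks at the shape of the URL (whether it carries a query string and how many parameters it has), not at what the server actually does; the class and method names are made up for illustration.

```java
import java.net.URI;

/** Rule-of-thumb check: does a URL carry query parameters? (hypothetical helper) */
final class UrlKind {

    /** Returns the number of query parameters; 0 means the URL looks static. */
    static int parameterCount(String url) {
        String query = URI.create(url).getRawQuery();
        if (query == null || query.isEmpty()) {
            return 0;
        }
        return query.split("&").length;
    }

    /** A URL with at least one query parameter is treated as dynamic. */
    static boolean looksDynamic(String url) {
        return parameterCount(url) > 0;
    }

    public static void main(String[] args) {
        // prints "false"
        System.out.println(looksDynamic("http://www.mysite.com/pictureOfMyDog.jsp"));
        // prints "2"
        System.out.println(parameterCount("http://www.mysite.com/picture.jsp?id=234&operation=show"));
    }
}
```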

URLs and Search Engines

Search engines like static URLs. When a search engine spider comes across a dynamic URL, it may or may not follow it. Depending on its internal algorithms, the search engine will choose whichever way it considers optimal.

If your site is popular and the search engine knows about it, it can index your dynamic resources.

If the search engine later finds out that an old dynamic URL no longer works or leads to a different page, it can stop indexing such URLs.

The more parameters a URL has, the lower the chance that the search engine will follow it. The spider may still follow such URLs to check what kind of content the pages contain and whether there is any useful content at all.

URLs and Internet Surfers

What if someone likes the picture of your dog located at this URL:

http://www.mysite.com/picture.jsp?id=234&operation=show

The poor guy bookmarks the page, because he will never remember such an awful URL and so cannot simply type it in to show that picture to his girlfriend at her place. Later on, when they come to the Poor Guy’s place and he clicks the bookmark to show her the picture, they stare miserably at a 404 page.

It happens because you decided to do a major rewrite of your code and now the picture of your beautiful dog is located here:

http://www.mysite.com/picture.jsp?photoId=234&operation=show

Moreover, if they do not find the picture, the night is spoiled! Only because the webmaster has never thought about static URLs.

URL Friendliness and Intranet Applications

While the value of static URLs is clear for web applications exposed to the public Internet, some may think they are useless for intranet applications. Search engines do not index those applications, and their URLs usually do not change. Well, I thought that too. However, if you think about it a little more, you will find that they can be of great use for such applications as well.

First of all, there is plain aesthetic pleasure. If you do not care about nice URLs, think about the developers who can save time, and probably someone’s money, by typing shorter URLs and making fewer mistakes in them.

At my day job, I often have to browse the application we are developing on other developers’ computers over the network. Every time, I have to type the URL of the login page with that developer’s machine network address, and every time it is a pain. The URL is long and complicated, and everyone has to type it pretty often, as there is no way to keep all of them in IE’s ugly bookmarking facility. This is the first example I can think of, but I am sure that if you think about it a little you will find many reasons to have static URLs in your applications.

Java and Search Engine Friendly URLs

Users of the Apache HTTP server are happy to have such functionality almost out of the box. There is no standard solution for the J2EE platform. Fortunately, there is a project that lets you have the desired functionality. It is called Url Rewrite Filter.

“Based on the popular and very useful mod_rewrite for apache, UrlRewriteFilter is a Java Web Filter for any J2EE compliant web application server (such as Resin, Orion or Tomcat), which allows you to rewrite URLs before they get to your code. It is a very powerful tool just like Apache's mod_rewrite.
URL rewriting is very common with Apache Web Server (see mod_rewrite's rewriting guide) but has not been possible in most java web application servers”.

Visit their web site and download the filter code, documentation and manuals.

The documentation for this filter is great and usage is pretty simple, but I will show you an example anyway:

I have a long ugly URL:

http://www.mysite.com/SoftwareList.do?operation=showList&chapterId=X

I want it to look nice and to be search engine friendly:

http://www.mysite.com/category-programs/audio-and-video-/X

We compose a rule:

XML:

    <rule>
        <from>^/category-programs/(.*)/([0-9]+).*$</from>
        <to>/SoftwareList.do?operation=showList&amp;chapterId=$2</to>
    </rule>

And place that rule in urlrewrite.xml.
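To see what the rule's regular expression captures, here is a small sketch that applies the same pattern and substitution with plain java.util.regex. It is not the filter's own code, only an illustration; the class name RuleDemo and the chapter number 42 are made up. Note that the `&amp;` in the XML is just XML escaping for `&`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustration only: applies the rule's pattern with java.util.regex. */
final class RuleDemo {

    // Same expression as in the <from> element of the rule.
    private static final Pattern FROM =
            Pattern.compile("^/category-programs/(.*)/([0-9]+).*$");

    // Same target as in the <to> element; $2 refers to the second group.
    private static final String TO =
            "/SoftwareList.do?operation=showList&chapterId=$2";

    /** Returns the rewritten path, or null when the rule does not match. */
    static String rewrite(String path) {
        Matcher m = FROM.matcher(path);
        return m.matches() ? m.replaceFirst(TO) : null;
    }

    public static void main(String[] args) {
        // prints "/SoftwareList.do?operation=showList&chapterId=42"
        System.out.println(rewrite("/category-programs/audio-and-video-/42"));
        // prints "null": the path does not start with /category-programs/
        System.out.println(rewrite("/SoftwareList.do"));
    }
}
```

The first group, `(.*)`, is matched but never used in the target, so the descriptive text in the middle of the URL is there purely for readers and search engines; only the number selects the chapter.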

Remember that you have to encode all relative URLs; otherwise the paths to CSS, JS, images and other resources will break, because the browser resolves them against the rewritten URL. So use JSTL’s <c:url> tag for all your relative paths.
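Why relative paths break can be shown with java.net.URI, which resolves references the same way a browser does. This is only a sketch: the stylesheet path css/style.css and the chapter number 42 are invented for the example.

```java
import java.net.URI;

/** Illustration only: how a browser resolves a relative path against a page URL. */
final class RelativePathDemo {

    /** Resolves a reference found in a page against the URL of that page. */
    static String resolve(String pageUrl, String reference) {
        return URI.create(pageUrl).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String friendly = "http://www.mysite.com/category-programs/audio-and-video-/42";

        // A plain relative path is resolved inside the "virtual" directory:
        // prints "http://www.mysite.com/category-programs/audio-and-video-/css/style.css"
        System.out.println(resolve(friendly, "css/style.css"));

        // A path that starts at the site root is unaffected by the rewritten URL:
        // prints "http://www.mysite.com/css/style.css"
        System.out.println(resolve(friendly, "/css/style.css"));
    }
}
```

The first result points to a directory that does not exist on the server, which is the corruption described above. Generating root-based paths for every resource avoids it.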

Alternatively, you may add a type="redirect" attribute to the rule's <to> element:

XML:

    <rule>
        <from>^/category-programs/(.*)/([0-9]+).*$</from>
        <to type="redirect">/SoftwareList.do?operation=showList&amp;chapterId=$2</to>
    </rule>

But then the user will see this ‘ugly’ URL in the address line of the browser.

Conclusion

Url Rewrite Filter is a great way for J2EE developers to add static URL functionality to their applications. It is simple and easy to use, and the configuration file is reloaded automatically at an interval you define. That is it for now. I have also written a URLAbstractor class and a custom tag to make clean URLs out of any String.

URL Beautifier

I have written a small class to convert a string to a pretty URL:

JAVA:

    package com.leadercode.tag.url;

    import java.io.UnsupportedEncodingException;

    /**
     * URLAbstractor
     *
     * @author Sergey Nechaev
     */
    public class URLAbstractor {

        /**
         * Replaces spaces and the characters & , . : with hyphens and
         * collapses runs of consecutive hyphens into a single one.
         */
        public static String encode(String url) throws UnsupportedEncodingException {
            StringBuilder out = new StringBuilder(url.length());
            for (int i = 0; i < url.length(); i++) {
                char c = url.charAt(i);
                switch (c) {
                    case ' ':
                    case '&':
                    case ',':
                    case '.':
                    case ':':
                        c = '-';
                        break;
                    default:
                        break;
                }

                // Skip a hyphen that would directly follow another hyphen.
                if (c == '-' && out.length() > 0 && out.charAt(out.length() - 1) == '-') {
                    continue;
                }

                out.append(c);
            }

            return out.toString();
        }
    }

SEF Tag

JAVA:

    package com.leadercode.tag.url;

    import java.io.IOException;

    import javax.servlet.jsp.JspException;
    import javax.servlet.jsp.tagext.BodyTagSupport;

    /**
     * The URL abstractor tag
     *
     * @author Sergey Nechaev
     */
    public class SefLink extends BodyTagSupport {

        private String url;

        public String getUrl() {
            return url;
        }

        public void setUrl(String url) {
            this.url = url;
        }

        /** Writes the cleaned-up version of the url attribute to the page. */
        @Override
        public int doStartTag() throws JspException {
            try {
                pageContext.getOut().print(URLAbstractor.encode(url));
            } catch (IOException e) {
                // Report the failure instead of silently swallowing it.
                throw new JspException("Could not write the search engine friendly URL", e);
            }

            return SKIP_BODY;
        }
    }
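To show how the beautifier and the rewrite rule fit together, here is a minimal, self-contained sketch that builds a friendly path in the /category-programs/ shape used earlier. The class FriendlyLink, its method names and the sample category names are hypothetical; the slug method only mirrors the idea of URLAbstractor.encode (separators become hyphens, repeated hyphens collapse) so that the example runs on its own.

```java
/** Illustration only: builds paths that the /category-programs/ rule would accept. */
final class FriendlyLink {

    /** Turns separators into hyphens and collapses repeated hyphens. */
    static String slug(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (" &,.:".indexOf(c) >= 0) {
                c = '-';
            }
            // Skip a hyphen that would directly follow another hyphen.
            if (c == '-' && out.length() > 0 && out.charAt(out.length() - 1) == '-') {
                continue;
            }
            out.append(c);
        }
        return out.toString();
    }

    /** Combines a readable category name with the numeric chapter id. */
    static String categoryPath(String categoryName, int chapterId) {
        return "/category-programs/" + slug(categoryName) + "/" + chapterId;
    }

    public static void main(String[] args) {
        // prints "/category-programs/Audio-and-Video/42"
        System.out.println(categoryPath("Audio and Video", 42));
        // prints "Audio-Video-Tools"
        System.out.println(slug("Audio & Video: Tools"));
    }
}
```

Like the original class, this keeps the case of the input and does not percent-encode anything, so it is only suitable for text that is already safe to put into a URL path.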

First Day of Game Development

Monday, March 19th, 2007

Today I suddenly realized that I already have everything I need to start writing a game. My engine has a basic scene manager, sprites and particles. With help from the guys in the LWJGL IRC channel, I managed to plug the Mappy library into LWJGL, and now it is really simple to have an animated tiled map. For music and sound, I will probably use the Audiere library, as it is free, though Windows only. So today is the first day of GAME DEVELOPMENT :)

Software Development Can Be Fun

Monday, March 19th, 2007

Nowadays a developer has to know a bunch of different technologies, systems and libraries. Each day you start your favorite IDE, an HTML editor, an XML editor and a couple of utilities along with Outlook Express, and run a server. The computer is as slow as hell and you start cursing the developers and the computer.

I work on the presentation tier of a large contract management application: writing some Java code, composing HTML and JSP pages, writing some crazy client-side JS components, and writing and fixing long XML and .properties files.

We use the excellent Eclipse IDE along with a set of plug-ins known as MyEclipse. This is a very powerful system; however, most of the time (I mean always) the only feature we use is the nice syntax highlighting of JSP, JS and XML files.

Once I asked a colleague to help me fix an annoying bug in some JSP code. He logged in to my server, opened the page in IE, right-clicked it and opened the HTML source in a notepad. No, it was not the standard ugly grey Windows Notepad. A nice editor highlighted the source code. We used its folding option to find the bug: I had forgotten to insert a closing JSP tag. The HTML source looked great!

Notepad++

I asked him to give me this editor (Notepad++) and installed it on my computer. It took a couple of seconds to create the proper file associations using Notepad++’s convenient menu, and it was ready to go!
Notepad++ highlights many file formats, starts very fast, as it is written in C++, and is very small (less than 1 MB). Each new file opens in a new tab, and when you close the editor and reopen it, it loads the files that were open before!

There are a number of text processing plug-ins, including some TextFX plug-ins. I did not look into them closely, but at a glance I liked the HTML/XML auto-closing feature.

Now I use it on a daily basis, and let me tell you, this is the only program I really like.

Here are the features of Notepad++ (taken from the web site):

Syntax Highlighting and Syntax Folding
User Defined Syntax Highlighting
Auto-completion
Multi-Document
Multi-View
Regular Expression Search/Replace supported
Full Drag ‘N' Drop supported
Dynamic position of Views
File Status Auto-detection
Zoom in and zoom out
Multi-Language environment supported
Bookmark
Brace and Indent guideline Highlighting


Google and Search Problems

Sunday, March 18th, 2007

Today my friend and I were looking for software that could make a trialware version of our program. We have lost the program’s source code, as it was written about 3 years ago, so we cannot use popular software protection systems or cut some of the program’s features to make a limited demo version.

We need a program that will take our executable, perform some voodoo magic and give us back an executable that will function on the user’s machine for say 14 days, or 10 launches or something like that.

While googling, we found no trace of such a system. Maybe we do not know how to build a proper search query, as we do not know what these systems are called. I know that such programs exist; I just cannot remember them right now, so I will have to ask for advice on forums or in a conference.

It made me think that this is probably a big point of failure for many businesses that rely on their web site, and thus on search engines, as an important marketing and sales channel. There is a really big gap between the minds of the professionals selling a solution and the users looking for that solution.

If you are a company selling some sort of poison that kills BacteriusHlomidonados, you can do just fine as long as there are many people looking for BacteriusHlomidonados. However, I am sure there are thousands (including me) looking for ‘some stuff to kill those nasty little bugs in my kitchen’. I do not know they are called ‘BacteriusHlomidonados’, and the company selling its great stuff has no idea that to its customers ‘BacteriusHlomidonados’ is in fact ‘a nasty little bug’.

Now I think of my web sites and the logs. For many years, I have been coming across really weird search queries that led people to my web site. They were weird for me, but absolutely normal for those people. The knowledge of the target audience is probably one of the most important things in online business. And what is more important, the knowledge of their language and wording.