The Lotus Domino webserver does not care about the trailing slash (“/”) in URLs, leading to Duplicate Content problems which have a negative impact on SEO results. But now there is a safe, fast and elegant solution to this problem using DSAPI filters.
The problem: Domino does not care about the trailing slash (“/”)
Working with websites running on Lotus Domino for many years, I have been able to make the Lotus Domino server do just about anything I want. I love it. It performs great, it is a fast development platform and it scales nicely.
However, I have had a problem with how Domino handles URLs. More specifically, the trailing “/”.
This problem only exists if you – like us – use Notes databases to serve web pages (as opposed to web pages served from the file system – who does that on Domino anyway?) and if you use substitution and redirection documents to replace the ugly Domino URLs with nice clean user-friendly URLs.
In this case, Domino does not care if a URL ends with “/” or not. It is by design in the Domino HTTP engine, and was made this way many years ago, probably to act more forgivingly (for developers?). Someone at IBM forgot to consider the search engines – or maybe search engines were not that popular at the time 🙂
The result of this is that Domino will serve the same physical page for 2 URLs, with and without a trailing “/”. So these 2 URLs will serve the exact same physical webpage:
(1) http://domain/pagename
(2) http://domain/pagename/
Or these 2:
(3) http://domain/section/pagename
(4) http://domain/section/pagename/
Within each pair, the 2 URLs are technically different URLs, and search engines treat them as different too.
As Domino does not care about the trailing “/”, it is impossible to make redirection or substitution rules to correct the problem. Believe me, I’ve tried. If only I could serve a 404 (Page Not Found) for the incorrect version! But no, Domino simply does not care.
The effect: Duplicate Content
This is a problem and will result in what is called Duplicate Content. While our Content Management System is coded to use the correct URLs, some website visitors will “hack” the URLs in the browser address bar – or just guess the URL and type it directly.
Some will link to our websites using the incorrect URL without noticing, because the page is served anyway 🙁
The result is that a fraction of our backlinks go to incorrect URL versions.
Search engines like Google are fighting hard to not have multiple pages with the same content in their index. Therefore search engines will pick one of the URLs (probably the correct one, as it has more backlinks and internal links), and disregard the other.
This is not good, because the value of the backlinks to the incorrect URL version is probably wasted (depending on the search engine), or at least not giving full value to the correct URL version.
This is where the technical problem becomes a business problem, because this behavior directly affects the results of our SEO (Search Engine Optimization).
Options to rewrite URLs in Domino
A while back, I contacted IBM asking them for a fix to this problem in the Domino HTTP engine. While IBM was helpful and understood the nature of the problem, they did not seem to share my level of concern regarding the business impact.
So there was no fix from IBM. I guess they do not get many (if any) requests from customers about this issue, as most of the companies out there running Domino websites do not know they have a problem.
“If a tree falls in a forest and no one is around to hear it, does it make a sound?” Sound or not, search engines know about these fallen trees.
Anyway, I needed a solution to the problem where I could make a 301 redirect (Moved Permanently) from the incorrect to the correct URL version. Or I needed at least to be able to send a 404 (Page Not Found) for the incorrect URL.
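To make the goal concrete, here is a minimal sketch in C of what a 301 response looks like on the wire. The function name `format_redirect` and the choice of headers are my own illustration, not part of Domino or of any filter described here.

```c
#include <stdio.h>

/* Illustrative helper (hypothetical name): writes a minimal HTTP 301
 * response pointing at `location` into `buf`.
 * Returns the number of bytes written, or -1 if `buf` is too small. */
int format_redirect(char *buf, size_t size, const char *location)
{
    int n = snprintf(buf, size,
                     "HTTP/1.1 301 Moved Permanently\r\n"
                     "Location: %s\r\n"
                     "Content-Length: 0\r\n"
                     "\r\n",
                     location);
    if (n < 0 || (size_t)n >= size)
        return -1;  /* output error or truncated output */
    return n;
}
```

Calling `format_redirect(buf, sizeof buf, "http://domain/pagename")` fills `buf` with a response whose `Location` header tells the browser or crawler where the correct version lives.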
I considered the following options:
- Apache webserver with its “mod_rewrite” module installed in front of Domino
- Varnish Cache (an HTTP reverse proxy) installed in front of Domino
- Other reverse proxy products installed in front of Domino
- URL filters in our Load Balancers
- DSAPI filter installed on Domino
Apache webserver and the reverse proxy solutions could do this easily, but they would add to the complexity of our setup. And then there was the unacceptable added operational risk: if the Apache or reverse proxy product goes down, the Domino websites go down with it.
URL rewriting could be done in the load balancers we use in our hosting. But as we do not have direct control of the load balancer configuration, maintenance of the URL rewriting would be troublesome.
The best candidate for us proved to be the DSAPI filter.
The solution: DSAPI filter to rewrite URLs
Knowing what I know today, I would not have hesitated to choose the solution with the DSAPI filter. It is fail-safe, extremely fast and configurable.
DSAPI (the Domino Web Server API) is a C API. A DSAPI filter is coded in C/C++ against it, compiled to a DLL placed in the file system of the Domino server, and loaded through a Domino Web Site document.
The DSAPI filter will process URLs before the Domino HTTP engine does, acting a bit like Apache’s “mod_rewrite” module.
DSAPI filters are loaded per website and do not affect the entire web server. So one can choose to have some websites using a DSAPI filter and some not – very convenient.
Documentation and examples for developing DSAPI filters are close to non-existent, and my developers had a hard time starting from scratch with this. Therefore, we decided to get some assistance coding this filter.
I turned to Tron Systems in the UK, which has been on my bookmark list for some years because of its skills in this particular area.
My experience with Tron Systems is worth its own write-up: just remarkable.
Tron Systems in the UK coded the DSAPI filter and impressed me deeply in the process
Tron Systems made the filter work with a simple Notes configuration database – from where the DSAPI filter will read some settings and URL exceptions while loading. They delivered the source code and documentation, which means we are able to make changes to the filter on our own.
The filter is coded to return a 301 (Moved Permanently) redirect to the correct URL when the incoming URL pattern matches the setting in the configuration database.
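The decision the filter has to make can be sketched roughly as follows. This is not the Tron Systems code. The `RewriteRules` struct, the function name `needs_redirect` and the idea of matching exceptions as path prefixes are my assumptions about how the settings and URL exceptions might look once they are loaded into memory.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of the settings read from the
 * configuration database: path prefixes the filter must leave alone. */
typedef struct {
    const char *const *exceptions;
    size_t exception_count;
} RewriteRules;

/* Returns 1 if `path` should be answered with a 301, 0 if it should
 * pass through to Domino untouched. */
int needs_redirect(const RewriteRules *rules, const char *path)
{
    size_t len = strlen(path);

    /* Only paths ending in "/" qualify; the root "/" is left alone. */
    if (len < 2 || path[len - 1] != '/')
        return 0;

    /* Skip any path that starts with a configured exception prefix. */
    for (size_t i = 0; i < rules->exception_count; i++) {
        const char *prefix = rules->exceptions[i];
        if (strncmp(path, prefix, strlen(prefix)) == 0)
            return 0;
    }
    return 1;
}
```

With exceptions such as `/admin/` (a made-up example), a request for `/section/pagename/` would be redirected while `/admin/tools/` would pass through unchanged.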
The compiled DLL file, which we call the DSAPI filter, needs to be in the Domino server’s file system. We instruct Domino to load the DSAPI filter by entering the DSAPI filter file name in the Website document (the file extension ‘.dll’ is not needed):
From the Website document in the Internet Sites view in names.nsf. “urlrewriter” is the file name of the compiled DSAPI filter placed in the Domino server’s file system
A Domino server restart or a Tell HTTP Quit/Load HTTP sequence is needed to get Domino to load the DSAPI filter.
All our websites use the same configuration database. We changed the UI of the database, and added a “No action” option on the configuration form:
From the configuration database for the DSAPI filter, right now set to remove trailing slashes
We have chosen the setting “Remove trailing slashes” because URLs without a trailing “/” are a little simpler for users than URLs ending with “/”. With the DSAPI filter in action, this is what happens:
(1) http://domain/page -> no rewriting -> http://domain/page
(2) http://domain/page/ -> 301 -> http://domain/page
The DSAPI filter in action: Firebug waterfall chart showing the 301 redirect of the HTML document from the version with trailing “/” to the version without trailing “/”
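The rewriting itself is plain string handling. Below is a minimal sketch in C of the “Remove trailing slashes” mapping shown above. The function `strip_trailing_slash` is hypothetical, and keeping the query string intact is my own assumption about what a sensible filter would do, not something taken from the delivered source.

```c
#include <stddef.h>
#include <string.h>

/* Copies `url` into `out` with trailing slashes removed from the path.
 * Anything from the first '?' or '#' onward is kept as-is, and a bare
 * root path ("/" or "http://domain/") is left untouched.
 * Returns 1 if the URL changed (a 301 is needed), 0 if it is unchanged,
 * and -1 if `out` is too small. */
int strip_trailing_slash(const char *url, char *out, size_t size)
{
    size_t path_len = strcspn(url, "?#");  /* end of the path part */
    size_t path_start = 0;                 /* index of the path's first '/' */
    size_t keep = path_len;

    /* For absolute URLs, skip "scheme://host" to find the path. */
    const char *scheme = strstr(url, "://");
    if (scheme != NULL && (size_t)(scheme - url) < path_len) {
        const char *slash = strchr(scheme + 3, '/');
        path_start = (slash != NULL && (size_t)(slash - url) < path_len)
                         ? (size_t)(slash - url)
                         : path_len;
    }

    /* Drop trailing slashes but never the root slash itself. */
    while (keep > path_start + 1 && url[keep - 1] == '/')
        keep--;

    if (keep + strlen(url + path_len) + 1 > size)
        return -1;

    memcpy(out, url, keep);
    strcpy(out + keep, url + path_len);    /* re-attach query/fragment */
    return keep != path_len;
}
```

A return value of 1 means the caller should answer with a 301 to the string in `out`. A return value of 0 means the request can go on to Domino untouched.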
The beauty of this DSAPI filter solution is that the Domino websites will keep running and the web pages will still be accessible, even if the filter does not load, or if the configuration database is not available.
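One way to get the configuration half of this fail-safe behavior is to treat a missing configuration as “No action”. The sketch below illustrates the idea only. The type and function names are hypothetical and the delivered filter may be structured differently.

```c
#include <stddef.h>

/* Hypothetical settings, mirroring the two options named in this post. */
typedef enum {
    MODE_NO_ACTION = 0,
    MODE_REMOVE_TRAILING_SLASH = 1
} SlashMode;

typedef struct {
    int loaded;      /* nonzero once the settings were read successfully */
    SlashMode mode;  /* only meaningful when `loaded` is nonzero */
} FilterConfig;

/* Falls back to "No action" whenever the configuration is missing or
 * could not be read, so every request passes through to Domino untouched. */
SlashMode effective_mode(const FilterConfig *cfg)
{
    if (cfg == NULL || !cfg->loaded)
        return MODE_NO_ACTION;
    return cfg->mode;
}
```

The design choice is that every failure path ends in doing nothing, so the worst case is the original duplicate-URL behavior rather than a website that is down.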
And it is fast too. The impact on page load speed is below 2 milliseconds, too small to be detected by the page load speed tools I use. I have not been able to register any impact on webserver memory consumption – so it must be small.
So there you have it, an efficient solution for the trailing “/” problem in Lotus Domino. Perhaps an idea for IBM to incorporate this DSAPI filter as an optional setting in a coming release of Domino?