Wednesday, April 30, 2008

“please wait while scripts are loaded…”

I really need to write this down… I spent, or rather wasted, more than one full day just to sort out this small thing…

We are implementing our corporate page template as a MOSS master page.
Copy and paste the HTML code. Try it first with a blank page layout, without even the page content area. Fine. Looks good and easy.

Then add the Page Content field to the center of the layout, to start with.
This time, after creating a content page based on the layout, we type some text into the page content area at the center.
Do a Preview from the Tools menu. The text I typed DISAPPEARS!! Try to save without previewing. Same thing. The text is gone… From here my long debugging exercise started.

Everything is fine without our page template, i.e. without the HTML code pasted into the master page.
With the template, I noticed that when I switch the page to edit mode, the status bar of the browser reads “please wait while scripts are loaded…”
It looks like something is going wrong with the generated JavaScript.

A SharePoint-generated page contains tens of thousands of lines of JavaScript. You need a debugger like Firebug to work through this.
We needed to either (1) authenticate Firefox against SharePoint or (2) find a good JavaScript debugger for IE. Luckily I found the latter.

Thank god! The debugger found the line causing the error. It was in a js file called /_layouts/1033/HtmlEditor.js, at line 5772.

// The <div> that holds the text typed into the Page Content field
var displayContentElement=document.getElementById(clientId+"_displayContent");

// Walk up from that <div> until a FORM (or the window) is reached...
var findForm=displayContentElement;
while (findForm.tagName!="FORM" && findForm.tagName!="WINDOW")
{
    findForm=findForm.parentElement;
}
// ...and hook its submit so the typed text is copied back to the textarea before the postback
findForm.attachEvent("onsubmit",new Function("RTE2_TransferContentsToTextArea('"+clientId+"');"));

At run time, clientId+"_displayContent" is the id of the <div> element for the Page Content area. This code adds an event handler for the form submission, to save the text you typed. However, with our page template pasted in, the walk up through findForm.parentElement never returns a form: it cannot find the <form id="aspnetForm"> element to attach the event handler to. Why?

My first thought was that the culprit was probably the <form> element for the search box that comes with our page template, even though that did not really make sense to me: it is properly closed, so it should not interfere with anything else.
But that was it. My guess (and it is only a guess) is that since forms cannot be nested in HTML, the browser mangles the markup around the inner <form>, leaving the page content <div> outside aspnetForm, so the walk up the parent chain never meets a FORM and runs off the top of the document. In any case, having removed it, my simple page with our corporate page template now works. We will think later about how to get the search box back.
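For when we do bring it back, here is a rough sketch of what I have in mind: keep the box as plain inputs with no <form> of its own and submit it from JavaScript, so nothing can collide with aspnetForm. This is only a sketch; the /search URL, the "q" parameter and the element names are made-up placeholders, not our real ones.

<!-- Search box with NO nested <form>, so aspnetForm stays intact -->
<div class="corp-searchbox">
  <input type="text" id="corpSearchText" />
  <input type="button" value="Search" onclick="corpDoSearch();" />
</div>
<script type="text/javascript">
function corpDoSearch()
{
    // Send the query in the URL instead of posting a second, nested form.
    var box = document.getElementById("corpSearchText");
    window.location.href = "/search?q=" + encodeURIComponent(box.value);
}
</script>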


To me, the lesson here is that if you are unlucky and get trapped by a problem like this, it is really difficult to debug in SharePoint MOSS.

Tuesday, April 29, 2008

Features and Solutions: why do we need them?

A Feature is a unit of customization that you can activate selectively at a given scope of SharePoint: a site, a web application, a farm, etc. And a Solution is the packaging framework for Features.

Actually, some built-in “features” of SharePoint are themselves delivered as Features.
The built-in master pages and layouts for the Publishing portal, for instance, are found under C:\Program Files\Common Files\Microsoft Shared\web server extensions\12\TEMPLATE\FEATURES\PublishingLayouts\MasterPages\.

Likewise, if you create a Solution package containing the master page you designed, wrapped as a Feature, you can deploy it (quite easily indeed) to other instances of SharePoint.

What happens if you do NOT use Features and Solutions? What about your customizations: content types you created with the browser-based interface, layouts you designed with SharePoint Designer, etc.?

The answer (at least at the moment) is that customization done that way, i.e. through the browser and SPD, goes into the content database, from which you cannot extract it and deploy it selectively to other instances of SharePoint. For example, you design your master page with SPD and it is stored in the database. You cannot make it available to any SharePoint instance other than the one you created it against, unless you copy and paste the code manually.

It is even worse for content types. You cannot do any copy and paste with them at all.

Features themselves you define in XML; there are schemas for this.
What is painful, however, is that you have to write it all by hand. There is no tool yet available to generate the XML from the content database.
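To give an idea of what writing it by hand looks like, here is a rough sketch of the two XML files for a Feature that provisions a master page into the Master Page Gallery. The GUID, the names and the file names are made-up placeholders, and the real schemas have many more options than shown here.

feature.xml:

<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch only: the Id GUID and the Title are placeholders -->
<Feature Id="11111111-2222-3333-4444-555555555555"
         Title="Corporate Master Pages"
         Scope="Site"
         xmlns="http://schemas.microsoft.com/sharepoint/">
  <ElementManifests>
    <ElementManifest Location="elements.xml" />
  </ElementManifests>
</Feature>

elements.xml:

<?xml version="1.0" encoding="utf-8"?>
<!-- Provisions Corporate.master (placeholder name) into the Master Page Gallery -->
<Elements xmlns="http://schemas.microsoft.com/sharepoint/">
  <Module Name="CorporateMasterPages" Url="_catalogs/masterpage" RootWebOnly="TRUE">
    <File Url="Corporate.master" Type="GhostableInLibrary" />
  </Module>
</Elements>

The Solution (.wsp) is then essentially a CAB file packaging these together with a manifest.xml, which you add and deploy with stsadm -o addsolution and stsadm -o deploysolution.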


Having said that, I do not know whether this is a (major) problem for people like us, who maintain our own web sites, as opposed to consultants who develop solutions for clients.
We have a staging site and a production site, with Content Deployment job(s) between them. It has already been proven that this “publishes” master pages and layouts too: everything needed for the content pages to appear correctly, except for assemblies and other file-based customizations.
So what is wrong with doing all the customization on the staging site through the browser interface and SPD, and letting it all appear nicely on the production site, without having to worry about deployment, etc.?


I even wonder now whether we still need this old "staging site and production site" concept, structure, or whatever you want to call it.
SharePoint presents a page differently (approved version, draft, etc.) depending on who you are and where you come from. We might get away with just one site.

We may still need the (BROKEN, according to many) Content Deployment framework to distribute content among the servers in a farm, though. I have not looked into this yet. I will share the experience once I have.

Follow-up on July 7, 2008:
No, you do not need to set up Content Deployment to sync content within a farm.
Actually, it is nothing but obvious: in a farm you have just one content database (per web application), and all the web servers in the farm connect to it. There is nothing to synchronize.

Monday, April 28, 2008

Content Deployment “Timed Out”

When we went into daylight saving time (DST: really an IT guy's nightmare…) at 2am on March 30th, 2008, content deployment stopped working. The result was always “Timed Out”. By the way, it already took days just to figure out that it might have something to do with the time change…

I found this hotfix:
http://support.microsoft.com/kb/938663/.
SharePoint's own timer service, or scheduler, is so smart that it does not notice the time change. Please… So a job scheduled to run at 8am will not run until 9am, and by the time it finally runs it is already “timed out.”


Since the hotfix in question (http://support.microsoft.com/kb/938535/) is included in SP1 (see http://support.microsoft.com/kb/942390), we decided to go for SP1 instead.

One final remark. The description of the hotfix reads as follows:
“The Windows SharePoint Timer service does not update its internal time when Microsoft Windows makes the transition from standard time to DST or from DST to standard time. Therefore, after you apply this hotfix, you must restart the Windows SharePoint Timer service after each transition from standard time to DST and after each transition from DST to standard time. If you do not restart the Windows SharePoint Timer service, timer jobs that you schedule may be delayed. Or, they may fail.”

Thursday, April 24, 2008

Web Statistics: NetIQ WebTrends vs Awstats

The organisation I work for uses NetIQ WebTrends.
We bought it some years back and have not upgraded it since, so I do not know what its latest version is. The version we use is the eBusiness edition, version 6.1.

The first thing I noticed was how long it takes to process log data, especially after we enabled the geographical analysis. Several minutes for just one day.
Our site is rather busy; it had 1,952,538 hits on Monday for instance, which means that many records in a log file to process. # Actually, it first has to unzip the file, which takes some time already.
So this may sound understandable; you may even think it is doing a good job and ask what the problem is.

The problem is that it needs to reprocess the log files each time you create a new “profile” to analyze something new.
In a profile you define, among other things, the “filter”. A filter is typically a URL path you want statistics for.
For example, this blogger.com server hosts many sites, including mine, murmurofawebmaster.blogspot.com. To prepare access statistics for my site, you have to “filter” accesses to it out of the whole gigantic blogger.com server log data.

Imagine you are towards the end of the year and need to prepare statistics for one particular URL over the whole year.
For me that can take something like 3 days. People today will not understand that; they will just think you are dumb.

So we looked at possible alternatives, namely Awstats. I think WebTrends and Awstats are the two most famous names in the web log analysis domain.

First, we listed the points we like and dislike about WebTrends.
Good points/advantages:
G1. Delegation of management
G2. Presents the geographical distribution of access

Points we dislike:
B(ad)1. A “profile” has to be created in order to see statistics for a subsite independently
B2. It takes time for a profile to become ready to view
A colleague even said that the way we use it is probably wrong, and that there should be a way to make it ready instantly.
B3. The way the profiles are presented/organized. They just get added chaotically…
B4. Really a black box. An analysis fails without a trace of why it failed.
B5. For anything not documented, or for which you cannot find an explanation, we are now stuck, because we no longer have a valid license. # We decided not to keep the support contract.

Now Awstats, a Perl-based freeware.
G1: There is no concept of delegation of management. You, the administrator, need to edit its text config file. -> Disadvantage
G2: Its documentation says it is possible, but I did not try it. The environment I used for this evaluation is missing not just one but several required Perl libraries. -> Let us say equal

B1, B2 and B3 -> Equal, or WebTrends is slightly better in my view.
I think processing log data is a time-consuming task with any package in general.
I fed just one day's log to Awstats and it took some time (like 10 minutes or so, even more) to “digest” even that.

And to have statistics for a subsite, just like with WebTrends, you have to have a specific “profile”, process the log data against that profile, and keep the result in a separate data store, a database of some sort.

The main difference I found between WebTrends and Awstats, with respect to how they handle a series of log files, is the following.
With Awstats, it is your responsibility to make sure you do not feed it the same data twice. I think this can be a bit tricky, especially when you want to (re-)analyze all past logs; you have to do some programming of your own to achieve it, for example a small script like the one sketched below that remembers which files have already been fed in. It is only simple when you feed the logs as they are created.
WebTrends, on the other hand, remembers the full path of every log file it has already processed. So you normally just point it at a folder and tell it to process everything there. That stays simple even when, today for instance, you ask it to process all the logs of this year.
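To make that concrete, here is the kind of small script I have in mind, as a Windows Script Host / JScript sketch. The folder paths, the awstats.pl location and the config name "mysite" are made up, and it assumes your Awstats accepts a -LogFile override on the command line; if not, point the LogFile directive of the config at each file instead.

// feedlogs.js - Windows Script Host (JScript) sketch, not production code.
// Remembers which log files have already been fed to Awstats and skips them next time.
var fso   = new ActiveXObject("Scripting.FileSystemObject");
var shell = new ActiveXObject("WScript.Shell");

var logFolder = "D:\\logs\\W3SVC1";        // placeholder path
var doneList  = "D:\\logs\\processed.txt"; // placeholder path

// Load the names of the files we have already processed.
var done = {};
if (fso.FileExists(doneList))
{
    var ts = fso.OpenTextFile(doneList, 1); // 1 = ForReading
    while (!ts.AtEndOfStream) { done[ts.ReadLine()] = true; }
    ts.Close();
}

// Feed each new file to awstats.pl, then write its name down.
var out = fso.OpenTextFile(doneList, 8, true); // 8 = ForAppending, create if missing
var files = new Enumerator(fso.GetFolder(logFolder).Files);
for (; !files.atEnd(); files.moveNext())
{
    var f = files.item();
    if (done[f.Name]) { continue; }
    // Assumes -LogFile can be given on the command line (hedge: check your Awstats version).
    shell.Run("perl C:\\awstats\\cgi-bin\\awstats.pl -config=mysite -update -LogFile=\"" + f.Path + "\"",
              0, true); // hidden window, wait for it to finish
    out.WriteLine(f.Name);
}
out.Close();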

B4 and B5 -> Equal
We cannot really tell, until we start using it seriously, whether Awstats is robust, whether good troubleshooting information is at hand, etc.
However, in case of a problem, as with other popular open source packages, I think we could find people on the net who already had that problem and overcame it.
With WebTrends, on the contrary, we cannot rely on the net. But if you have the support contract, the support guys are there to help you out of whatever problem.

In short, we did not see any significant gain from Awstats that we do not already have with WebTrends, so we decided to stay with WebTrends. We have been using it for some years already (though with some dissatisfaction), and thus have some level of understanding, know-how and experience with it.

Google Analytics

How many visitors do I have? It looks like blogger.com does not tell me that.

So I decided to try Google Analytics. Luckily, blogger.com does allow us to insert the required JavaScript snippet into our blogs.
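For reference, the snippet in question looks roughly like this (the classic urchin.js flavour; UA-XXXXXX-X is of course a placeholder for the account number Google Analytics gives you):

<!-- Google Analytics -->
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script>
<script type="text/javascript">
  _uacct = "UA-XXXXXX-X";  // your own account number goes here
  urchinTracker();         // records the page view
</script>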

So far (as anticipated) the number of hits I get on this site is really, really small. I need to put more interesting stuff here to draw your attention.

Tuesday, April 22, 2008

Cynthia Says

This morning I went to a workshop about making web sites accessible to people with disabilities. The presenter was Cynthia of http://www.cynthiasays.com/, apparently a very famous person in this area.

I have already worked on the topic a little; my boss asked me to go through the W3C standard and write a summary. It is about things like not using <table> to lay out the page, putting proper labels on all the input fields of your forms, etc.

So not many new things came out of today's session. Still, I picked up these interesting points from the Q&A.

PDF is an image, not readable by screen readers.
Adobe has made some improvements recently, but it is still basically a snapshot image of a document. This was really new to me; I had thought it was more like PostScript than a snapshot.

As you may have heard, the CAPTCHA technique is becoming an issue in this context. Our site uses it too.
Today she presented a couple of possible solutions, including putting an audio file next to it. Sounds good to me.

Friday, April 18, 2008

Massive SEO poisoning

The story started with this blog post: http://ddanchev.blogspot.com/2008/04/unicef-too-iframe-injected-and-seo.html
A guy in the organization I work for, who can be noisy at times, happens to be a subscriber of that feed and brought this to our attention.
I am the webmaster there, officially titled as such, so I have to do, or at least say, something when things are brought up this way.

I read the post; it was hard to understand. Frankly, I am still not sure I got the full idea.
SEO (I did not know the abbreviation) apparently stands for Search Engine Optimization. In this context it refers, in short, to the fact that search engines give higher rankings to pages from “high profile sites”.

IFRAME injection (I did not know this was getting that popular either) is basically injecting malicious content by exploiting well-known XSS (Cross Site Scripting) vulnerabilities.

So I told the guy that XSS had been looked into for our site, so we are safe. In reality you can never be entirely safe, but sometimes you need to be diplomatic, bureaucratic…

The one thing I still do not really get is how those injected URLs then get indexed by Google.
According to some posts I found on the net, the malicious guys publish millions of pages tagged with keywords, containing links to the injected URLs. The Google robot comes along, is tricked into thinking the injected URLs are mentioned in many places for those keywords, and indexes them with a high ranking because they come from a “high profile site”.

Thursday, April 17, 2008

FTP: Active vs. Passive

I keep forgetting this, but the occasions where I need to understand it keep coming back: like when I programmed a simple web-based FTP client (I expected to find a free AJAX control or something like that, but could not), or when I help our network guys with maintenance. So let me write a brief summary of the things worth knowing/remembering, so that I do not have to google it yet again…

Here is a really nice summary I found at http://www.cert.org/tech_tips/ftp_port_attacks.html:
“A client opens a connection to the FTP control port (port 21) of an FTP server. So that the server will be later able to send data back to the client machine, a second (data) connection must be opened between the server and the client.

To make this second connection, the client sends a PORT command to the server machine. This command includes parameters that tell the server which IP address to connect to and which port to open at that address - in most cases this is intended to be a high numbered port on the client machine.

The server then opens that connection, with the source of the connection being port 20 on the server and the destination being the port identified in the PORT command parameters.

The PORT command is usually used only in the "active mode" of FTP, which is the default. It is not usually used in passive (also known as PASV [2]) mode. Note that FTP servers usually implement both modes, and the client specifies which method to use [3].”

Then what is passive mode? This one, from http://slacksite.com/other/ftp.html#passive, is compact and quick to read but detailed enough:
“In order to resolve the issue of the server initiating the connection to the client a different method for FTP connections was developed. This was known as passive mode, or PASV, after the command used by the client to tell the server it is in passive mode.

In passive mode FTP the client initiates both connections to the server, solving the problem of firewalls filtering the incoming data port connection to the client from the server. When opening an FTP connection, the client opens two random unprivileged ports locally (N > 1023 and N+1). The first port contacts the server on port 21, but instead of then issuing a PORT command and allowing the server to connect back to its data port, the client will issue the PASV command. The result of this is that the server then opens a random unprivileged port (P > 1023) and sends the PORT P command back to the client. The client then initiates the connection from port N+1 to port P on the server to transfer data.”
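One small thing worth remembering from both quotes: the address and port travel as six comma-separated numbers h1,h2,h3,h4,p1,p2, both in the client's PORT command and in the server's 227 reply to PASV, and the actual port number is p1*256 + p2. A tiny sketch to decode it (the sample values are made up):

// Decode the "h1,h2,h3,h4,p1,p2" argument used by PORT and by the 227 reply to PASV.
function decodeFtpHostPort(arg)
{
    var n = arg.split(",");
    var host = n[0] + "." + n[1] + "." + n[2] + "." + n[3];
    var port = parseInt(n[4], 10) * 256 + parseInt(n[5], 10);
    return host + ":" + port;
}
// Example with made-up values: decodeFtpHostPort("192,168,1,10,195,149") -> "192.168.1.10:50069"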

Wednesday, April 16, 2008

Develop custom authentication module

Last week I enjoyed developing a DNN module specific to our own needs.
I do not know how best to describe it, but… DotNetNuke seems to be a rich framework that we can develop our own customizations on top of.

Among the facilities it exposes for our use is how easy it is to give a module its own properties.
Each module has an end-user interface and a configuration interface for the admins to set it up.
You design the configuration interface, with textboxes, checkboxes, etc. for the properties of your module. Then, in the code-behind class, you override one method which is called when the admin, after filling in those textboxes and checkboxes, clicks the link to save the configuration change.

I mean, I do not have to worry about any of this plumbing; the framework is there.
All you have to do is take one of the existing modules as a sample, change the design of the config and/or end-user user controls, and code the event handler method(s).

In addition, the framework provides functions to save the value given to a property into the database and to pick it up again later.

I do not know if all this is documented somewhere publicly available. I will let you know if I find it.