hross: July 2007 Archives

What's in a Login Token?


Sitting here in the DC airport, waiting for a flight which seems to be indefinitely delayed, it struck me that I should be doing something constructive. In the past I've done quite a bit of mucking around with portal security, and one of the things I always wondered about was portal login tokens. So, just in case you happen to get stuck playing an ALUI version of Trivial Pursuit, here's what I know...

What is a login token?

Login tokens are basically session keys that are used to instantiate user sessions in the portal. Put another, less complicated, way: they enable logins to the portal without a password.

Where are login tokens used?

The server API contains methods for instantiating a user session from a login token. These tokens are passed to the API for validation and are used anywhere an automated method is necessary for authenticating a user across application boundaries.

That sounded complicated again... how about I just list their uses:

  • The automation server uses a login token to run jobs as different users.
  • The administrative server uses them to authenticate users who click the Administration link in the portal (this only happens if you have a separate administrative server).
  • The Default Profiles administrative utility and some custom impersonation spaces PCS has written use them to impersonate other users on the fly.
  • Remember my Password sets a cookie in your browser with a login token.
  • Login tokens are sometimes passed to portlets using the IDK, then used by the PRC to validate user sessions (see one of my previous posts).

How are login tokens built?

  1. Current UTC time in seconds is retrieved from the server.
  2. Token expiration is calculated by adding the token lifetime (in seconds) to the UTC time from step 1.
  3. A string is created that looks like: User ID + token delimiter + expiration (step 2) + token delimiter.
  4. A hash of (3) is taken, with the key being the Login Token Key generated when you push the 'Update' button in Administration | Portal Settings (read the directions before you push this).
  5. A token delimiter and a Base 64 encoded (4) are appended to (3) to create the token.

Here is an example login token (token delimiter is the | character):

1|1175181932|b7QDEolhC7kNUDjlAlZJ7xnKqSA=
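The five steps can be sketched in a few lines of Java. Treat this as an illustration under stated assumptions, not the portal's actual code: HmacSHA1 is only a stand-in, picked because it is a well-known keyed hash whose 20-byte output happens to match the length of the hash in the example token, and the real algorithm is not named here. The class name, the sample key, and the UTF-8 encoding of the key are hypothetical too. No extra delimiter is appended in step 5, since the string from step 3 already ends with one and the example token has exactly three fields.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of the token build steps; HmacSHA1 is an assumption.
class LoginTokenSketch {
    static final String DELIMITER = "|";

    // Steps 1-2: expiration = current UTC seconds + token lifetime in seconds.
    static long expiration(long nowUtcSeconds, long lifetimeSeconds) {
        return nowUtcSeconds + lifetimeSeconds;
    }

    // Steps 3-5: hash "userId|expiration|" with the key, then append the Base64 hash.
    static String build(int userId, long expirationSeconds, String loginTokenKey) {
        String payload = userId + DELIMITER + expirationSeconds + DELIMITER;
        try {
            Mac mac = Mac.getInstance("HmacSHA1"); // assumed algorithm
            mac.init(new SecretKeySpec(loginTokenKey.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] hash = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return payload + Base64.getEncoder().encodeToString(hash);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("keyed hash unavailable", e);
        }
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000L;
        // "123456789012345678" is a made-up 18-digit key, not a real Login Token Key.
        System.out.println(build(1, expiration(now, 300), "123456789012345678"));
    }
}
```

The output has the same shape as the example above (user ID, expiration, Base64 hash), but the hash will only match a real portal token if the algorithm and key encoding happen to be the same, which this sketch does not claim.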

Some notes on login tokens...

If you ever start digging through the documentation, or call up support with an automation or admin server login failure, the first thing you'll be asked is, "Are the times on the front and back end servers the same?" If the times are off by too much, the calculated expiration time of a login token will be wrong on the portal server and the login token will be invalid (expirations for login tokens are usually short in duration).
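To make the clock problem concrete, here is a small Java sketch of an expiration check. It assumes the three-field layout from the example token and that the validating server simply compares the expiration field with its own clock; the class and method names are hypothetical, and hash verification is deliberately left out.

```java
// Hypothetical sketch: why clock skew between servers invalidates a short-lived token.
class TokenExpirySketch {
    // Reads the expiration (second "|"-separated field) out of a token.
    static long expirationOf(String token) {
        String[] parts = token.split("\\|");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected userId|expiration|hash");
        }
        return Long.parseLong(parts[1]);
    }

    // The token is rejected once the validating server's clock reaches the expiration.
    static boolean isExpired(String token, long validatorNowUtcSeconds) {
        return validatorNowUtcSeconds >= expirationOf(token);
    }

    public static void main(String[] args) {
        String token = "1|1175181932|b7QDEolhC7kNUDjlAlZJ7xnKqSA=";
        long issuedAt = expirationOf(token) - 60;             // pretend a 60 second lifetime
        System.out.println(isExpired(token, issuedAt));       // clocks agree
        System.out.println(isExpired(token, issuedAt + 300)); // validator clock 5 minutes ahead
    }
}
```

With a 60 second lifetime, a validating server whose clock runs five minutes ahead of the issuing server sees the token as expired the moment it arrives.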

As far as whether or not the tokens can be reverse engineered by brute force, the answer is probably not (never say never, right?). The root key for token creation is 18 digits long, and the token itself is built using a well-known hashing algorithm (if you're curious, you'll have to sniff it out yourself, since I'm not sure I'm allowed to give that kind of info out). Since I don't know of any practical weaknesses in the algorithm, brute force seems an unlikely avenue for success. And since the key to the hash is stored in the database, if a malicious user can actually retrieve this key, you probably have bigger problems to worry about than your login tokens being spoofed.

More Than 5 Seconds on Caching


Back by popular demand: a more than 5 second overview of the portal's portlet caching mechanism. Apparently my last post, with its 5 second explanation of caching, left some people wanting more. As such, I'm back with a more than 5 second explanation of portlets, cache keys, and a somewhat practical portlet you can use to demo some of ALUI's caching behavior.

This post is mostly a (hopefully more comprehensible) rehash of the caching edocs. Without further ado, let's get into it...

Setting Up a Demo Portlet

Before I get into it, I want to show you some simple source for a JSP I wrote to test this behavior. At some point this type of thing may come in handy if you get confused. All I am really doing here is comparing a javascript timestamp (rendered on the client every time) with a server side timestamp (if the HTML is cached, this timestamp won't change). This JSP will also allow me to change HTTP headers using some variation of response.setHeader.

<%@ page language="java" import="java.util.*" pageEncoding="UTF-8"%>
<%
	// get current time for manipulation
	Date theTime = new Date();
%>

<p>This portlet tests the caching framework.
<script>var dt = new Date();</script>
<br><br>Javascript time: <script>document.write(dt.toString());</script>
<br><br>Server side time: <%=theTime.toString()%>
</p>

Caching - Step 1 - HTTP Headers

The first lines of defense in the portal for portlet caching behavior are the HTTP headers sent in a portlet response. There are a few headers you should be concerned about, and all of them will override any other cache settings configured in the portal except for the minimum cache time. If you don't send any headers, then the cache will default to the web service settings (see below) and use public caching (there will be one copy of the portlet in the cache for all users).

Here are the three major ones (anything I leave out is in the edocs):

  • Expires - This is the simplest method for caching HTML in the portlet cache. Set this header to an HTTP date (e.g. Sun, 15 Jul 2007 16:00:00 GMT) and the portal will check for new content at the time specified (and cache it until then).
  • Last-Modified - This header specifies the last modified date of content. Rather than setting an Expires, you can have the cache check for new content every time a request is made, but rather than make a full request, the portal will only check the HTTP header. If and only if this header changes will the cache be refreshed.
  • Cache-Control - This may be the most important header of all, since it turns on and off per-user caching (set it to private to cache portlets on a per-user basis -- see Step 3 for details) and allows us to turn off caching altogether (set it to no-cache).
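Here is a rough Java sketch of how the header values above might be produced before handing them to response.setHeader in the demo JSP. The class and method names are hypothetical; the only thing taken from the standard is the HTTP date layout, which is spelled out as an explicit pattern so the day of the month is always two digits.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper that builds cache header values for a portlet response.
class CacheHeaderSketch {
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
            .withZone(ZoneOffset.UTC);

    // Formats an instant as an HTTP date, e.g. for the Expires header.
    static String httpDate(Instant when) {
        return HTTP_DATE.format(when);
    }

    // Expires tells the cache how long the HTML is good for;
    // Cache-Control: private asks for one cached copy per user.
    static Map<String, String> headers(Instant expires, boolean perUser) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Expires", httpDate(expires));
        if (perUser) {
            headers.put("Cache-Control", "private");
        }
        return headers;
    }

    public static void main(String[] args) {
        Instant inTenMinutes = Instant.now().plusSeconds(600);
        // In the JSP, each entry would go to response.setHeader(name, value).
        headers(inTenMinutes, true).forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

To turn caching off instead, the same JSP would send Cache-Control with the value no-cache and skip the Expires header.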

If you're still curious, and edocs-averse, you can find a great article on HTTP caching here.

Caching - Step 2 - Web Service Settings

The second lines of defense in our portlet caching framework are the web service settings configured for a portlet (don't worry about the advanced settings just yet). These settings simply tell the portal when to start making requests to refresh the cache, and when it must make a request to refresh the cache. Here's what they look like:

[Screenshot: ws-cache-settings]

Remember, though, that the portal tries to respect the HTTP standard, so the HTTP headers we talked about in step 1 are still valid in between the maximum and minimum cache times. In fact, content might be cached longer than the maximum time if it uses the Expires header to indicate it is still valid.

Caching - Step 3 - Cache Keys

The third line of defense in portlet caching is the portlet cache key. This key indicates to the portal whether or not the data in the cache is bad. If the key changes, new content must be requested. If you want to think of it programmatically, the key is just a string that keeps getting appended with a bunch of data. Each time a portlet is put on a page, that string is built and associated with the portlet. If that string changes the next time a page is requested, the cache is invalid and a new copy of the HTML must be requested.

The key is built from two types of settings: those which are always part of the key, and those which are only part of the key if they are used in the portlet. How do we indicate whether or not they are used? Simple... when we choose to send them to the portlet via the web service settings, the portal automatically assumes we need to use them in the cache key. You can adjust these settings under the Advanced Settings option in the Web Service editor:

[Screenshot: advanced-settings]

By default, if any of the following things change, the cache key will become invalid and new content will always be requested:

  • Any type of setting passed to the portlet: Portlet settings, User settings, Community settings, Administrative settings, Session preferences, User Information
  • Locale
  • Actual URL to the portlet
  • Community
  • Content Mode (actually, I have no clue what content mode is, but it's in the documentation)

Additionally, the following things will affect the key if we check the boxes above:

  • Time Zone
  • Page ID
  • Community Security (ACL)
  • Experience Definition
  • Portlet Alignment
  • Activity Rights sent to the portlet

But What About...?

There are a few confusing things involving caching I'd like to point out here, since they initially confused me:

  • 302 Redirects - What if you use a response.sendRedirect or Response.Redirect? Will that screw up the caching in the portal? Nope... as long as the page you are redirecting to in the portlet doesn't change (remember our cache key is based on URI), the cache will cache whatever page you redirect to, instead of the redirect.
  • Javascript and Images - These are not cached for the portlet, rather they are cached by the gateway, which basically means they just follow whatever rules your browser would normally follow for the web server (if you set an Expires header for static content in Apache, the gateway will respect that - if not, then it will re-request the content).
  • AJAX - Any dynamic javascript requests will also lose the portlet based caching mechanism. The reason for this should be fairly obvious: the portal can't track extraneous requests on a per-portlet basis. Again, these will go through the gateway and your browser will be responsible for handling any HTTP headers.
  • Default Behavior - If no headers are specified, the cache will respect the web service settings and use public caching (not per user).

That's it. As always, I guarantee something in here is inaccurate. Have a great time messing around with portlet cache settings, but remember -- use them responsibly.

Greetings again from the vaguely abstract and always entertaining blogosphere. Since my last post, an attempt to lay out the basic workings of the EDK, I've been thinking it might be helpful to clarify some of the page construction methods in the portal. One of the frequent questions buzzing around in my head during my early days with the product was "How exactly does all of this page rendering stuff happen?"

First, you should note there is a great explanation of the caching mechanisms the portal uses for portlets in the edocs. The only problem is, the explanation is long, technical, and doesn't cover any of the other rendering mechanisms you are likely to encounter along the way. Thus, as a result of my laziness (see previous posts for details), the only things I have learned about portal caching and page construction have come from doing it the wrong way and suffering as a consequence. Hopefully some of you will benefit from my pain.

Second, this is my understanding of the rendering process, based on what I've seen and done with the portal. No one sneaked any blueprints into my office (oh wait, I don't have an office), so even though I'm pretty confident in this stuff, you'll have to excuse any possible technical inaccuracies.

The Whole Enchilada

Let me start with a diagram of my understanding of the page rendering process. Please note that I am not the Visio ninja some of my colleagues are...

[Diagram: portal_rendering]

What's going on here? As you can see, the above process is a bit convoluted. In an attempt to clarify, here is a step-by-step description of the page rendering process:

  1. User's browser makes an HTTP request to the portal host (IIS or a java application server)
  2. Portal host handles the request by parsing the URL and deciphering what portlets and/or server API code must be rendered on a page. Server API code could be the base portal header/footer/top bar, or it could be an intrinsic portlet.
  3. Portal host makes requests to the portlet cache for any portlets on the page. If the portlets are in the cache, and the cache entries are still valid (see below for a 5 second caching review, or the edoc section I mentioned earlier), cached HTML is pulled from the cache. Otherwise, requests are made (in parallel) to the various remote URLs to retrieve HTML content.
  4. Once all of the portlet content has been retrieved, it is assembled into a complete page. No server API code has been rendered at this point.
  5. The partial page compiled from all the remote portlets on the page is next run through a parser, which looks for any <pt> tags and renders them (note this only happens after the page is built).
  6. Finally, the rest of the server API code on the page is executed and the page is compiled into its completed form, then returned to the user in an HTTP response.
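For the programmatically inclined, steps 3 through 6 can be compressed into a toy Java sketch. Everything in it is hypothetical (the class, the plain map standing in for the portlet cache, the fetch and tagParser functions, the header and footer markers); it only mirrors the ordering described above: cache lookup first, parallel fetches for misses, tag parsing after assembly, server API output last.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Toy model of the page assembly order; not the portal's real implementation.
class PageRenderSketch {
    static String render(List<String> portletUrls, Map<String, String> cache,
                         Function<String, String> fetch, Function<String, String> tagParser) {
        // Step 3: reuse cached HTML, fetch cache misses in parallel.
        List<CompletableFuture<String>> parts = new ArrayList<>();
        for (String url : portletUrls) {
            String cached = cache.get(url);
            parts.add(cached != null
                    ? CompletableFuture.completedFuture(cached)
                    : CompletableFuture.supplyAsync(() -> fetch.apply(url)));
        }
        // Step 4: assemble the portlet content in page order.
        StringBuilder page = new StringBuilder();
        for (CompletableFuture<String> part : parts) {
            page.append(part.join());
        }
        // Step 5: tags are parsed only after the partial page is built.
        String parsed = tagParser.apply(page.toString());
        // Step 6: server API output (header, footer, ...) is added last.
        return "<header/>" + parsed + "<footer/>";
    }

    public static void main(String[] args) {
        String html = render(
                List.of("http://remote/a", "http://remote/b"),
                Map.of("http://remote/a", "<div>cached A</div>"),
                url -> "<div>fresh " + url + " <pt:name/></div>",
                page -> page.replace("<pt:name/>", "Administrator"));
        System.out.println(html);
    }
}
```

Notice that nothing in the sketch caches the header, footer, or tag output, which is the point made below about server API code running on every request.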

From the above diagram and accompanying description, you will hopefully note a few things:

  • The portlet cache only applies to remote portlets. Anything else is rendered at every page request, and must implement its own cache.
  • The tag framework, any intrinsic portlet, and any custom navigation scheme all rely on the same rendering method: the server API, hence they must always exist as libraries on the front-end portal server.
  • I am lumping all of ALUI's embedded application servers (Publisher, Collaboration, etc) into Remote Portlet Hosts because that is all they really are (I will address the frequently confusing pcs tags from Publisher later in this post).

So what's the main point? The main point is that if you are going to be developing a custom tag, intrinsic portlet, or custom navigation scheme you need to be hyper-sensitive to performance. Unless it is some sort of administrative feature, your server API code could wind up rendering on the home page of every user in your portal (or even on every page in the portal).

The other main point is that you can't use the tag framework unless you are writing a remote portlet.

5 Seconds on Caching

I don't want to beat a dead horse, since the portal's caching mechanisms are explained very well in the link I provided, but I think it might be worth my time to provide a 5 second summary of that document. In fact, what most people want to know about how caching works is summarized there, albeit at the end of the document, so I'll just repeat it here:

  • The Portal Server never makes a request to the remote server before the Minimum Cache Time if there is content in the cache. (In version 6.0, the portlet cache is limited to 15 minutes, so a request will always be made after 15 minutes.) Multiple requests made for the same portlet with identical cache keys within this minimum time always receive cached content. As noted earlier, setting the Cache-Control header to “no-cache” overrides editor caching settings; content will not be cached.

  • The Portal Server always makes a request to the remote server after the Maximum Cache Time. Cached content might or might not be returned, based on other information (i.e., the Last-Modified header).

  • The Portal Server might or might not make a request to the remote server if content has been cached in between the Minimum and Maximum Cache Time. The Portal Server observes programmatic caching (i.e., the Expires header) in the window between the minimum and maximum times.

If you've done any portlet development or configuration, the above should make quite a bit of sense to you.
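If it helps, the three rules boil down to a small decision function. The Java sketch below is a simplification with made-up names; in particular, what happens between the minimum and maximum times when no Expires header is present is an assumption on my part (the sketch makes a request), since the rule only says the portal "might or might not".

```java
// Simplified model of the min/max cache time rules; names are hypothetical.
class CacheWindowSketch {
    enum Decision { SERVE_CACHED, REQUEST_REMOTE }

    static Decision decide(boolean hasCachedContent, long ageSeconds,
                           long minCacheSeconds, long maxCacheSeconds,
                           boolean expiresHeaderStillValid) {
        // Nothing cached (first hit, or the portlet sent no-cache): must ask the remote server.
        if (!hasCachedContent) {
            return Decision.REQUEST_REMOTE;
        }
        // Rule 1: never request before the Minimum Cache Time.
        if (ageSeconds < minCacheSeconds) {
            return Decision.SERVE_CACHED;
        }
        // Rule 2: always request after the Maximum Cache Time.
        if (ageSeconds >= maxCacheSeconds) {
            return Decision.REQUEST_REMOTE;
        }
        // Rule 3: in between, programmatic caching (the Expires header) decides.
        // Requesting when no valid Expires exists is an assumption of this sketch.
        return expiresHeaderStillValid ? Decision.SERVE_CACHED : Decision.REQUEST_REMOTE;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, 30, 60, 600, false));   // inside the minimum window
        System.out.println(decide(true, 120, 60, 600, true));   // between min and max, Expires still valid
        System.out.println(decide(true, 900, 60, 600, true));   // past the maximum
    }
}
```

A request made after the maximum time can still end with cached content being returned (for example when Last-Modified shows nothing changed); the sketch only models whether a request is made.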

What about Publisher?

The final, and often misunderstood, piece to this puzzle is Publisher. A lot of times customers want to know where, when and how Publisher's own rendering framework fits into the portal. And actually, it's pretty simple once you know how it works.

The best way to think of Publisher is to think of it as two pieces: a stand-alone static content host (static HTML and images only), and a publishing product. When a piece of content is published in the publishing product, that is the only time when anything intricate or cool happens. The rest of the time (for instance, when a Publisher portlet is put on a page), Publisher is just serving up a static HTML file with some images or javascript. That's it (well, except for the redirect part, but I'm trying to make a point).

As I mentioned previously, people often confuse the when of Publisher's custom tag rendering scheme (<pcs> tags, like those found in the News Portlet). Those tags are only rendered once: when a document is published. After that, they have been turned into static content for portlets to request. Thus, when I talk about the portal's rendering process, these tags become a moot point. They are never rendered at run time.

Hopefully this clears up some of the confusion.

Something Helpful

At this point I'm hoping you have a complete picture of the rendering process. If not, I'll settle for a partial understanding with a false sense of confidence.

As I have said in a few of these posts, this information is smattered about the documentation, source code, and in some cases, exists only in the late night conversations between consultants who are too obsessed with work to make small talk about sports and the weather (Keith, when are you going to start blogging?). My goal is to give you something in these posts that at some point or another I wished I had read instead of experienced.

How does the IDK work?


Last time I promised a more in-depth technical post than my previous ruminations on Ensemble. I'm not one to go back on my word, so I give you my latest airport-delay-inspired technical breakdown... an analysis of the AquaLogic IDK. The point of this exercise will be to show you both how I initially broke down the IDK, and how it functions under the covers. This should, at the very least, make you a hit at most cocktail parties.

What is the IDK?

The IDK/EDK, or whatever else we're calling it these days, is essentially a two headed monster: a method for retrieving information from the portal when a custom application renders a portlet (the IDK portlet API), and an object model for manipulating the portal and its associated services (the PRC or Remote API). In other words, the portlet API is a bunch of information about the current state of the portal, and the PRC is a way to change the portal programmatically.

At the end of the day, even though I've broken them down, both of these pieces ride under the auspices of the IDK. For the first part of this article, I'm going to delve into the portlet portion of the IDK, after which I'll work my way into the operation of the PRC.

How does the IDK work?

In a nutshell, whether you are developing a portlet in .NET or Java, the IDK is simply a set of libraries you can include in your web project. They are either .NET DLLs or jar files which use the com.plumtree namespace to expose portal features.

Under the covers, the IDK's operation is actually pretty simple. The IDK portlet API is a set of objects instantiated inside a .NET or Java based page. The objects themselves are constructed using the factory pattern, and are really just wrappers for a typical .NET or Java HTTP request/response object.

The second portion of the IDK (the PRC) is a set of object mappings to web services which are hosted on the portal API server. This is the portion of the product you install which usually has a URL similar to http://localhost:11904/ptapi/services/QueryInterfaceAPI. This URL can be found in the Administrative Utilities | Portal Settings | Portal URL Manager | Soap Server Url portion of the portal. Later, I'll explain a bit more about its configuration and use. Before I get there, let's take a step back and look at how the IDK works under the covers...

Creating a portlet to capture HTTP Headers

One of the first things I did way back when I wanted to see what was going on with the portal, was to create a portlet to display HTTP headers. The IDK itself pretty much uses HTTP headers as its information transfer mechanism, so if you want to see what's really going on, this is a good place to start. For those familiar with web development, the exercise should be pretty elementary. I'm going to go with a .NET example for this particular post. Below, you'll find source for the Page_Load method of my Print.aspx page, which is basically just a header dump to the HTTP response:

private void Page_Load(object sender, System.EventArgs e)
{
	// Timestamp so you can tell whether the portal served this HTML from its cache
	Response.Write("<p><br><i>Last Update: " + DateTime.Now + "</i></p>\r\n");
	Response.Write("<hr>");

	Response.Write("<p><b>HTTP Headers</b></p>\r\n");
	Response.Write("<table style=\"margin: 6px\" cellspacing=\"1\" cellpadding=\"2\" border=\"1\">");
	Response.Write("<tbody>");

	// One table row per request header: name on the left, value on the right
	foreach (string key in this.Request.Headers.AllKeys)
	{
		// HTML-encode both parts so header contents cannot break the markup
		Response.Write(
			"<tr><td style=\"padding-right: 15px; padding-left: 5px; font-style: italic\">"
			+ Server.HtmlEncode(key)
			+ "</td><td style=\"padding-right: 15px; padding-left: 5px\">"
			+ Server.HtmlEncode(this.Request.Headers.Get(key))
			+ "</td></tr>\r\n");
	}
	Response.Write("</tbody></table>\r\n");
}

I don't really want to get into portlet setup in this post, since this is an elementary step in the development process, and is covered pretty extensively in the edocs. Basically, though, I created a Remote Server, Web Service and Portlet to point to the Print.aspx page on my laptop. After that I simply added it to a My Page and took a look at the results:

HTTP Headers

Accept text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Charset ISO-8859-1,utf-8;q=0.7,*;q=0.7
Accept-Encoding gzip
Accept-Language en-us
Host localhost:80
Referer http://localhost/portal/server.pt?in_hi_userid=1&space=MyPage&
parentid=0&cached=false&control=SetPage&PageID=0&parentname=Login
User-Agent Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
CSP-Protocol-Version 1.3
CSP-Can-Set Gadget-User,User
CSP-Gateway-Specific-Config PT-User-Name=Administrator,PT-User-ID=1,
PT-Stylesheet-URI=http://localhost/imageserver/plumtree/common/public/css/mainstyle2-en.css,
PT-Hostpage-URI=http%u003A%u002F%u002Flocalhost%u002Fportal%u002Fserver
%u002Ept%u003Fopen%u003Dspace%u0026name
%u003DMyPage%u0026id%u003D2%u0026cached%u003Dtrue%u0026in%u005Fhi%u005Fuserid%u003D1%u0026control%u003D
SetPage%u0026PageID%u003D222%u0026,
PT-Community-ID=0,PT-Gadget-ID=255,PT-Gateway-Version=2.5,
PT-Content-Mode=1,PT-Return-URI=http://localhost/portal/server.pt/gateway/PTARGS_16_0_0_0_43/a?b=c&
,PT-Time-Zone=America%u002FNew%u005FYork,
PT-Imageserver-URI=http://localhost/imageserver/,PT-User-Charset=UTF-8,PT-Page-ID=222,
PT-SOAP-API-URI=http://localhost:11904/ptapi/services/QueryInterfaceAPI,
PT-Portal-UUID={5eaaa679853b9ecc-10b9583e62f0},PT-Class-ID=43,PT-Guest-User=0
CSP-Aggregation-Mode Multiple
CSP-Gateway-Type Plumtree

There is plenty I could say about the information passing mechanisms between the portal and its remote portlets, but I am going to try to keep my analysis in the context of the IDK. Other details can be left to future posts, or the minds of my readers.

Analyzing IDK Headers

You'll notice a couple of things here, most important of which is the CSP-Gateway-Specific-Config. This header contains most of the settings which are sent to a portlet via the portal. In fact, when you instantiate an instance of the IDK in your portlet, the method calls which retrieve information will usually provide it via this header. Does this mean you could potentially read the information directly out of the header? Of course. Would it be supported by BEA? Of course not. I'd be willing to bet Chris Bucchere is doing something similar with bdg's PHP and Ruby IDKs.
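Just to show how little magic is involved, here is an unsupported Java sketch that pulls settings straight out of a CSP-Gateway-Specific-Config value like the one in the dump above. It rests on assumptions drawn only from that dump: settings are comma-separated name=value pairs, the name ends at the first equals sign, and special characters inside values (including any literal commas) arrive as %uXXXX escapes. The class and method names are made up, and none of this is part of the IDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, unsupported parser for the CSP-Gateway-Specific-Config header value.
class GatewayConfigSketch {
    private static final Pattern ESCAPE = Pattern.compile("%u([0-9A-Fa-f]{4})");

    // Turns %uXXXX escapes back into characters, e.g. %u002F becomes "/".
    static String decode(String value) {
        Matcher m = ESCAPE.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Splits the header on commas, then each pair on its first '='.
    static Map<String, String> parse(String headerValue) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String pair : headerValue.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty or malformed fragments
            }
            settings.put(pair.substring(0, eq).trim(), decode(pair.substring(eq + 1).trim()));
        }
        return settings;
    }

    public static void main(String[] args) {
        String header = "PT-User-Name=Administrator,PT-User-ID=1,"
                + "PT-Time-Zone=America%u002FNew%u005FYork,PT-Page-ID=222";
        parse(header).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

In a portlet you would feed this the value of request.getHeader("CSP-Gateway-Specific-Config"), though the supported route is still to let the IDK do the reading for you.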

The next thing you can do, to view more information, is to enable advanced features of a web service and see what happens to your portlet headers. For instance, when I turn on Send Login Token to Portlets under the Advanced Settings portion of the Web Service Settings, I see a new header called CSP-Session-Token, which contains an odd sort of token value (which is a Login Token).

In fact, both the login token I mention above, and some of the information in the CSP-Gateway-Specific-Config can be used to manipulate a more advanced feature of the IDK, the PRC. If you're only interested in the IDK portlet API, feel free to tune out here. Otherwise, keep reading...

How does the PRC Work?

When you or your administrator initially configured the portal (install/setup), one of the components which was installed was the portal API server (SOAP server, IDK host, whatever else it may be called). This server is actually just a web services based host for the PRC. The server itself abstracts back-end methods for connecting up to any of the various ALUI products you can manipulate via the IDK (Collaboration, Publisher), as well as a sort of back-end connection for the portal itself.

The server URL is actually configured as I enumerated above, in the administrative settings for the portal. The PRC can then be instantiated in one of two ways:

  1. Explicitly starting up a session using your favorite development environment and the RemoteSessionFactory.getExplicitLoginContext method (see the edocs again)
  2. Instantiating a PRC session implicitly, inside a portlet

Creating an explicit session can actually be done from a console application, thick client, or even Windows service. All that is required from any IDK program utilizing the PRC is that it has connectivity to the IDK URL mentioned under the How does the IDK work? section of this post.

IDK Authentication

If you utilize the IDK inside your portlets often enough, you'll notice certain methods will throw an exception if you don't configure your web services to pass a login token to your remote portlet (see above). You may also notice these are part of the com.plumtree.remote.prc namespace. What gives?

What's happening is the portal is actually re-authenticating a user when the PRC is first instantiated. Want to manipulate user/portlet/web service information? Guess what... the portal has to know who you are, and knowing that you're a portlet just isn't good enough (otherwise, anyone could manipulate the portal back end via the API web service).

That's where login tokens come in. They are used to tell the portal your user ID and session expiration time. They're generated by the portal itself, then passed to a remote portlet (you must explicitly tell the portal to pass a login token by configuring a web service's advanced settings). When you execute a method in the PRC, it decodes the login token your portlet passes to it (via the IDK) and makes sure you are authorized to execute that method.

Of course, this adds the extra step of authentication and session instantiation. What this means to you, the portal developer, is that calls become more expensive if you use the PRC.

Performance Considerations

The IDK portlet API (the part of a portlet where you are just requesting information about the user, portlet and environment) requires little overhead from the portal, since as we saw in the header analysis, all of this information is provided to a portlet without any kind of round trip to the API server. Thus, if you expect heavy traffic in your portlets, you would be well advised to stick to header information only (this is where looking at HTTP headers can sometimes pay dividends).

If you want to use the PRC, you should be aware that a separate session is going to be instantiated with the IDK, a session which comes with all of the same baggage as logging into the portal (user login, authentication, etc). Not only that, but if you decide to use a method which is non-native to the portal (anything which uses the search server, collaboration, publisher, etc), then a separate call must be made from the IDK host to the service you are manipulating. Since these services aren't always on the same box as the API server, you could be in for some additional transport overhead.

For instance, let's say you write a portlet that uses the PRC to perform an administrative search for portal users. A typical use case might look something like this:

  1. User hits page in portal, request gets sent to your portlet.
  2. Your portlet reads HTTP header information to display the user's name back to them in the portal.
  3. User types a name to search for and hits the 'Search' button.
  4. Your portlet reads the HTTP header to retrieve a login token.
  5. Your portlet reads the HTTP header to retrieve the API server URL.
  6. Your portlet sends a request to the API server URL to log in using the token.
  7. The API server decides whether or not the user who hit your portlet can log in.
  8. Your portlet sends a request to the API server to search for a user with the string that was entered in your search box.
  9. The API server decides whether or not the user of your portlet can perform administrative searches.
  10. The API server sends a search request to the search server using the string passed in.
  11. The search server returns the search results to the API server.
  12. The API server returns the search results to your portlet.
  13. Your portlet returns the search results to the portal.
  14. The portal returns your portlet HTML to the user as part of a larger page.

As you can see, there are a lot of back-end calls above that are completely abstracted when you utilize the IDK. If you're not aware of what's going on, you can get yourself in a lot of performance-related trouble.

The Final Word...

Some of the information I enumerated above is covered pretty extensively in the documentation (see links above), but it's not always clear what is happening or why it is happening. Hopefully, this post will give you a clearer understanding of the under-the-covers mechanisms that power the IDK, and serve as a useful supplement to the IDK documentation. As always, I am sure there are things I've left out or missed that will be covered in the comments.

About this Archive

This page is an archive of recent entries written by hross in July 2007.
