Reporting API For SiteCatalyst Released

By Gary Angel
Expert Author
Article Date: 2009-03-23

At last year's Omniture Summit - yes, the 2008 version - one of the most interesting announcements was the release of a Reporting API for SiteCatalyst. Like red wine, technology announcements generally require a certain amount of time to age into a respectable level of maturity, and the Omniture Reporting API was no exception.

But late last year, after playing on-and-off with the API, we at Semphonic decided it had reached a level of maturity where it could be successfully used.  It's early days - the API is still beta and its deployment model still a little fuzzy - but you can work with it reliably and produce interesting products.  In my first blog on this topic, I provided an overview of what an API is and some basic information on the Omniture APIs. In the second and third posts, I took up the topic of the Token System (which controls how much you can use the API and how expensive it is). In today's post, I'm going to talk about the basic elements of an API application. I'm going to wrap up next week with some thoughts on when using the API is likely to be a good option for your business.

Today's post is going to be moderately technical - but I'm going to try and keep the discussion appropriate for non-programmers. What I want to do is give you a feel for how the API works and what it can accomplish.

First, it's important to realize that the Omniture API is really six or seven different APIs. The oldest, and in some ways least developed, is the Data Insertion API. This API is really just a method of sending XML POST requests to Omniture in lieu of firing a tag. It can be pretty handy in certain circumstances, but it's really not even appropriate to call it an API. The other APIs fall into three basic buckets: administration, job submittal, and reporting.

The administration API is quite full featured and allows you to both access and update pretty much any administrative setting that is available in SiteCatalyst. Why would this be useful? Well, if you are working with a large number of report suites, it can be time-consuming to check or set an attribute on many tens, hundreds or even thousands of report suites. If you need to do that, it's probably worth investing the effort into an application that will do it for you.

The job submittal APIs are interfaces into the data warehouse and SAINT. They provide a mechanism for you to create and check on data warehouse requests or to create and submit SAINT requests. These job submittal APIs are very simple - often having only a few methods and relatively little potential for customization.

The reporting APIs are the richest and most interesting. There is the base Reporting API (which is what I'm going to talk about today) and there is another, fairly similar, API for Discover. These APIs provide access to most of the same data you'd be able to access - respectively - with SiteCatalyst or Discover.

Using the Reporting API typically involves six major steps:

[Figure "Apiblogimage1": diagram of the six steps, each covered below]

I'm going to quickly cover each.

Authentication is the process by which you tell Omniture via the API who you are and prove that you have a right to access the information. You don't use your standard login to authenticate with the API. You have to be set up in SiteCatalyst as an API user, and you must use SiteCatalyst to generate a SecretWord that you will attach to your requests. There's nothing particularly unusual or difficult about this. A word of warning, however. The authentication model Omniture has chosen is quite a bit more complex than that of many other web service APIs and is not well supported in Windows. So if you are a Windows shop (which Semphonic is), instead of trying to figure it out yourself, I strongly urge you to use the Sample Code that is provided on the Developer Web Site. I'd recommend the Daily Time Parting App as a good starting point for authentication code.

Request Definition is the next step and it's one of the most important. This is a place where it really helps to have some good SiteCatalyst knowledge to complement your programming staff. The Report Definition phase works very much like SiteCatalyst (almost identically when it comes to understanding what metrics are available, how trended reports work, stuff like that) - but instead of having multiple types of reports (one for paid search, one for natural search, one for referring sites, etc.), there are only a few basic types (ranked, trended) which you flesh out by specifying which metrics you want. It's really a nice way to manage the system, and it makes report definition pretty straightforward.

Here's a tiny snippet of Sample Code that will get the visit count for a Paid Search Report:

    // Start Report Definition
    reportDescription rd = new reportDescription();
    // Set the Date Range
    rd.dateFrom = oDate.StartDate;
    rd.dateTo = oDate.StopDate;
    // Set the Report Suite
    rd.reportSuiteID = reportSuiteID;

    // Create a list of reporting elements
    reportDefinitionElement[] rdDefElements = new reportDefinitionElement[1];
    // Set what type of report we want - in this case searchEnginePaid
    rdDefElements[0] = new reportDefinitionElement();
    rdDefElements[0].id = "searchEnginePaid";
    // Set how many rows we want returned and where to start
    rdDefElements[0].top = 200;
    rdDefElements[0].startingWith = 1;
    // Now set what we want back
    reportDefinitionMetric[] metrics = new reportDefinitionMetric[1];
    metrics[0] = new reportDefinitionMetric();
    // In this case, visits
    metrics[0].id = "visits";
    rd.metrics = metrics;
    // Add the list of elements to the report
    rd.elements = rdDefElements;


That's pretty much it. This will request the top 200 elements in the Paid Search Engine report and return a visit count for each.

Once you've built a report definition, you send it to Omniture. You have a choice of methods here depending on whether you want to just wait for the report or queue it. Normally, you'll want to queue reports so you don't lock up the interface, because requests can take a bit of time to process. On the whole, it's not much different (though a bit slower) than doing the same actions in SiteCatalyst.

Here's a sample:
   
    // Request the report
    reportQueueResponse response = ws.ReportQueueRanked(rd);

Once you've submitted a request, you need to wait and periodically check for it to finish. I wrote about a big gotcha in this process in my last post - and I'll just leave you to check out the code samples given there.
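For readers who just want the shape of that wait-and-check loop, here is a minimal sketch. It is not the code from the earlier post and it doesn't show the actual status call: `ReportPolling`, `WaitUntilReady` and the `isReady` delegate are hypothetical names, and you would wrap whatever status check the API gives you inside `isReady`.

```csharp
using System;
using System.Threading;

public static class ReportPolling
{
    // Calls isReady until it returns true or maxAttempts checks have been made.
    // isReady is a hypothetical stand-in for the real status check.
    // Returns true if the report became ready, false if we gave up.
    public static bool WaitUntilReady(Func<bool> isReady, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            if (isReady())
            {
                return true;
            }
            // Pause between checks, but not after the final one
            if (attempt < maxAttempts)
            {
                Thread.Sleep(delay);
            }
        }
        return false;
    }
}
```

Capping the number of checks keeps a stuck report from hanging the application forever. Pick the delay and the cap to suit how long your reports usually take.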

Once the report is ready, you get the returned results. It's really just one line of code that gets the results for a specific report (based on the reportID). Remember not to call this request until you've made sure that the report is actually ready by checking its status:

    reportR = ws.ReportGetReport(response.reportID);


Finally, you can loop through the returned data and process it or save it as you desire. Here's a bit of sample code that shows this process for the Paid Search visit count report defined above:

    // We are going to loop through the returned data and process every row
    for (int i = 0; i < reportR.report.data.Length; i++)
    {
        // This field will contain the dimension name - here the Paid Search Engine
        metricval = reportR.report.data[i].name;
        // This field will contain the metric - here the visit count
        visitCount = reportR.report.data[i].counts[0];
        // Here we save these values in our own array for processing
        rS.addValue(metricval, (int)visitCount);
    }

Once you've gotten all the results, you can do whatever else you need to do with them. One of the big advantages of an API application kicks in at this point. You can start to really apply matching and logic to the results you get back. Using the Excel Integration, we often build reports that match the key dimension (like referring site or search keyword) to its values over time. But while Excel has functions that facilitate this, they are a bit clumsy: they can't handle any fancy wildcard matching, and they don't let you do much logic around what to do with unmatched or multi-matched records. In a program, you can handle such issues easily - making it possible to produce much more robust reports.
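To make the matching point concrete, here is a small sketch of wildcard matching that treats unmatched and multi-matched values explicitly. Everything in it is illustrative rather than Semphonic's actual code: the `DimensionMatcher` class, the `(unmatched)` and `(ambiguous)` labels and the `*` pattern syntax are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DimensionMatcher
{
    public const string Unmatched = "(unmatched)";
    public const string Ambiguous = "(ambiguous)";

    // Turns a simple wildcard pattern ('*' = any run of characters)
    // into an anchored, case-insensitive regular expression.
    public static Regex ToRegex(string wildcard)
    {
        string escaped = Regex.Escape(wildcard).Replace("\\*", ".*");
        return new Regex("^" + escaped + "$", RegexOptions.IgnoreCase);
    }

    // Returns the group of the matching pattern(s), Unmatched if no pattern
    // matches, or Ambiguous if patterns from different groups match.
    public static string Classify(string value, IDictionary<string, string> patternToGroup)
    {
        bool found = false;
        string group = Unmatched;
        foreach (KeyValuePair<string, string> entry in patternToGroup)
        {
            if (!ToRegex(entry.Key).IsMatch(value))
            {
                continue;
            }
            if (found && group != entry.Value)
            {
                return Ambiguous;
            }
            found = true;
            group = entry.Value;
        }
        return group;
    }
}
```

With patterns such as `*google*` mapped to "Google" and `*.com` mapped to "Dotcom", a value like `www.google.com` comes back as ambiguous instead of silently landing in whichever group happened to be checked first.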

You can also add in additional information from other sources without having any manual steps. And, of course, you can do fancier logic than would be practical in Excel.

At Semphonic, we've been working on taking our Excel Traffic Models and using the API to source them. The Traffic Model looks at many different possible sources of traffic (referrals, direct repeats, direct new, PPC, SEO, campaigns, etc.) and analyzes how they are changing over time.

One of the really nice aspects of using the API is that the model can incorporate predictive models (the red line in the chart below, which is based on seasonality) and measures of statistical significance. Note that two of the identified changes in the model are tagged as not being significant given the level of variation observed in that particular measure:
 
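[Figure "Apiblogimage2": Traffic Model chart showing the predictive red line and the changes tagged as not significant]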
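The post doesn't spell out how significance is judged in the model, so the following is only a stand-in to show the idea of weighing a change against the variation seen in a measure. It uses a plain z-score against prior periods. The `ChangeSignificance` class and the threshold are assumptions, and there is no seasonality adjustment here.

```csharp
using System;
using System.Collections.Generic;

public static class ChangeSignificance
{
    // Flags the latest value as a significant change when it lies more than
    // 'threshold' sample standard deviations from the mean of prior periods.
    public static bool IsSignificant(IList<double> history, double latest, double threshold)
    {
        if (history.Count < 2)
        {
            return false; // not enough history to estimate variation
        }

        double sum = 0;
        foreach (double v in history)
        {
            sum += v;
        }
        double mean = sum / history.Count;

        double sumSquares = 0;
        foreach (double v in history)
        {
            sumSquares += (v - mean) * (v - mean);
        }
        double stdDev = Math.Sqrt(sumSquares / (history.Count - 1));

        if (stdDev == 0)
        {
            return latest != mean; // no variation at all: any change stands out
        }
        return Math.Abs(latest - mean) / stdDev > threshold;
    }
}
```

A noisy measure needs a much bigger swing than a steady one before it gets flagged, which is the behavior the "not significant" tags in the chart are meant to convey.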

We call this style of reporting Analytic reporting because it is designed to build our analytic models directly into the reporting. The idea is to make sure that the report consumer is only shown what's important and is protected from making many of the interpretation mistakes that are all-too-common in web analytics.

The API is a great vehicle for this type of reporting. By saving all of the information taken from each request, the model is instantly displayed when the time period has been run previously. And since previous months are saved, only new API requests are issued. This also makes it faster than most of our Excel models where we usually end up just re-generating the entire report. And, of course, the level of potential customization is much higher. In addition, the model can be seamlessly pointed at different report suites. You don't have to redo your Excel requests for each site.
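The saving idea can be sketched in a few lines. This is an in-memory illustration only: the `PeriodCache` class, the key format and the `fetchFromApi` delegate are hypothetical, and a real application would presumably persist the saved values to disk or a database so they survive between runs.

```csharp
using System;
using System.Collections.Generic;

public sealed class PeriodCache
{
    private readonly Dictionary<string, int> visitsByPeriod = new Dictionary<string, int>();
    private readonly Func<string, int> fetchFromApi;

    // Number of times we actually had to go to the API
    public int ApiCalls { get; private set; }

    // fetchFromApi is a hypothetical stand-in for the queue/wait/get sequence
    public PeriodCache(Func<string, int> fetchFromApi)
    {
        this.fetchFromApi = fetchFromApi;
    }

    // periodKey combines report suite and period, e.g. "suiteA|2009-02"
    public int GetVisits(string periodKey)
    {
        int cached;
        if (visitsByPeriod.TryGetValue(periodKey, out cached))
        {
            return cached; // already saved: no API request is issued
        }

        int fresh = fetchFromApi(periodKey);
        ApiCalls++;
        visitsByPeriod[periodKey] = fresh;
        return fresh;
    }
}
```

Putting the report suite in the key is what lets the same model be pointed at different suites without one site's saved numbers leaking into another's.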

Naturally, this comes with a price. It's still quite a bit harder for us to customize an API application than an Excel worksheet. Over time, as we build up a library of code, I expect that difference to narrow. And the API is not free - which hearkens back to the issues in my previous post on the Token system. But the opportunity for building creative and powerful applications is definitely there.

In my next (and final) post on the API, I'm going to try and give you a sense of when it would be appropriate to use the API and how much work you should expect it to be.


About the Author:
Gary Angel is the author of the "SEMAngel" blog - Web Analytics and Search Engine Marketing practices and perspectives from a 10-year experienced guru.


DevWebProCA is an iEntry, Inc.® publication - 1998-2009 All Rights Reserved Privacy Policy and Legal