The dangers of ROI based on low conversion rates

You have a large ecommerce website. You want to make small incremental improvements to its performance, and you can measure the impact via an increase in profits. Everything sounds pretty simple. Just run small experiments on everything from the user experience to pricing to pay-per-click ads. When you see something working, do more of it. If things aren’t working, try something else.

This is age-old marketing know-how. I’ve seen this approach used in direct marketing since the start of my career. This is the beauty of digital: we can measure everything, not like stodgy old media. But are these assumptions true?

Let’s consider a simple model. The experiment could be anything from a new online ad campaign, to an A/B test around button positioning, to a good old-fashioned bit of discounting. For the purpose of this discussion it doesn’t matter. We have a large customer base. We measure success based on influencing the customers’ behaviour. We can expect a very low conversion rate. We also have a low total cost for the experiment.

We begin with a big cohort of customers. We then split them into those we were able to positively influence, and those we didn’t influence (or even had a negative effect on). The second group were either never going to buy, or were going to buy anyway. In each group we then consider the accuracy of our measurements: are the results we measure true or false?

This gets confusing really quickly, so please stay with me.

When calculating our ROI we need a count of all the positives. This count is made up of two types: the true positives (people we correctly measure as being influenced by our actions) and the false positives (people who weren’t influenced but who, because of inaccuracy in the measurement methods, we think were).

Let’s assume we have a cohort of 100,000 customers and a 1% measurement error rate (i.e. 1% of customers end up in the wrong bucket, whichever group they really belong to). Let’s also assume the true influence rate is 5%.

  • True Positives = 100,000 x 5% x 99% = 4,950
  • False Negatives = 100,000 x 5% x 1% = 50
  • True Negatives = 100,000 x 95% x 99% = 94,050
  • False Positives = 100,000 x 95% x 1% = 950

So our test results give the following.

  • Measured positives = 4,950 + 950 = 5,900, an 18% overestimate of the 5,000 customers actually influenced
  • Measured negatives = 94,050 + 50 = 94,100, roughly the expected 1% error

This is pretty worrying. We could easily be making a decision based on a measured ROI of 20% when, with a measurement error rate of just 1%, the true result is break-even.
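To make the arithmetic explicit, here is a minimal C# sketch of the calculation above. It is my own illustration rather than anything from a real campaign; the cohort size, influence rate and error rate are the assumptions already stated.

using System;

class FalsePositiveDemo
{
   static void Main()
   {
      double cohort = 100000;
      double influenceRate = 0.05;  // true proportion of customers we influenced
      double errorRate = 0.01;      // proportion misclassified, in either direction

      double truePositives  = cohort * influenceRate * (1 - errorRate);        // 4,950
      double falseNegatives = cohort * influenceRate * errorRate;              // 50
      double trueNegatives  = cohort * (1 - influenceRate) * (1 - errorRate);  // 94,050
      double falsePositives = cohort * (1 - influenceRate) * errorRate;        // 950

      double measuredPositives = truePositives + falsePositives;  // 5,900
      double actualPositives = cohort * influenceRate;            // 5,000

      Console.WriteLine("Measured positives: {0}", measuredPositives);
      Console.WriteLine("Overestimate: {0:P0}",
         (measuredPositives - actualPositives) / actualPositives);  // ~18%
   }
}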

False-Positive Paradox

Let’s consider some real world examples and some possible strategies for avoiding this effect (known as the False-Positive Paradox).

The first is a pay-per-click campaign. Here we pay per click, and tracking purchases is pretty straightforward: most analytics tools give you revenue figures. However, it is going to be hard to measure definite cause and effect unless we adopt a more scientific approach. Ideally we would have a pre-defined cohort of users to whom we show the advert; we can then measure real influence by comparing users who find the site organically against those clicking on the ad. Given most reporting tools don’t do this, I’d argue the error rate here is much higher than the 1% in our illustration. Ideally use cohorts; if you can’t, make sure your ROI bar is raised high enough to lift you out of danger.
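As a rough sketch of what cohort-based measurement looks like (illustrative numbers of my own, not real campaign data): only the difference between the ad cohort’s conversion rate and a control cohort’s rate counts as influence.

using System;

class CohortUplift
{
   static void Main()
   {
      // Two equal-sized cohorts: one shown the ad, one not (the control).
      double adCohort = 50000,      adConversions = 1100;
      double controlCohort = 50000, controlConversions = 1000;

      double adRate = adConversions / adCohort;                  // 2.2%
      double baselineRate = controlConversions / controlCohort;  // 2.0%

      // Crediting all 1,100 conversions to the campaign vastly overstates it;
      // the genuine influence is only the uplift over the baseline.
      double influenced = (adRate - baselineRate) * adCohort;    // ~100 customers
      Console.WriteLine("Genuinely influenced customers: {0}", influenced);
   }
}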

Next let us consider A/B testing of a new design. In this case we are running an experiment using a typical JavaScript-based tool, looking for justification for some change to the platform, so it is the cost of the proposed platform changes we need to weigh against the result. Here we can expect a slightly more scientific approach from the start: we are running the test in parallel, which removes a lot of noise from the results (for example, a sale starting during the test period will have the same effect on both groups). However, unlike the campaign example, our measurement is not based on absolute sales. We’re looking for a shift in buying patterns: the customers who wouldn’t have bought with the old design (A) but do buy with the new design (B). So unless you have a landslide victory with the new design, take care.

The final example is implementing personalisation logic. In this case we are segmenting our customer base and showing different content to a certain group. Again, if carried out using A/B logic the results are more scientific. However, the analysis of these kinds of rules will generally only show the sales figures for the segmented group and any uplift in this group against the norm. If the rate of influence is low then we can expect errors. To avoid this, personalisation rules should again aim for much higher influence rates. In a word, keep it simple: creating multiple highly targeted rules based on non-cohort analysis may be unwise.


The Roll the Dice Once Fallacy

It was Stephen Hawking who noted,

The laws of science, as we know them at present, contain many fundamental numbers, like the size of the electric charge of the electron and the ratio of the masses of the proton and the electron. … The remarkable fact is that the values of these numbers seem to have been very finely adjusted to make possible the development of life.

What concerns me is the claims people make based on these facts, whether it is intelligent design, the hard Anthropic Principle or any claim about the likelihood of life as we see it coming into being.

All suffer from a very simple logical flaw known as selection bias. When any discussion about our place in the universe starts using extremely large or small numbers to make a point, you see this bias in action. The truth is we only have one data sample with which to make such claims: a sample in which intelligent life emerged, on one planet, with one species skilled enough to make suitable observations. We simply know nothing about the situations where life didn’t emerge, or where it emerged differently. Any argument that uses probability with only one data point is flawed.

Imagine trying to make claims about the behaviour of a simple six-sided dice when you have been able to throw it just once. For argument’s sake, on that one throw you get a six. We’d be told the chances of rolling a six are small, just 1 in 6. Behold: the more we examine the dice, the more it is clear the only outcome we ever observe is a six. We’d analyse the video of the dice bouncing and tumbling across the table. The improbable events that led to our seeing a six would be discussed at length. External forces would be called into question. Some would say the dice was loaded.

Of course eventually there would be calls to roll the dice again, but with life we can’t do that. We can’t change the starting conditions and see what happens. We can speculate, but in the same way we’d try to model the tumbling dice, the smallest changes in parameters will have a large effect on the outcome.

So when you see an argument based on probability and our place in the universe call bullshit. We have only been able to roll the dice once.


Antifragile: Measuring Complexity

This post forms part of a series I’m writing following the two-day course I attended in Boston, where Yaneer Bar-Yam spoke alongside Nassim N. Taleb.

It might be easier to think about the simple first. What makes something simple? A square is simple. It has 4 straight sides, all the same length and at right angles to each other. Perhaps not the most concise description but you get the idea. Simple things don’t need many words to explain them.

Yaneer asserts that it is the length of the description of an object that can be used as a measure of its complexity. There is a study showing that, roughly speaking, the amount of information conveyed by a sentence is equal to the number of bits required to store it (once you take vocabulary and grammar into account). In science these kinds of order-of-magnitude comparisons are easily made. Some may find the ambiguity worrying.

It also turns out that the medium used for the description is not important. A photo or video is just as good, and the information total can be calculated in the same way. This makes even more sense once compression algorithms have done their magic. I guess any data can be used to provide the description. There are clear parallels here with information theory and the idea of Shannon entropy. Entropy came up a few times during the course; however, because it is generally taken to be an absolute measure based on the state of the atoms in the system, it isn’t general enough for this discussion.

So when describing the state of a system, the description needs to be long enough to distinguish between all possible states. If you have a simple system like a traffic light, the state can be expressed simply as red, green and so on. Something with more possible states, like a given position in a game of chess, will require more data. This is fine and fits well with Shannon entropy.
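As a back-of-the-envelope illustration (my own sketch, not from the course): to pick out one state from N possibilities you need roughly log2(N) bits, so the description length grows with the number of distinguishable states.

using System;

class DescriptionLength
{
   // Minimum number of bits needed to distinguish one state among `states` possibilities.
   static double BitsRequired(double states)
   {
      return Math.Log(states, 2);
   }

   static void Main()
   {
      Console.WriteLine(BitsRequired(3));     // traffic light (red/amber/green): ~1.6 bits
      Console.WriteLine(BitsRequired(1e43));  // a commonly quoted rough order of magnitude
                                              // for chess positions: ~143 bits
   }
}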

What Yaneer does next is, for me, a bit of genius. Like Einstein, who moved away from absolute measures and made things relative to the observer, Yaneer does the same for the description of the system. This means the perspective you take when looking at a system changes the length of the description required. Yaneer uses the term scale to describe this effect. The complexity of a system isn’t an absolute measure.

For example, let us consider a gas. At one level the information required to describe the gas will be equal to its classical entropy, based on the positions and velocities of all the particles: high entropy, lots of data required to describe the system, and so very complex. It is hard to predict where any given particle will be. However, if we move away from the micro towards the macro state of the gas then it becomes simple. A few variables such as density, temperature and pressure are enough to describe the gas at this scale. It is also easy to predict what will happen over time.

The purpose of this post, and the course itself, is to consider how complexity forces us to reconsider how we work with organisations of people. Consider a Roman attack formation. At the small scale this looks complex and would take discipline and training to execute correctly; at the large scale, however, it makes things simpler for the commanding officers. It is possible for one person to direct and control the outcome of a battle.

Contrast this with the way the Viet Cong fought a guerrilla war against the overwhelming firepower of the US and South Vietnamese forces. In that case the tactics that had worked since Roman times failed catastrophically. The hypothesis given by Yaneer is that the complexity on the ground meant it was not possible for a small group of commanders to give orders that could result in success. When using a command-and-control structure, the complexity of the problem you can solve is limited to the complexity the person in control can handle.

If you want to read more on this topic then Yaneer’s book “Making Things Work” is a great read.


Antifragility: A User’s Manual

I’ve just returned from a two-day program for senior management run by the NECSI as part of the Executive Education programme.

The program’s strapline was ‘Learn to thrive in a volatile and complex world by creating Antifragile organizations that thrive on stress and disorder’ or put slightly differently ‘When strong winds blow, don’t build walls, but rather windmills: there is a way to turn every bit of adversity into fuel for improvement.’

The two days were hosted by Nassim Nicholas Taleb and Yaneer Bar-Yam.

These posts are my take on the ideas presented. They are in no way accurate or complete, and where possible I’m using examples from my own context rather than the examples given by Nassim or Yaneer.

Concepts

Strategies

  • Bar Bell
  • Evolution
  • Small is Beautiful
  • Skin in the game

Antifragile: Black Swans

The term black swan was originally used to describe something impossible: back then, no one had ever seen such a beast. Later, when one was found in Australia, the meaning changed. It came to describe the thing that was once thought impossible but is now known to be true.

Taleb defines a Black Swan event as having three attributes:

First, it is an outlier, as it lies outside the realm of regular expectations, because nothing in the past can convincingly point to its possibility.

Second, it carries an extreme ‘impact’.

Third, in spite of its outlier status, human nature makes us concoct explanations for its occurrence after the fact, making it explainable and predictable.

So why is this important? The human-nature part can be explained by hindsight bias; it is the underlying assumptions we make about probability that lead to a different perspective on the world.

Our intuitive understanding of probability is generally based on the normal distribution (aka the Gaussian distribution or bell curve). Let’s take the height of people as an example. We generally expect people’s heights to cluster around some average.

The chance of seeing something a long way from the average drops very quickly (governed by something called the standard deviation). In our height example, the chance of seeing someone who is five times the average height is pretty much zero. You don’t need to see many people to make a good guess at the average. If the normal distribution were the only probability distribution out there in the real world, then we could simply position any extreme event at the edge of the curve and relate the possibility of the event back to the average (which is easy to calculate from a few observations).

It is clear, then, that Black Swan events don’t follow a normal distribution. They have a distribution where the most commonly occurring events can easily be mistaken for a normal distribution, but where extreme events occur much further from the average. This time let’s consider the size of meteorites. Unlike people’s heights, meteorite sizes don’t cluster around an average; they follow a power-law distribution. Taleb called these thick-tailed distributions (making the point that ‘long tail’ wasn’t the correct term). It doesn’t take much imagination to put large meteorite strikes in the Black Swan category.

Taleb then went on to characterise the two opposing environments as Mediocristan and Extremistan: Mediocristan being the normal, bell-curve world, and Extremistan being the world of Black Swans, power laws and thick tails. Read the book to get more on these.

A good example is to compare salaries with heights. Say we take 100 random people from the planet’s population and calculate an average height and an average salary. If we find the tallest person in the world and bring them into the room, the average height will go up by a couple of per cent at most. If we then invite Bill Gates into the room, the average salary will go ‘through the roof’. Height belongs to Mediocristan and salary to Extremistan.
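A toy simulation of my own (not from the book) makes the same point: one extreme observation barely moves the mean of a normally distributed sample, but blows up the mean of a heavy-tailed one. The specific distributions and numbers are illustrative assumptions.

using System;
using System.Linq;

class MediocristanVsExtremistan
{
   static void Main()
   {
      Random rng = new Random(1);

      // Mediocristan: 100 heights in metres, roughly normal around 1.7m.
      double[] heights = Enumerable.Range(0, 100)
         .Select(i => 1.7 + 0.1 * Gaussian(rng)).ToArray();

      // Extremistan: 100 incomes drawn from a heavy-tailed (Pareto-like) distribution.
      double[] incomes = Enumerable.Range(0, 100)
         .Select(i => 50000.0 / Math.Pow(1 - rng.NextDouble(), 1.0 / 1.2)).ToArray();

      // Add one extreme observation to each sample and compare the means.
      Console.WriteLine("Mean height: {0:F2} -> {1:F2}",
         heights.Average(),
         heights.Concat(new[] { 2.5 }).Average());            // barely moves

      Console.WriteLine("Mean income: {0:N0} -> {1:N0}",
         incomes.Average(),
         incomes.Concat(new[] { 50000000000.0 }).Average());  // goes 'through the roof'
   }

   // Box-Muller transform: one sample from a standard normal distribution.
   static double Gaussian(Random rng)
   {
      return Math.Sqrt(-2 * Math.Log(1 - rng.NextDouble()))
         * Math.Cos(2 * Math.PI * rng.NextDouble());
   }
}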

So what does this mean for IT projects? Let’s consider a standard IT task. Depending on your project management methodology you may call this a user story or some form of Kanban card; for the purpose of this exercise it really doesn’t matter.

The size of this unit of work is based either on an estimate (using a developer’s experience) or on some historical data (a burndown chart or cycle times); again, it doesn’t matter which. This step makes the crucial assumption that the difference between estimated and actual time varies according to a normal distribution. We assume it is possible to calculate an average from a small number of previous examples of similar work.

However, we’re not working with standard tasks. We’re not making the same widget time and time again; this is knowledge work. I’m sure there are some software development tasks that belong in Mediocristan, but my opinion is that these are exactly the kind of tasks you can automate or otherwise shift away from your development team.

So we’re very much living in Extremistan, and here we leave ourselves exposed to Black Swans. Averages don’t work, and the impact on project timelines is massive. You can talk about continuous improvement, flow and incremental experiments, but if the data you are using to drive these decisions is sitting there with a big thick tail of outliers then you’re doomed. I’ve even seen people recommend removing the outliers to get a better “fit” for the data.

So what does a Black Swan perspective on managing a project mean for some of the current best practice?

  1. Measures such as velocity will do a very poor job of predicting the future.
  2. Spending time in retrospectives on the cause of Black Swans is largely a waste of effort (making something that is random explainable and predictable is human nature).
  3. Games that illustrate how Scrum/Kanban work using a simple Mediocristan example only help fool us further (packing envelopes or making coffee etc.).

The impact of this insight I will explore further in future posts, but for now I’ll leave you with the words of Tom DeMarco:

Consistency and predictability are still desirable, but they haven’t ever been the most important things. For the past 40 years, for example, we’ve tortured ourselves over our inability to finish a software project on time and on budget. But as I hinted earlier, this never should have been the supreme goal. The more important goal is transformation, creating software that changes the world or that transforms a company or how it does business.


More imagination less production line thinking please

I could pick any number of tweets from the last couple of weeks telling us how hypotheses are better than questions, which are better than stakeholder stories, which are better than user stories, which are better than use cases, which are better than features, which are better than tasks, and so on.

OK there is some merit in these thoughts. Kinda.

What worries me is the monumental lack of imagination sitting behind this constant rehashing of the same basic idea.

We get it, we do. You have some work. Break it down into small chunks and then work on them one at a time or in a batch. It’s a production line, introduced by Ford in 1913 (but probably pre-dating that by a few hundred years). Of course Toyota did something different with their one-piece flow. However, the fundamental idea remains the same.

So why is everyone applying this same mindset to knowledge work or more specifically software development? I thought I’d explore the “Product Roadmap” to think about how things could be done differently.

A traditional roadmap would consist of a list of features with timescales against each. You can almost smell the factory feel to this approach. Imagine new models of car rolling off the production line. One at a time. No uncertainty in this vision.

So people are challenging this view of things. They want to change the concept of a “feature release” into something else, perhaps a lean-startup-style list of hypotheses, or a feature framed as a question to be answered. This is a small step forward for sure, but surely not a paradigm shift.

What I find ironic in this example is that we have the essence of at least one different way of doing things in the name itself. A roadmap could push us in a totally different direction. Imagine mapping our knowledge about the environment in which the product exists.

This is not the pin-point accurate mapping we get these days; it is more the sketchy mappa mundi of days gone by. After all, we know where we are, and we know even better where we’ve been. The rest of the map can then explore possible future directions we could take, perhaps sign-posting areas of interest we should consider visiting, or the classic “here be dragons” areas we should avoid.

Nor is this just a plea for more creativity; there is some theoretical basis to the shift in mindset too. Complex systems research suggests that considering multiple viewpoints (both at different scales and from different perspectives) helps us better understand the dynamics of the system we are looking at. The use of options is also often promoted as a way of thriving when facing complexity. Both of these can easily be visualised in map format.

So please start challenging the production line whenever the work becomes more about thinking than it does about doing.


Sitecore MVC – A Dynamic Model

This is the second post in a series about using Sitecore MVC out of the box. Last week’s post explored how to put together a simple page. This week I’ll try to create a typical navigation control, which will involve pulling data out of the hierarchy of child items. While I believe this would be possible using views alone, the code would quickly become very messy, so I’m going to start by looking at creating my own model.

I’m still working with a view rendering. Personally I think we only need to drop to a controller rendering when we have some kind of interaction with the user (paging through a list of articles, for example). I’ll want to fill the Model field shown below with a class I write myself.

[Screenshot: the View Rendering item, showing the Model field]

To get started I need to create a class that implements the IRenderingModel interface. Note I’m playing with formatting to aid readability in this blog format. Here I’m just outputting a title field when the view calls @Model.Title.

using Sitecore.Mvc;
using Sitecore.Mvc.Helpers;
using Sitecore.Mvc.Presentation;
using System.Web;

namespace SunTzu.Models
{
   public class SimpleModel : IRenderingModel
   {
      private SitecoreHelper helper;

      public HtmlString Title
      {
         get
         {
            return helper.Field("title");
         }
      }

      public void Initialize(Rendering rendering)
      {
         helper = PageContext.Current.HtmlHelper.Sitecore();
      }
   }
}

I’m sure the decision to push the call to the helper function inside the model goes against the grain for many (please feel free to comment below). However, my design goal is to keep the views simple and focused on rendering the final published page. Ideally the views know nothing of Sitecore; they contain only domain presentation logic and layout. On top of this, I believe it is the responsibility of the render field processors to add the Page Editor magic, not the helper. Finally, I haven’t violated the principle John West warns about of putting page markup in the model code.

To register this class with Sitecore I need to create a Model item in Sitecore. At this time Rocks doesn’t provide a nice Add Model option, so either go to the Content Editor (which does) or use Add Item and select the template /sitecore/templates/System/Layout/Model. You’ll need to put your full class name and the assembly it is compiled into in the Model Type field. Assuming my DLL above is called SunTzu.dll, I enter:

SunTzu.Models.SimpleModel,SunTzu

Finally, I add the full path of this Model item to the Model field of my view rendering. The view itself is simple. As ever we get the full Page Editor experience, including MVT and personalisation logic.

@model SunTzu.Models.SimpleModel
<h1>@Model.Title</h1>

Now, if you like strongly typed views then we’re done. However, the thought of having to write C# code every time I add a field to a template, or want to reference parent, sibling or child items, fills me with dread. If I’m doing something that requires complex business or presentation logic, fine; but when all my views are doing is traversing the content tree I’d prefer to write one model and be done. Life’s too short.

So, time to go dynamic. Using the class above as a base, I now inherit from DynamicObject and override the TryGetMember method. I now have a single model I can use to simplify the format of my view files, and it only needs compiling once; after that I can add templates, view renderings and Razor files as I please, giving me a very rapid development cycle. The downside is that I lose IntelliSense and will need to deal with field names containing spaces somehow.

using Sitecore.Mvc;
using Sitecore.Mvc.Helpers;
using Sitecore.Mvc.Presentation;
using System.Dynamic;

namespace SunTzu.Models
{
   public class FieldOnlyModel :
         DynamicObject, IRenderingModel
   {
      private SitecoreHelper helper;

      // Any member accessed on the dynamic model (e.g. @Model.Title)
      // lands here; binder.Name is used as the Sitecore field name.
      public override bool TryGetMember(
            GetMemberBinder binder,
            out object result)
      {
         result = helper.Field(binder.Name);
         return result != null;
      }

      public void Initialize(Rendering rendering)
      {
         helper =
            PageContext.Current.HtmlHelper.Sitecore();
      }
   }
}

So far I’m really just back where I was last week: accessing fields from within views. The next step is to move up and down the content tree. To do this I’m going to need a DynamicItem. I’m going to move my field-rendering code into this new class and then get the model to inherit from this new type.

using Sitecore.Data.Items;
using Sitecore.Mvc.Helpers;
using System.Dynamic;

namespace SunTzu.Models
{
   public class DynamicItem : DynamicObject
   {
       protected SitecoreHelper helper;
       protected Item item;

       public DynamicItem() { }

       public DynamicItem(
          SitecoreHelper helper, Item item)
       {
          this.helper = helper;
          this.item = item;
       }

       public override bool TryGetMember(
             GetMemberBinder binder,
             out object result)
       {
          result = helper.Field(binder.Name, item);
          return result == null ? false : true;
       }
   }
}

And the model now just hooks into the Sitecore Initialize call.

using Sitecore.Mvc;
using Sitecore.Mvc.Presentation;

namespace SunTzu.Models
{
    public class DynamicModel : DynamicItem, IRenderingModel
    {
       public void Initialize(Rendering rendering)
       {
          this.item = rendering.Item;
          this.helper =
             PageContext.Current.HtmlHelper.Sitecore();
       }
    }
}

That’s the refactoring done. Now I can start adding to the DynamicItem. First the simple case of accessing the Parent.

       public DynamicItem Parent
       {
          get
          {
             return new DynamicItem(
                this.helper, this.item.Parent);
          }
       }

The view can now access all fields on the parent item.

@model dynamic
<h1>@Model.Parent.Title</h1>
<h2>@Model.Title</h2>

So we’re getting there. Back at the beginning I promised the code for a navigation control. To achieve this I need access to child items.

       // Note: this needs "using System.Collections.Generic;" adding to DynamicItem.
       public List<DynamicItem> Children
       {
          get
          {
             List<Item> children = new List<Item>
                (this.item.Children);

             return children.ConvertAll<DynamicItem>
                (child => new DynamicItem(this.helper, child));
          }
       }

So here is the final view. I can already see that extensions are needed to check for empty fields and the template of the content items I’m iterating over. I’ll leave those out as implementation detail for now, though I sketch one possible approach after the view below.

@model dynamic
@foreach (var headerSection in @Model.Children)
{
<ul class="sections">
  <li class="section">
        @headerSection.Link

    <ul class="columns">
        @foreach (var column in headerSection.Children)
        {
      <li class="column">
        <ul class="links">
          <li class="title">@column.Title</li>
          <li class="gap"></li>
            @foreach (var item in column.Children)
            {
          <li class="link">@item.Link</li>
            }                
        </ul>
      </li>
        }
    </ul>
  </li>
</ul>
}
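As a rough idea only, here is a sketch of what those checks might look like as additions to DynamicItem. These are hypothetical helpers of my own (HasField and IsTemplate), not code from the original solution.

       // Hypothetical additions to DynamicItem (a sketch, not part of the original post).
       public bool HasField(string fieldName)
       {
          // The item indexer returns an empty string for missing or empty fields.
          return !string.IsNullOrEmpty(this.item[fieldName]);
       }

       public bool IsTemplate(string templateName)
       {
          return string.Equals(this.item.TemplateName, templateName,
             System.StringComparison.OrdinalIgnoreCase);
       }

In the view, something like @if (column.HasField("Title")) { ... } should then work, since real class members take precedence over TryGetMember when binding dynamic calls.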

Reference

http://www.sitecore.net/Community/Technical-Blogs/John-West-Sitecore-Blog/Posts/2012/06/Sitecore-MVC-Playground-Part-5-Using-Models.aspx
