The Skulduggery of IEnumerable - Part 2 : Of Enumerators & Generators

In this second part of the series on IEnumerable, I will talk about enumerators and generators, the two kinds of enumerables in the .NET Framework (or at least the two that I know of).

What is the difference, you ask? Well, they are:

  • Enumerators
    • An enumerator iterates over a collection that already holds its values, so it returns the same items no matter how many times you enumerate it. The items naturally change when you change something within the collection, exactly as you would expect.
  • Generators
    • A generator computes its values each time it is enumerated, so the values may not be the same from one enumeration to the next. The key difference is that an enumerator returns the same set of values each time and gives the facade of having the values up front, like an ICollection; a generator does not. More on this later.

What does this have to do with IEnumerable, you may ask? Well, all that nice syntactic sugar provided by LINQ, like the ".Where()" and ".Select()" methods, is actually made of generators (the LINQ documentation calls this deferred execution), which in turn return IEnumerables.
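To see that ".Select()" really does nothing until it is enumerated, here is a minimal sketch that counts how often the selector runs. The class name DeferredExecutionDemo and the helper CountSelectorCalls are made up for this illustration; only Select itself comes from LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the original post.
public static class DeferredExecutionDemo
{
    // Returns the selector call count before enumerating,
    // after the first pass, and after the second pass.
    public static int[] CountSelectorCalls()
    {
        var names = new[] { "Jimi Hendrix", "Eric Clapton", "Steve Vai" };
        var calls = 0;

        // Nothing runs here: Select only records what to do later.
        IEnumerable<string> upperCased = names.Select(name =>
        {
            calls++;
            return name.ToUpperInvariant();
        });
        var beforeEnumeration = calls;

        // The first pass runs the selector once per name.
        foreach (var item in upperCased) { }
        var afterFirstPass = calls;

        // The second pass runs the selector all over again.
        foreach (var item in upperCased) { }
        var afterSecondPass = calls;

        return new[] { beforeEnumeration, afterFirstPass, afterSecondPass };
    }

    public static void Main()
    {
        // Prints "0, 3, 6"
        Console.WriteLine(string.Join(", ", CountSelectorCalls()));
    }
}
```

The count stays at zero until the first foreach, then grows by three on every pass, which is exactly the "computed every time" behavior the rest of this post is about.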

Let me show some code to illustrate the differences:

/*
    This IEnumerable is actually an array with values there already.
    This makes it an enumerator.
*/
IEnumerable<string> NamesOfGuitarHeroes = new []
{
  "Jimi Hendrix",
  "Eric Clapton",
  "Micheal Schenker",
  "Rudolph Schenker",
  "Matthias Jabs",
  "John Petrucci",
  "Joe Satriani",
  "Steve Vai",
  "Paul Gilbert",
  "Adrian Smith",
  "Van Halen",
  "Akira Takasaki"
};

/*
    This is a generator because there are no values specified.
    It will not produce values until it is enumerated, or until a call such as ".ToList()" or ".ToDictionary()" tells LINQ to build a collection for you.
*/
IEnumerable<GuitarHero> GuitarHeroes = NamesOfGuitarHeroes.Select(x => new GuitarHero(x));  

Figure 1

Every time you enumerate GuitarHeroes, the values returned may not be what you expect. For instance:

Note:
Since we are dealing with IEnumerables, I highly recommend using a foreach loop. The reasons are detailed here: http://the-sourceterous.ghost.io/the-skulduggery-of-ienumerable/

IEnumerable<string> NamesOfGuitarHeroes = new []
{
  "Jimi Hendrix",
  "Eric Clapton",
  "Micheal Schenker",
  "Rudolph Schenker",
  "Matthias Jabs",
  "John Petrucci",
  "Joe Satriani",
  "Steve Vai",
  "Paul Gilbert",
  "Adrian Smith",
  "Van Halen",
  "Akira Takasaki"
};

/*
    Guitar hero POCO definition:

    public class GuitarHero
    {
        public string Name { get; set; }
        public bool IsAlive { get; set; }

        public GuitarHero(string guitaristName, bool isAlive = false)
        {
            this.Name = guitaristName;
            this.IsAlive = isAlive;
        }
    }
*/
IEnumerable<GuitarHero> GuitarHeroes = NamesOfGuitarHeroes.Select(x => new GuitarHero(x));

foreach(var GuitarHero in GuitarHeroes)
{
    GuitarHero.IsAlive = true;
}

foreach(var GuitarHero in GuitarHeroes)
{
    const string MessageTemplate = "Guitar Hero {0} is {1}";
    var StateOfMortality = !GuitarHero.IsAlive ? "dead :(" : "still rocking out ! :)";
    var Message = string.Format(MessageTemplate, GuitarHero.Name, StateOfMortality);
    Console.WriteLine(Message);
}

Figure 2

The output message in Figure 2 will always show that all the guitar heroes are dead despite my code making them all alive.

Why?

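An interesting behavior of generators is that they never hand back objects you have already seen. Each enumeration re-runs the generator statement and computes its values on the fly, so the second loop gets brand-new GuitarHero objects (with IsAlive defaulting to false). It DOES NOT take into account any changes made to the objects produced by an earlier enumeration.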
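The behavior described above can be checked directly by comparing object references across two enumerations. This is only a sketch: Guitarist is a hypothetical, trimmed-down stand-in for the GuitarHero class, and FreshObjectsDemo and EnumerateTwice are names invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the post's GuitarHero class.
public class Guitarist
{
    public string Name { get; set; } = string.Empty;
    public bool IsAlive { get; set; }
}

// Hypothetical demo class, not part of the original post.
public static class FreshObjectsDemo
{
    public static (bool SameInstance, bool ChangeSurvived) EnumerateTwice()
    {
        var names = new[] { "Joe Satriani", "Paul Gilbert" };

        // No ToList(): this is a generator.
        IEnumerable<Guitarist> heroes = names.Select(n => new Guitarist { Name = n });

        // First enumeration: grab the first object and modify it.
        var fromFirstPass = heroes.First();
        fromFirstPass.IsAlive = true;

        // Second enumeration: the selector runs again and builds a new object.
        var fromSecondPass = heroes.First();

        return (ReferenceEquals(fromFirstPass, fromSecondPass), fromSecondPass.IsAlive);
    }

    public static void Main()
    {
        var result = EnumerateTwice();
        Console.WriteLine($"Same instance: {result.SameInstance}");     // Same instance: False
        Console.WriteLine($"Change survived: {result.ChangeSurvived}"); // Change survived: False
    }
}
```

The object modified during the first pass still exists, but nothing in the generator points to it, so the second pass never sees it.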

How can we overcome this, you ask? As the comment on the second variable in Figure 1 alludes to, you can convert a generator to an enumerator by making a call to ToList(), ToArray(), etc. The result is an enumerator, which means that changes will stick because, unlike a generator, an enumerator does not compute new values each time.

IEnumerable<string> NamesOfGuitarHeroes = new []
{
  "Jimi Hendrix",
  "Eric Clapton",
  "Micheal Schenker",
  "Rudolph Schenker",
  "Matthias Jabs",
  "John Petrucci",
  "Joe Satriani",
  "Steve Vai",
  "Paul Gilbert",
  "Adrian Smith",
  "Van Halen",
  "Akira Takasaki"
};

/*
    Guitar hero POCO definition:

    public class GuitarHero
    {
        public string Name { get; set; }
        public bool IsAlive { get; set; }

        public GuitarHero(string guitaristName, bool isAlive = false)
        {
            this.Name = guitaristName;
            this.IsAlive = isAlive;
        }
    }
*/
IEnumerable<GuitarHero> GuitarHeroes = NamesOfGuitarHeroes.Select(x => new GuitarHero(x)).ToList();

foreach(var GuitarHero in GuitarHeroes)
{
    GuitarHero.IsAlive = true;
}

foreach(var GuitarHero in GuitarHeroes)
{
    const string MessageTemplate = "Guitar Hero {0} is {1}";
    var StateOfMortality = !GuitarHero.IsAlive ? "dead :(" : "still rocking out ! :)";
    var Message = string.Format(MessageTemplate, GuitarHero.Name, StateOfMortality);
    Console.WriteLine(Message);
}

Figure 3

In the code in Figure 3, notice this line of code.

IEnumerable<GuitarHero> GuitarHeroes = NamesOfGuitarHeroes.Select(x => new GuitarHero(x)).ToList();

It now has a ToList() call at the end. This makes it an enumerator, so the output from the code in Figure 3 shows all the guitar heroes alive and well.
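As a rough sketch of why ToList() changes the outcome, the example below counts selector calls again, this time with the result materialized into a list. Guitarist, MaterializedDemo and Materialize are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the post's GuitarHero class.
public class Guitarist
{
    public string Name { get; set; } = string.Empty;
    public bool IsAlive { get; set; }
}

// Hypothetical demo class, not part of the original post.
public static class MaterializedDemo
{
    public static (int SelectorCalls, bool AllAlive) Materialize()
    {
        var names = new[] { "Adrian Smith", "Akira Takasaki" };
        var calls = 0;

        // ToList() enumerates the generator right here, once,
        // and stores the resulting objects in a list.
        List<Guitarist> heroes = names
            .Select(n => { calls++; return new Guitarist { Name = n }; })
            .ToList();

        // First pass over the list: modify the stored objects.
        foreach (var hero in heroes)
        {
            hero.IsAlive = true;
        }

        // Second pass over the list: same objects, selector not called again.
        var allAlive = heroes.All(h => h.IsAlive);

        return (calls, allAlive);
    }

    public static void Main()
    {
        var result = Materialize();
        Console.WriteLine($"Selector calls: {result.SelectorCalls}"); // Selector calls: 2
        Console.WriteLine($"All alive: {result.AllAlive}");           // All alive: True
    }
}
```

The selector runs exactly once per name, and every later loop walks the same stored objects, which is why the changes are kept.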

Generators do have their time and place, however. Since a generator computes its values EVERY time it is enumerated, it always reflects the latest state of the source. The code in Figure 4 illustrates that.

IEnumerable<string> NamesOfGuitarHeroes = new []
{
  "Jimi Hendrix",
  "Eric Clapton",
  "Micheal Schenker",
  "Rudolph Schenker",
  "Matthias Jabs",
  "John Petrucci",
  "Joe Satriani",
  "Steve Vai",
  "Paul Gilbert",
  "Adrian Smith",
  "Van Halen",
  "Akira Takasaki"
};

/*
    Guitar hero POCO definition:

    public class GuitarHero
    {
        public string Name { get; set; }
        public bool IsAlive { get; set; }

        public GuitarHero(string guitaristName, bool isAlive = false)
        {
            this.Name = guitaristName;
            this.IsAlive = isAlive;
        }
    }
*/
var GuitarHeroes = NamesOfGuitarHeroes.Select(x => new GuitarHero(x)).ToList();

foreach(var GuitarHero in GuitarHeroes)
{
    GuitarHero.IsAlive = true;
}

GuitarHeroes[0].IsAlive = false;

var livingGuitarHeroes = GuitarHeroes.Where(x => x.IsAlive);
Console.WriteLine(livingGuitarHeroes.Count()); // Prints 11

GuitarHeroes[0].IsAlive = true;

Console.WriteLine(livingGuitarHeroes.Count()); // Prints 12

Figure 4

As seen in Figure 4, changes to the source collection are immediately visible because a generator enumerable generates its items EVERY TIME it is enumerated and does not retain state.

This can be the desired behavior when changes are happening asynchronously and you want them reflected immediately. The flip side, however, is that if you want a snapshot of the data at a certain point in time, a generator is a bad choice because it does not retain state or keep its own copy of the data. That is where the skulduggery happens. An illustration of this can be seen in Figure 5.

using System;
using System.Collections.Generic;
using System.Linq;

namespace IEnumerablePitfalls2
{
    class Program
    {
        // Must be static so that the static Main method can use it.
        static IEnumerable<string> NamesOfGuitarHeroes = new []
        {
          "Eric Clapton",
          "Micheal Schenker",
          "Rudolph Schenker",
          "Matthias Jabs",
          "John Petrucci",
          "Joe Satriani",
          "Steve Vai",
          "Paul Gilbert",
          "Adrian Smith",
          "Van Halen",
          "Akira Takasaki",
          "Dimebag Darell"
        };

        static void Main(string[] args)
        {
            // Guitar heroes master manifest
            var GuitarHeroesManifest = NamesOfGuitarHeroes.Select(x => new GuitarHero(guitaristName: x, isAlive: true)).ToList();

            // A snapshot of guitar heroes alive at 2000
            var guitarHeroesAliveAt2000 = GuitarHeroesManifest.Where(x => x.IsAlive);
            // Prints "Guitar Heroes Alive @ 2000: 12"
            Console.WriteLine(string.Concat("Guitar Heroes Alive @ 2000: ", guitarHeroesAliveAt2000.Count()));

            var nameOfGuitarHeroSlain = GuitaristShotByCrazyLoon();

            foreach(var GuitarHeroesManifestItem in GuitarHeroesManifest)
            {
                if(nameOfGuitarHeroSlain == GuitarHeroesManifestItem.Name)
                {
                    // Changes Dimebag to dead.
                    GuitarHeroesManifestItem.IsAlive = false;
                }
            }

            // Snapshot of guitarists alive at 2004
            var guitarHeroesAliveAt2004 = GuitarHeroesManifest.Where(x => x.IsAlive);

            // Prints "Guitar Heroes Alive @ 2000: 11" (the "snapshot" was recomputed)
            Console.WriteLine(string.Concat("Guitar Heroes Alive @ 2000: ", guitarHeroesAliveAt2000.Count()));
            // Prints "Guitar Heroes Alive @ 2004: 11"
            Console.WriteLine(string.Concat("Guitar Heroes Alive @ 2004: ", guitarHeroesAliveAt2004.Count()));
        }

        // This will return the name of a guitar hero who got shot by a crazy loon.
        static string GuitaristShotByCrazyLoon()
        {
            return "Dimebag Darell";
        }
    }

    public class GuitarHero
    {
        public string Name { get; set; }
        public bool IsAlive { get; set; }

        public GuitarHero(string guitaristName, bool isAlive = false)
        {
            this.Name = guitaristName;
            this.IsAlive = isAlive;
        }
    }
}

Figure 5

As we discussed previously, a generator always computes its data from the data source EVERY time. This makes it unreliable for storing snapshots of data, as the code sample in Figure 5 shows: the snapshot for the year 2000 ended up with the values from 2004, which is not the desired behavior.
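One possible way to get a real snapshot, sketched below, is to put ToList() on the Where() call so the matching items are captured at that moment. Guitarist, SnapshotDemo and CountAfterChange are hypothetical names for this illustration, and the two-item manifest is a cut-down version of the one in Figure 5.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the post's GuitarHero class.
public class Guitarist
{
    public string Name { get; set; } = string.Empty;
    public bool IsAlive { get; set; }
}

// Hypothetical demo class, not part of the original post.
public static class SnapshotDemo
{
    public static (int LiveCount, int SnapshotCount) CountAfterChange()
    {
        var manifest = new List<Guitarist>
        {
            new Guitarist { Name = "Steve Vai", IsAlive = true },
            new Guitarist { Name = "Dimebag Darell", IsAlive = true },
        };

        // Generator: re-evaluated against the manifest on every enumeration.
        IEnumerable<Guitarist> liveView = manifest.Where(g => g.IsAlive);

        // Snapshot: ToList() captures who matched at this point in time.
        List<Guitarist> aliveAt2000 = manifest.Where(g => g.IsAlive).ToList();

        // The manifest changes after the snapshot was taken.
        manifest[1].IsAlive = false;

        return (liveView.Count(), aliveAt2000.Count);
    }

    public static void Main()
    {
        var (live, snapshot) = CountAfterChange();
        // Prints "Live view: 1, snapshot: 2"
        Console.WriteLine($"Live view: {live}, snapshot: {snapshot}");
    }
}
```

One caveat: the snapshot list freezes which items are in it, but it still holds references to the same objects as the manifest, so the IsAlive property of an item read through the snapshot will show the later change. If the property values themselves must be frozen, the items would need to be copied as well.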

These subtle behavioral differences can cause you to pull your hair out in frustration.

Conclusion

Use a generator only if you are sure that you don't care about the state of the items within the IEnumerable, and that the generator's habit of returning freshly computed values does not impact your code negatively.

I personally almost always use ToList() so I do not need to worry about it.

Part 1 : The Skulduggery of IEnumerable - Part 1 : Of Count & Element Indexes

Part 2 : The Skulduggery of IEnumerable - Part 2 : Of Enumerators & Generators