I have been working on a project that uses the .NET SDK for Hadoop. I wanted to add unit tests to the project, so I ended up writing fakes for HDInsightClient, JobSubmissionClientFactory and JobSubmissionClient. I had hoped to reuse fakes from the SDK's git repo, but its unit tests appear to stand up an actual instance of Hadoop. I didn’t want to do that; I’m treating Hadoop as a black box, and I’m more interested in getting code coverage on the C# code around the calls to Hadoop.

For my fake of IHDInsightClient, I only implemented CreateCluster() and DeleteCluster(), nothing fancy.
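
As a rough illustration of what such a minimal fake could look like, here is a sketch that tracks clusters in memory. To keep it self-contained, ClusterCreateParameters and ClusterDetails below are trimmed, hypothetical stand-ins for the SDK types, and the class does not implement the full IHDInsightClient interface; the HasCluster helper is also an assumption added for test assertions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, trimmed stand-ins for the SDK's cluster types (not the real definitions).
public class ClusterCreateParameters
{
    public string Name { get; set; }
    public int ClusterSizeInNodes { get; set; }
}

public class ClusterDetails
{
    public string Name { get; set; }
    public int ClusterSizeInNodes { get; set; }
}

// Fake cluster client: records clusters in a dictionary instead of provisioning anything.
public class FakeHDInsightClient
{
    private readonly Dictionary<string, ClusterDetails> clusters =
        new Dictionary<string, ClusterDetails>(StringComparer.OrdinalIgnoreCase);

    public ClusterDetails CreateCluster(ClusterCreateParameters clusterCreateParameters)
    {
        if (clusterCreateParameters == null)
        {
            throw new ArgumentNullException(nameof(clusterCreateParameters));
        }

        var details = new ClusterDetails
        {
            Name = clusterCreateParameters.Name,
            ClusterSizeInNodes = clusterCreateParameters.ClusterSizeInNodes
        };
        clusters[details.Name] = details;
        return details;
    }

    public void DeleteCluster(string dnsName)
    {
        // Removing a cluster that does not exist is a no-op in this sketch.
        clusters.Remove(dnsName);
    }

    // Hypothetical helper so tests can assert on the fake's state.
    public bool HasCluster(string dnsName)
    {
        return clusters.ContainsKey(dnsName);
    }
}
```

A test can then create a cluster, assert that HasCluster returns true, delete it, and assert that it is gone, all without touching Azure.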

To get a factory that creates a JobSubmissionClient, I had to write my own interface and wrapper (which is the same thing the SDK did for its cmdlets):

public interface IAzureHDInsightJobSubmissionClientFactory  
{
    IJobSubmissionClient Create(IJobSubmissionClientCredential credentials);
}    

Then, for the real service, I implement this interface by delegating to the SDK's static JobSubmissionClientFactory:

public class AzureHDInsightJobSubmissionClientFactory : IAzureHDInsightJobSubmissionClientFactory  
{
    public IJobSubmissionClient Create(IJobSubmissionClientCredential credentials)
    {
        return JobSubmissionClientFactory.Connect(credentials);
    }
}

Whenever I need a JobSubmissionClient, I get one using my wrapper.

In the case of my fake, I have the factory return a new fake job submission client:

public class FakeJobSubmissionClientFactory : IAzureHDInsightJobSubmissionClientFactory  
{
    public Microsoft.Hadoop.Client.IJobSubmissionClient Create(Microsoft.Hadoop.Client.IJobSubmissionClientCredential credentials)
    {
        return new FakeJobSubmissionClient();
    }
}

Finally, my FakeJobSubmissionClient does need to fake the work that the job performs in Hadoop. In this case, the Hive query that the job runs writes a file to blob storage. My fixture holds a static reference to a fake blobClient, so my implementation of CreateHiveJob(HiveJobCreateParameters hiveJobCreateParameters) can stand in for Hadoop by producing that result in the fake blob store.
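
A minimal sketch of that idea follows. It is not my actual implementation: HiveJobCreateParameters and JobCreationResults are trimmed, hypothetical stand-ins for the SDK types, FakeBlobClient and Fixture are invented for illustration, and the output path (StatusFolder plus "/output.csv") and file contents are assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, trimmed stand-ins for the SDK types (not the real definitions).
public class HiveJobCreateParameters
{
    public string JobName { get; set; }
    public string Query { get; set; }
    public string StatusFolder { get; set; }
}

public class JobCreationResults
{
    public string JobId { get; set; }
}

// Hypothetical in-memory blob store playing the role of the fake blobClient.
public class FakeBlobClient
{
    private readonly Dictionary<string, string> blobs = new Dictionary<string, string>();

    public void Upload(string path, string content) { blobs[path] = content; }

    public bool Exists(string path) { return blobs.ContainsKey(path); }

    public string Download(string path) { return blobs[path]; }
}

// Hypothetical fixture holding the static reference shared by tests and fakes.
public static class Fixture
{
    public static readonly FakeBlobClient BlobClient = new FakeBlobClient();
}

// Fake client: instead of submitting to Hadoop, it writes the file
// that the Hive query would have produced.
public class FakeJobSubmissionClient
{
    private int jobCount;

    public JobCreationResults CreateHiveJob(HiveJobCreateParameters hiveJobCreateParameters)
    {
        if (hiveJobCreateParameters == null)
        {
            throw new ArgumentNullException(nameof(hiveJobCreateParameters));
        }

        // Assumed output location and contents, chosen only for this sketch.
        string outputPath = hiveJobCreateParameters.StatusFolder + "/output.csv";
        Fixture.BlobClient.Upload(outputPath, "fake,hive,result");

        jobCount++;
        return new JobCreationResults { JobId = "fake_job_" + jobCount };
    }
}
```

The code under test then reads the result from the same fake blob store, so the test never needs a cluster or a storage account.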

With all these fakes, I then wired up dependency injection in my UnityContainer and I was good to go. And now I have much more confidence that future changes to this codebase won’t cause regressions.
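
To show why the factory interface makes this wiring possible, here is a self-contained sketch that uses plain constructor injection in place of Unity so it runs without the package. The SDK interfaces are reduced to hypothetical stand-ins (the real IJobSubmissionClient has many more members and different signatures), and ReportService and FakeCredential are invented names for the code under test and its input.

```csharp
using System;

// Hypothetical, heavily trimmed stand-ins for the SDK interfaces.
public interface IJobSubmissionClientCredential { }

public interface IJobSubmissionClient
{
    string CreateHiveJob(string query);
}

public interface IAzureHDInsightJobSubmissionClientFactory
{
    IJobSubmissionClient Create(IJobSubmissionClientCredential credentials);
}

public class FakeCredential : IJobSubmissionClientCredential { }

// Simplified fake client that only counts submitted jobs.
public class FakeJobSubmissionClient : IJobSubmissionClient
{
    public int SubmittedJobs { get; private set; }

    public string CreateHiveJob(string query)
    {
        SubmittedJobs++;
        return "fake_job_" + SubmittedJobs;
    }
}

public class FakeJobSubmissionClientFactory : IAzureHDInsightJobSubmissionClientFactory
{
    // Exposed so a test can inspect the client the service received.
    public FakeJobSubmissionClient LastClient { get; private set; }

    public IJobSubmissionClient Create(IJobSubmissionClientCredential credentials)
    {
        LastClient = new FakeJobSubmissionClient();
        return LastClient;
    }
}

// Hypothetical class under test: it depends only on the factory interface,
// so production code and tests can supply different implementations.
public class ReportService
{
    private readonly IAzureHDInsightJobSubmissionClientFactory factory;

    public ReportService(IAzureHDInsightJobSubmissionClientFactory factory)
    {
        if (factory == null)
        {
            throw new ArgumentNullException(nameof(factory));
        }
        this.factory = factory;
    }

    public string RunReport(IJobSubmissionClientCredential credentials)
    {
        IJobSubmissionClient client = factory.Create(credentials);
        return client.CreateHiveJob("SELECT COUNT(*) FROM logs");
    }
}
```

In a test, new ReportService(new FakeJobSubmissionClientFactory()) gives the service a fake client. With Unity, the equivalent step is to register the mapping in the test container, for example container.RegisterType<IAzureHDInsightJobSubmissionClientFactory, FakeJobSubmissionClientFactory>(), and let the container resolve the service.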