Category: PowerShell

Effectively comparing Team Build Process Templates

I always prefer implementing .NET build customizations through MSBuild and I avoid modifying the Windows Workflow XAML files used by Team Build. However, some customizations are best implemented in the Team Build process, like chaining builds to execute in succession and pass information between them. As a consultant specializing in automated build and deployment I also spend a lot of time understanding Workflow customizations implemented by others.

For me the easiest way to understand the customizations implemented in a particular Team Build XAML file is to use a file differencing tool to compare the current workflow to a previous version of the workflow, or even to compare it to the default Team Build template it was based on. Unfortunately, the Windows Workflow designer in Visual Studio litters the XAML file with a lot of view state, obscuring the intended changes to the build process amongst irrelevant designer-implementation concerns.

To address this problem, I wrote a PowerShell script (available as a GitHub Gist) that removes all the elements and attributes known to be unimportant to the process the XAML file describes. Conveniently, the XAML file itself lists the set of XML namespace prefixes that can be safely removed in an mc:Ignorable attribute on the root document element.
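
The core of the approach is only a handful of lines. Here is a minimal sketch (the published Gist is more robust), assuming the hypothetical file name YourBuildTemplate.xaml:

$path = 'YourBuildTemplate.xaml' # hypothetical file name
$mcNs = 'http://schemas.openxmlformats.org/markup-compatibility/2006'
# Select-Xml preserves whitespace so the resulting diff stays clean
$xaml = ( Select-Xml -Path $path -XPath / ).Node
$root = $xaml.DocumentElement
# the mc:Ignorable attribute lists the removable namespace prefixes
foreach ($prefix in ($root.GetAttribute('Ignorable', $mcNs) -split ' ')) {
    $ns = $root.GetNamespaceOfPrefix($prefix)
    if (-not $ns) { continue } # skip undeclared prefixes
    # snapshot each node list so we aren't modifying while enumerating
    foreach ($element in @($xaml.SelectNodes("//*[namespace-uri()='$ns']"))) {
        [void] $element.ParentNode.RemoveChild($element)
    }
    foreach ($attribute in @($xaml.SelectNodes("//@*[namespace-uri()='$ns']"))) {
        [void] $attribute.OwnerElement.RemoveAttributeNode($attribute)
    }
}
$xaml.Save($path)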

Typically I use my XAML cleaning PowerShell script before each check-in to ensure the source control history stays clean but I have also used it on existing XAML files created by others to canonicalize them before opening them in a diff tool.

Using the script is as simple as:

.\Remove-IgnoreableXaml.ps1 -Path YourBuildTemplate.xaml

Or, if you don’t want to overwrite the file in place, specify an alternate destination:

.\Remove-IgnoreableXaml.ps1 -Path YourBuildTemplate.xaml -Destination YourCleanBuildTemplate.xaml

PowerShell Select-Xml versus Get-Content

In PowerShell, one of the most common examples you will see for parsing an XML file into a variable uses the Get-Content cmdlet and the cast operator, like this:

$Document = [xml](Get-Content -Path myfile.xml)

The resulting type of the $Document variable is an instance of System.Xml.XmlDocument. However, there is another approach to get the same, or better, result using the Select-Xml cmdlet:

$Document = ( Select-Xml -Path myfile.xml -XPath / ).Node

Sure, the second variant is slightly longer, but it has an important benefit over the first, and it's not performance-related.

In the first example, the file is first read into an array of strings and then cast. The casting operation (implemented by System.Management.Automation.LanguagePrimitives.ConvertToXml) uses an XmlReaderSettings instance with the IgnoreWhitespace property set to true and an XmlDocument instance with the PreserveWhitespace property set to false.

In the second example, the file is read directly into an XmlDocument (implemented by System.Management.Automation.InternalDeserializer.LoadUnsafeXmlDocument) using an XmlReaderSettings instance with the IgnoreWhitespace property set to false and an XmlDocument instance with the PreserveWhitespace property set to true – the opposite values of the first example.

The Select-Xml approach won’t completely preserve all the original formatting from the source file, but it preserves much more than the Get-Content approach does. I’ve found this extremely useful when bulk-updating version-controlled XML files with a PowerShell script, where I want the resulting file diff to show the intended change, not a mass of incidental formatting changes.
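
A quick way to see the difference is to round-trip the same file through both approaches and diff the results (sample.xml is a hypothetical indented file):

$viaCast = [xml](Get-Content -Path sample.xml)
$viaSelect = ( Select-Xml -Path sample.xml -XPath / ).Node
$viaCast.Save('via-cast.xml')     # whitespace re-generated by the writer
$viaSelect.Save('via-select.xml') # original whitespace largely intact

Comparing via-select.xml to the original shows far fewer incidental changes than via-cast.xml does.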

You could construct the XmlDocument and XmlReaderSettings directly in PowerShell, but not in so few characters. You can also load the System.Xml.Linq assembly and use the XDocument class, which appears to give slightly better formatting consistency again, but it’s still not perfect, and PowerShell doesn’t provide the same quick access to elements and attributes as properties on the object.
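
For the curious, here is a longhand sketch that mirrors the property values described above:

$settings = New-Object -TypeName System.Xml.XmlReaderSettings
$settings.IgnoreWhitespace = $false
$document = New-Object -TypeName System.Xml.XmlDocument
$document.PreserveWhitespace = $true
$reader = [System.Xml.XmlReader]::Create('myfile.xml', $settings)
try {
    $document.Load($reader)
}
finally {
    $reader.Close()
}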

PowerShell Desired State Configuration Nested Configurations

In PowerShell v4’s new Desired State Configuration feature, each Node¹ defined within a Configuration results in a single MOF file and each computer’s Local Configuration Manager will only apply and monitor configuration from a single MOF file at a time. This might suggest that all the complexity of the Resource configuration for an entire computer needs to be captured within a single Node and that DSC configuration files will become big and unwieldy fast.

However, I recently noticed that the Get-DscResource cmdlet returns not only the built-in and custom resources installed on the computer but also any Configurations defined in the current session. From this I inferred that it may be possible to include one Configuration within another so that the overall configuration of a system can be modularized into more manageable pieces.

I had an opportunity to raise this with a member of the PowerShell team and it was clarified for me that this is indeed possible, and presumably recommended. The key is that the resulting combined configuration definition must only contain one level of Nodes. That is, either the parent Configuration defines the Node, or the child Configuration defines the Node, not both.

Here is an example of defining two child Configurations which combine related Resources and a parent Configuration which then defines a Node containing both of these Configurations:


Configuration AspNet45WebServer {
    WindowsFeature WebAspNet45 {
        Name = 'Web-Asp-Net45'
        Ensure = 'Present'
    }
    WindowsFeature WebWindowsAuth {
        Name = 'Web-Windows-Auth'
        Ensure = 'Present'
    }
}

Configuration AspClassicWebServer {
    param ($EnsureBasicAuth)
    WindowsFeature WebAsp {
        Name = 'Web-ASP'
        Ensure = 'Present'
    }
    WindowsFeature WebBasicAuth {
        Name = 'Web-Basic-Auth'
        Ensure = $EnsureBasicAuth
    }
}

Configuration CombinedWebServer {
    Node localhost {
        AspNet45WebServer aspnet45 {}
        AspClassicWebServer classic {
            EnsureBasicAuth = 'Present'
        }
    }
}

A name still needs to be associated with each child Configuration when it is included in a parent, and the child may or may not require additional parameters. There is no reason for nested Configurations to be limited to any particular resource; I’ve just used the WindowsFeature resource in this example to avoid external dependencies.
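
Compiling and applying the parent Configuration then works the same as any stand-alone Configuration; the output path below is arbitrary:

# produces a single localhost.mof despite the nested Configurations
CombinedWebServer -OutputPath .\CombinedWebServer
Start-DscConfiguration -Path .\CombinedWebServer -Wait -Verbose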

¹Subject to the use of Configuration Data which may repeat a single Node multiple times.

PowerShell v4 Desired State Configuration at Sydney DevOps

I volunteered to speak about the new Desired State Configuration features in PowerShell 4.0 at the local DevOps user group in Sydney on September 19th. The technology has a lot of potential, but until it is officially released and all the documentation is available, I found some aspects of Desired State Configuration difficult to understand.

If you’d like to watch my presentation, the user group was broadcast live as a Google Hangout and is now available on YouTube (my session starts at about 17 minutes in). Additionally, my colleague Meligy painstakingly recorded the presentation with his phone camera; that recording is also available on YouTube.

The resolution of the recorded presentation may not be sufficient to read the detail on the slides or the PowerShell commands being executed during the demo. If you’d like to follow along at home, the slides are available on SlideShare.net and the demo script is available as a GitHub Gist.

Refactor Test Methods associated with Test Manager Test Case Automation

Since Team Foundation Server 2010, Microsoft has shipped the Test Manager product which enables testers to document manual test cases and later automate them (or automate them from the beginning). When used to its full potential with other TFS components like Team Build and Lab Management, managing Test Cases with TFS provides a great way to improve and verify the quality of the software you deliver.

In Test Manager, or more accurately Visual Studio, automation for a Test Case is associated via two primary identifiers:

  1. The file name of the .NET assembly containing the relevant Test Method, e.g. “CodedUITestProject1.dll”
  2. The namespace- and class-qualified name of the Test Method, e.g. “CodedUITestProject1.CodedUITest1.CodedUITestMethod1”

After associating one or more Test Methods with Test Cases you may later need to refactor your code resulting in many Test Methods moving to a new class or namespace, or even a new assembly name. Unfortunately, after this refactoring effort, Microsoft Test Manager will no longer be able to locate the Test Methods associated with affected Test Cases and the tests will fail to run.

There are some options for fixing this, of varying effectiveness:

  • Undo the refactoring – pointless but technically an option.
  • Manually open each one of the Test Cases in Visual Studio and re-associate the automation – tedious for larger numbers of tests.
  • If only the assembly name has changed, open the list of Test Cases in Excel, include the “Automated Test Storage” column, and bulk update the values.

BUT, if the namespace or class has changed, the fix could be slightly more complicated…

You may be able to use Excel with the “Automated Test Name” column included, and you may find it works, but there is another column to consider, usually hidden, called “Automated Test Id”. I know this column is used by the TCM command-line tool to automatically import Test Methods into Test Cases and it may be used in other areas of Visual Studio and TFS.

In tcm.exe, the Id is based on the Test Name and used to detect if a Test Method has already been imported and to prevent the creation of duplicate Test Cases. Gautam Goenka’s blog post on associating automation programmatically demonstrates the simple Id generation algorithm used.
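
As I read it from that post, the algorithm takes the SHA-1 hash of the Unicode bytes of the qualified test name and uses the first 16 bytes as a GUID. A PowerShell rendition (my translation of the idea, not Microsoft’s code) might look like this:

function Get-AutomatedTestId {
    param ([string] $TestName)
    # SHA-1 hash the Unicode bytes of the qualified test name,
    # then build a GUID from the first 16 bytes of the hash
    $sha1 = [System.Security.Cryptography.SHA1]::Create()
    $hash = $sha1.ComputeHash([System.Text.Encoding]::Unicode.GetBytes($TestName))
    New-Object -TypeName Guid -ArgumentList (,[byte[]]$hash[0..15])
}

Get-AutomatedTestId 'CodedUITestProject1.CodedUITest1.CodedUITestMethod1'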

As per my usual style, I’ve published a PowerShell script on GitHub Gist, called “Rename-TestCaseAutomation”, which locates all the Test Cases with automation associated via a specified Assembly Name and/or Class Name prefix, updates them with the new Assembly Name or Class Name prefix you provide, and updates the Id appropriately too. Here are two simple example scenarios for using it:

Rename the Assembly for all Test Cases across two different Team Projects:

Rename-TestCaseAutomation http://localhost:8080/tfs/DefaultCollection MyProject,YourProject -TestStorage CodedUITestProject1.dll -NewTestStorage FooTests.dll

Move all the Test Methods in the “FooTests.Customer” Class to the “FooTests.Client” Class:

Rename-TestCaseAutomation http://localhost:8080/tfs/DefaultCollection MyProject -TestNamePrefix FooTests.Customer -NewTestNamePrefix FooTests.Client

Hopefully someone else finds this useful.

Ad hoc IIS log parsing with PowerShell

There are numerous log analysis systems for IIS and for log files in general and it is probably a good idea to use such a system for ongoing monitoring. However, sometimes you just have a bunch of IIS logs and some simple questions you want to ask of the data within. PowerShell is built into the OS so it’s an easy default choice for this scenario.

Unfortunately the IIS W3C Extended log format doesn’t quite align with PowerShell’s built-in cmdlets. The Import-Csv or ConvertFrom-Csv cmdlets could come close when used with the -Delimiter parameter and some Header wrangling. You can even achieve a fair amount just by using Get-Content and the -split operator if you don’t mind using array indexes to access different columns.

With a little bit of extra regex and hashtable work, wrapped in a function for readability, the logs can be parsed quite simply, even handling the situation where the set of included columns changes halfway through one of the log files. I wrote such a function in about 5 minutes the other day, called it “ConvertFrom-IISW3CLog” and put it on GitHub as a Gist for future reference here. It won’t handle malformed log files but the point was to keep it simple, and not to re-invent another log system.
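
The gist of the function, if not the Gist itself, looks something like this (a simplified sketch):

function ConvertFrom-IISW3CLog {
    param (
        [Parameter(ValueFromPipeline = $true)]
        [System.IO.FileInfo] $File
    )
    process {
        $fields = $null
        foreach ($line in (Get-Content -Path $File.FullName)) {
            if ($line -match '^#Fields: (.*)') {
                # a new #Fields directive can appear mid-file when the
                # configured columns change, so always honour the latest one
                $fields = $Matches[1] -split ' '
            }
            elseif ($fields -and $line -notmatch '^#') {
                # zip the space-delimited values with the current column names
                $values = $line -split ' '
                $row = @{}
                for ($i = 0; $i -lt $fields.Length; $i++) {
                    $row[$fields[$i]] = $values[$i]
                }
                [pscustomobject] $row
            }
        }
    }
}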

You can use the function like this:

gci c:\inetpub\logs\LogFiles\W3SVC1 | ConvertFrom-IISW3CLog

If you wanted to find how often each URL is accessed, a simple Group-Object on the end of the pipe would tell you:

gci c:\inetpub\logs\LogFiles\W3SVC1 | ConvertFrom-IISW3CLog | group cs-uri-stem

If you want to find which URL requests have had errors today:

$reqs = gci c:\inetpub\logs\LogFiles\W3SVC1 | ConvertFrom-IISW3CLog
$reqs | ? { $_.'sc-status' -eq 500 -and $_.date -eq '2013-08-10' }

Once your log parsing gets more elaborate, you might want to look at the Microsoft Log Parser as a more capable solution.

PowerShell Update-Help and an Authenticating Proxy

PowerShell v3 doesn’t ship with help in the box anymore. You may love this or you may hate it. Regardless of your stance, if your environment is behind an authenticating web proxy, it is not obvious how to make it work. The general guidance is to use Save-Help from another computer but this doesn’t help when every computer is behind the proxy and sneakernet is prohibited. This was my situation recently and I found a reasonably simple way to solve it.

Firstly, when trying to run Update-Help, I received an error message:

update-help : Failed to update Help for the module(s) ' ... ' with UI culture(s) {en-US} :
Unable to connect to Help content. Make sure the server is available and then try the command again.

Unfortunately, nothing about this message tells me that the problem is caused by the proxy, let alone authentication. Having dealt with the proxy in this environment before I had a suspicion, so I opened Fiddler (an HTTP debugger) and re-attempted the Update-Help command. Fiddler revealed that the HTTP response returned to PowerShell was status code 407, “Proxy authentication required”, so I was on the right track.

Inspecting Update-Help’s available parameters reveals two that might be relevant: Credential and UseDefaultCredentials. To save you some time, I can tell you that using either of these won’t help a proxy authentication issue. Under the hood, Update-Help uses .NET’s WebClient class to download the help information and these credential-related parameters correspond directly to properties of the same name on WebClient which don’t apply to proxies. Separately though, WebClient has a Proxy property which in turn has a Credentials property – this is the one I needed to find a way to set.

Luckily, even though the WebClient instance used within the implementation of the Update-Help cmdlet is not exposed, by default all instances of WebClient in the same AppDomain will share the same instance of their Proxy. So, firstly, to verify my theory, I run these commands:

$wc = New-Object System.Net.WebClient
$wc.DownloadString('http://microsoft.com')

And get the expected error:

Exception calling "DownloadString" with "1" argument(s):
"The remote server returned an error: (407) Proxy Authentication Required."

So I set the credentials on my test WebClient’s Proxy to the credentials I logged in to Windows with, and test again:

$wc.Proxy.Credentials = [System.Net.CredentialCache]::DefaultNetworkCredentials
$wc.DownloadString('http://microsoft.com')

And this time I get the raw content of the Microsoft.com home page instead of an error.

I now try to run Update-Help one more time and bingo! – it works.
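
For future reference, the whole workaround condenses to this, run once at the start of the session (relying on the shared Proxy instance described above):

# the new WebClient shares its Proxy instance with the one Update-Help
# creates internally, so setting credentials here applies session-wide
(New-Object -TypeName System.Net.WebClient).Proxy.Credentials =
    [System.Net.CredentialCache]::DefaultNetworkCredentials
Update-Help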

Now we just need to encourage Microsoft to make this easier to deal with in general by voting for these two issues on Connect.

Get Hyper-V guest data without XML parsing

I recently needed to query the Hyper-V KVP Exchange data for a guest VM to find the currently configured IPv4 address of the VM’s network adapter. A quick search of the Internet reveals that the Msvm_KvpExchangeComponent WMI class is the source of this information and there are at least two blog posts that cover it well.

However, in both of these blogs, the actual data comes back as XML which is then parsed using XPath. The original XML looks something like this:

<INSTANCE CLASSNAME="Msvm_KvpExchangeDataItem">
 <PROPERTY NAME="Data" TYPE="string">
  <VALUE>169.254.103.5</VALUE>
 </PROPERTY>
 <PROPERTY NAME="Name" TYPE="string">
  <VALUE>RDPAddressIPv4</VALUE>
 </PROPERTY>
 <PROPERTY NAME="Source" TYPE="uint16">
  <VALUE>2</VALUE>
 </PROPERTY>
</INSTANCE>

As soon as I saw this XML I recognised it as the DMTF CIM XML format – the same format that the new PowerShell v3 CIM Cmdlets use to transport CIM instances over HTTP (I believe). If this is the format used by PowerShell, it seemed a reasonable assumption that PowerShell or the .NET Framework must already have an implementation for deserializing this XML properly, so I wouldn’t have to code it myself.

With Reflector in hand, I started my investigation at ManagementBaseObject.GetText which converts to XML but I couldn’t find any complementary methods to go the other direction. I then proceeded to look at ManagementBaseObject’s implementation of the ISerializable interface and corresponding constructor but that appears to use binary serialization of COM types.

Finally, I turned to the CimInstance class and its implementation of ISerializable and discovered the CimDeserializer. Unfortunately the CimDeserializer methods take a byte array as input and I had strings. So, assuming round-tripping would work, I looked to the CimSerializer, tried passing it a CimInstance, and inspected the byte array that was returned – every second byte is zero and the rest fit within 7 bits… smells like Unicode.

Taking a small gamble I took the strings from the Msvm_KvpExchangeComponent instance, used System.Text.Encoding.Unicode to convert them to byte arrays and passed them to CimDeserializer.DeserializeInstance. Huzzah! Properly deserialized Msvm_KvpExchangeDataItem instances.
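
Condensed to its essence, the conversion looks something like this ($kvpXml is a placeholder for one INSTANCE string taken from the GuestIntrinsicExchangeItems property of Msvm_KvpExchangeComponent):

$deserializer = [Microsoft.Management.Infrastructure.Serialization.CimDeserializer]::Create()
# the deserializer wants bytes, and the earlier experiment suggests Unicode
$bytes = [System.Text.Encoding]::Unicode.GetBytes($kvpXml)
$offset = [System.UInt32] 0
$item = $deserializer.DeserializeInstance($bytes, [ref] $offset, $null)
# the deserialized instance exposes the KVP name and value as properties
$item.CimInstanceProperties['Name'].Value
$item.CimInstanceProperties['Data'].Value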

And here is the final PowerShell script to return the items for a given VM name: https://gist.github.com/jstangroome/6068782

PSClrMD – A PowerShell module for CLR Memory Diagnostics

Back in May, the .NET Framework team blogged about a new set of advanced APIs for programmatically inspecting a live process or crash dump of a .NET application. These APIs are called “CLR Memory Diagnostics” or “ClrMD” for short and are available as a pre-release NuGet package called “Microsoft.Diagnostics.Runtime” – I think there may be some naming issues yet to be resolved.

Based on some of the examples in their blog post demonstrating a LINQ-style approach, I thought this library would be very useful in a PowerShell pipeline scenario as well. Although there is already a PowerShell module for debugging with WinDbg (PowerDbg), I wanted the practice of building a PowerShell module and the opportunity to play with the ClrMD library.

Today I started building the first set of cmdlets based on the examples demonstrated in the blog’s code samples and have published the code on GitHub. The cmdlets so far are:

  • Connect-ClrMDTarget – establishes the underlying DataTarget object by attaching to a live process or loading a crash dump file.
  • Get-ClrMDClrVersion – lists the versions of the CLR loaded in the connected process. Typically just one.
  • Connect-ClrMDRuntime – establishes the underlying ClrRuntime object to query .NET-related information. Defaults to the first loaded CLR version in the process.
  • Get-ClrMDThread – lists the CLR threads of the connected CLR runtime.
  • Get-ClrMDHeapObject – lists all the objects in the heap of the connected CLR runtime.
  • Disconnect-ClrMDTarget – detaches from the connected process.

The ClrMD API is centered around having a DataTarget and ClrRuntime instance as context for performing all other operations. In PowerShell, it would be awkward to pass this context as a parameter to every cmdlet so I wrote the Connect cmdlets to store the context in a module variable which all other cmdlets will naturally inherit. If desired however, the Connect cmdlets accept a -PassThru switch which will output a context object which can then be passed explicitly to the -Target or -Runtime parameters of the other cmdlets. This would enable two or more processes to be inspected simultaneously, for example.
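
For example, inspecting two processes side by side might look like this (hypothetical process IDs):

# each -PassThru call returns an explicit context object instead of
# relying on the module-level variable
$target1 = Connect-ClrMDTarget -ProcessId 1234 -PassThru
$target2 = Connect-ClrMDTarget -ProcessId 5678 -PassThru
$runtime1 = Connect-ClrMDRuntime -Target $target1 -PassThru
$runtime2 = Connect-ClrMDRuntime -Target $target2 -PassThru
Get-ClrMDThread -Runtime $runtime1
Get-ClrMDThread -Runtime $runtime2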

Included in the source repository is a drive.ps1 script which I used during development to repeatedly try different scenarios and set some default formatting for threads and heap objects. One example in this script is finding the first 20 unique string values in the process; here is an excerpt:

Import-Module -Name PSClrMD
Connect-ClrMDTarget -ProcessId $PID
Connect-ClrMDRuntime
Get-ClrMDHeapObject | 
 Where-Object { $_.IsString } | 
 Select-Object -Unique -First 20 -Property SimpleValue
Disconnect-ClrMDTarget

From this you can hopefully see how easy it can be to connect to a running process (PowerShell itself in this case) and query interesting data. It also suggests that I should combine the two Connect cmdlets into one for the common scenario demonstrated here.

Another example to be found in the drive.ps1 script is listing the top memory-consuming objects by type which, when combined with scheduled tasks and Export-Csv, could provide a simple monitoring solution.
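
That query might look roughly like this (a sketch assuming each heap object exposes Type and Size properties):

Get-ClrMDHeapObject |
 Group-Object -Property Type |
 ForEach-Object {
  # summarise each type by instance count and total bytes
  [pscustomobject] @{
   Type = $_.Name
   Count = $_.Count
   Bytes = ($_.Group | Measure-Object -Property Size -Sum).Sum
  }
 } |
 Sort-Object -Property Bytes -Descending |
 Select-Object -First 10 |
 Export-Csv -Path heap-usage.csv -NoTypeInformation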

You can download the compiled module here or feel free to get the source and submit a Pull Request for anything you add – I’ve only scratched the surface of what ClrMD exposes.

Update: there is also a scriptcs Script Pack for ClrMD if PowerShell is not your style.

PowerShell v3 for Developers

One evening in November last year, I presented to a small group of my colleagues a summary of the new features in PowerShell v3 and why developers should care about PowerShell, given that most PowerShell marketing targets the IT Pro types with their AD and Exchange management needs. I also spoke briefly about integrating PowerShell and C#.

The screencast of this presentation (~24 minutes) is available on the Readify in the Community website.

In February just past I presented a more detailed (with demos and code) and much more polished version of this presentation to the Sydney .NET User Group. Thanks to an excellent video recording and post-production setup at this user group, the full video of my presentation (~1 hour) is available to watch online on the SSW TV website.

Also, in the theme of PowerShell for Developers, Microsoft has just released the PowerShell 3 SDK Sample Pack containing many C# examples for extending PowerShell in numerous ways.