So I'm starting to go through some of the content on Microsoft Virtual Academy but I don't have time to do this when I'm by a good internet connection and at the same time don't have time when by the good internet connection to run though all the content I need to download it to watch in my spare time.
Enter PowerShell
As with any scenario like this, you turn to code. I used to create all these types of functionality in C# apps but have recently decided that it's too easy in C# and wanted to learn PowerShell better so now these are all PowerShell scripts.
The Code
There are 2 files.
Remove-InvalidFileNameChars.ps1
Function Remove-InvalidFileNameChars {
[CmdletBinding()]
param([Parameter(Mandatory=$true,
Position=0,
ValueFromPipeline=$true,
ValueFromPipelineByPropertyName=$true)]
[String]$Name
)
$result = [RegEx]::Replace($Name, "[{0}]" -f ([RegEx]::Escape
([String][System.IO.Path]::GetInvalidFileNameChars())), '')
$result = $result.Replace("Microsoft-Virtual-Academy", "")
while($result.Contains("--"))
{
$result = $result.Replace("--", "-")
}
return $result.Trim("-")
}
Download-Content.ps1
cls
Write-Output ""
Write-Output ""
Write-Output ""
Write-Output ""
Write-Output ""
Write-Output ""
Write-Output ""
Write-Output ""
Write-Output ""
Write-Output ""
Write-Output "Starting"
$certPages = $("mcsd-application-lifecycle-management",
"mcsd-windows-store-apps-certification","mcsd-web-apps-certification")
$baseSaveLocation = "Z:\Learning\Microsoft Virtual Academy"
$pathToRemoveInvalidFileNameCharsScript = "$baseSaveLocation\Remove-InvalidFileNameChars.ps1"
. "$pathToRemoveInvalidFileNameCharsScript"
foreach($certPage in $certPages)
{
$certUrl="http://www.microsoft.com/learning/en-us/$certPage.aspx"
Write-Output "Downloading Page: $certUrl"
$pageHtml=invoke-webrequest -uri $certUrl
$destination="$baseSaveLocation\$(Remove-InvalidFileNameChars
$certPage.Replace(" ","-"))"
if (!(Test-Path $destination)) {
New-Item -ItemType directory -Path $destination
}
$filteredLinks = $pageHtml.Links | Where-Object
{ $_.href.ToLower().StartsWith("http://www.microsoftvirtualacademy.com/training-courses/") }
foreach($link in $filteredLinks)
{
Write-Output "Downloading Page: $($link.href)"
$trainingPageHtml=invoke-webrequest -uri $link.href
[string]$trainingTitle = $trainingPageHtml.AllElements | Where-Object
{ $_.tagName.ToLower().StartsWith("title") } | Select-Object -ExpandProperty innerText {$_}
$trainingTitle = $trainingTitle.Trim()
$trainingDestination="$destination\$($link.innerText)-
$(Remove-InvalidFileNameChars $trainingTitle.Replace(" ","-"))"
if (!(Test-Path $trainingDestination)) {
New-Item -ItemType directory -Path $trainingDestination
}
$idEducationTypePattern = [regex]"parent='(?<idEducationType>[^']*)'"
[int]$idEducationType = $idEducationTypePattern.Match
($trainingPageHtml.Content).Groups["idEducationType"].Value
$jsonPostDataPattern = [regex]'showNextPanelEducationTypes
((?<idEducationTypeSelected>[^"]*),(?<idEducationLevelSelected>[^"]*),
(?<isApproved>[^"]*),(?<numControl>[^"]*),this)'
foreach($jsonPostData in $jsonPostDataPattern.Matches($trainingPageHtml.Content))
{
[int]$idEducationTypeSelected = $jsonPostData.Groups
["idEducationTypeSelected"].Value.TrimStart("(")
[int]$idEducationLevelSelected = $jsonPostData.Groups["idEducationLevelSelected"].Value
[int]$numControl = $jsonPostData.Groups["numControl"].Value
$isApproved = $jsonPostData.Groups["isApproved"].Value
$jsonPostUri = "http://www.microsoftvirtualacademy.com/Studies/
EducationDetails.aspx/ShowNextPanelEducationTypes"
$jsonPostBody = @{
idEducationTypeSelected = "$idEducationTypeSelected";
idEducationLevelSelected = $idEducationLevelSelected;
isSubscribe = $true;
selectedIsEnabled = "true";
numControl = $numControl;
idEducationType = "$idEducationType";
isApproved = "$isApproved";
culture="en-US"
}| ConvertTo-Json
$jsonInvokeResult = Invoke-RestMethod -Method Post -Uri $jsonPostUri
-ContentType "application/json" -Body $jsonPostBody
$ctIDPattern = [regex]'openEmbeddedVideo((?<Id>.+));'
foreach($ctID in $ctIDPattern.Matches($jsonInvokeResult.d.HtmlResponseMaterials))
{
$Id = $ctID.Groups["Id"].Value.TrimStart("(").Split(',')[0]
$uriForTrainingVideo =
"http://www.microsoftvirtualacademy.com/Content/ViewContent.aspx?
et=$idEducationType&m=$idEducationTypeSelected&ct=$Id"
Write-Output "Downloading Page: $uriForTrainingVideo"
$trainingVideoPageHtml=invoke-webrequest -uri $uriForTrainingVideo
[string]$trainingVideoTitle = $trainingVideoPageHtml.AllElements |
Where-Object { $_.tagName.ToLower().StartsWith("title") } |
Select-Object -ExpandProperty innerText {$_}
$trainingVideoTitle = $trainingVideoTitle.Trim()
[string]$videoMp4Link = $trainingVideoPageHtml.Links | Where-Object
{$_.href.EndsWith(".mp4") -and $_.outerText -eq "MP4"} |
Select-Object -ExpandProperty href {$_}
$urlFileName = $($videoMp4Link.split("/")[-1])
$videoMp4LocalFullName = "$($trainingDestination +"\" +
$(Remove-InvalidFileNameChars $trainingVideoTitle.Replace(" ","-"))).mp4"
if(!(Test-Path ($videoMp4LocalFullName)))
{
Write-Output "Downloading Video: $trainingVideoTitle to $videoMp4LocalFullName"
Start-BitsTransfer "$videoMp4Link" $videoMp4LocalFullName
}
}
}
}
}
Write-Output "Done"
Configuration
$baseSaveLocation
- Set this to the location you want to save the videos to $certPages
- This is an array of certification page names that you want to download content from, i.e.: for MCSD: Application Lifecycle Management which has a url http://www.microsoft.com/learning/en-us/mcsd-application-lifecycle-management.aspx, the page name would be mcsd-application-lifecycle-management
. Note that you will exclude the extension. $pathToRemoveInvalidFileNameCharsScript
- This can stay the same unless you have placed the first script in a different location from the base location.
Next Steps
The next steps for this script will be to try and get it to download the PowerPoint slides for each video. Currently, you need to be logged in to download slides, so this might not be ready for a while.