Aggregating iOS PowerLog data using C# – Part 1
If you haven’t already heard of Sarah Edwards’ APOLLO (Apple Pattern of Life Lazy Output’er), you should probably stop reading this and go check that out first. This article builds on Sarah’s work, specifically with PowerLogs.
One of the challenges with PowerLogs is that on a daily basis, the device dumps the contents into a compressed archive. This makes sense from Apple’s perspective but makes the logs tedious to query en masse for forensic analysis. Our task today will be to programmatically combine data into a unified dataset upon which we can then run our APOLLO queries.
For this work, I’m going to leverage my own GKZipLib, which is a custom ZIP parsing module I created after becoming frustrated with the sluggish performance of the available open source ZIP libraries (including the ones native to .NET) on archives of significant size. My research iPhone filesystem archive is only 15 GB, but with GKZipLib, parsing its 271,361 entries to locate the PowerLogs took all of 0.5 seconds on my machine’s dated hardware.
Finally, I’ll be using LINQPad, Joe Albahari’s massively useful creation that has become a critical part of my day to day work. If you do any .NET coding at all, this is a wonderful tool to have in your arsenal for everything from quick and dirty analytical tasks to developing proof of concept code that can ultimately mature into a fully fledged Windows app in future.
So let’s get started. We’ll begin with a few preparatory steps:
// Instantiate our archive
var iphoneZip = new GKZipFile(@"D:\a769****_files.zip", false);

// Pattern to find our GZipped powerlogs
Regex rgxPowerLog = new Regex(@"powerlog_[\w\W]*?\.PLSQL\.gz");

// Place to export all the things
var outputPath = Directory.CreateDirectory(@"c:\temp\plUnity\");

// Who doesn't love stats?
var filesParsed = 0;
var filesExtracted = 0;
From here, thanks to the power of LINQ and the implementation of the IEnumerable interface by GKZipFile, it’s as simple as iterating our archive like it’s a giant array with foreach.
foreach (var file in iphoneZip)
{
    // Check for a GZ powerlog archive.
    if (rgxPowerLog.IsMatch(file.ShortName))
    {
        Console.WriteLine($"Extracting {file.Name}...");
        file.ExtractToFolder(outputPath.FullName);
        filesExtracted++;
    }

    // As well as the 'CurrentPowerLog'
    // A case-insensitive substring match (IndexOf with a StringComparison)
    // ensures we snag any -shm and -wal files as well
    if (file.ShortName.IndexOf("CurrentPowerLog.PLSQL", StringComparison.CurrentCultureIgnoreCase) >= 0)
    {
        Console.WriteLine($"Extracting {file.Name}...");
        file.ExtractToFolder(outputPath.FullName);
        filesExtracted++;
    }

    // Count of files parsed
    filesParsed++;
}

Console.WriteLine($"Finished parsing {filesParsed} files.");
Console.WriteLine($"A total of {filesExtracted} files were extracted.");
Output:
Extracting /private/var/containers/Shared/SystemGroup/BCBD844C-BDB8-4D6B-8246-555182B5F39A/Library/BatteryLife/Archives/powerlog_2018-10-07_7F9FC438.PLSQL.gz...
Extracting /private/var/containers/Shared/SystemGroup/BCBD844C-BDB8-4D6B-8246-555182B5F39A/Library/BatteryLife/Archives/powerlog_2018-10-08_2162C03C.PLSQL.gz...
Extracting /private/var/containers/Shared/SystemGroup/BCBD844C-BDB8-4D6B-8246-555182B5F39A/Library/BatteryLife/Archives/powerlog_2018-10-09_0DC64180.PLSQL.gz...
Extracting /private/var/containers/Shared/SystemGroup/BCBD844C-BDB8-4D6B-8246-555182B5F39A/Library/BatteryLife/Archives/powerlog_2018-10-10_24F9BF01.PLSQL.gz...
Extracting /private/var/containers/Shared/SystemGroup/BCBD844C-BDB8-4D6B-8246-555182B5F39A/Library/BatteryLife/CurrentPowerlog.PLSQL-shm...
Extracting /private/var/containers/Shared/SystemGroup/BCBD844C-BDB8-4D6B-8246-555182B5F39A/Library/BatteryLife/CurrentPowerlog.PLSQL-wal...
Extracting /private/var/containers/Shared/SystemGroup/BCBD844C-BDB8-4D6B-8246-555182B5F39A/Library/BatteryLife/CurrentPowerlog.PLSQL...
Finished parsing 271361 files.
A total of 7 files were extracted.
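The two filename checks in the loop can also be pulled out into a single predicate, which makes them easy to try against sample names before pointing them at a full archive. This is only a minimal sketch: the PowerLogFilter class and IsPowerLogArtifact method are hypothetical names of my own and not part of GKZipLib, and the sketch uses StringComparison.OrdinalIgnoreCase (a common choice for file names) rather than the culture-sensitive comparison used in the loop.

```csharp
using System;
using System.Text.RegularExpressions;

static class PowerLogFilter
{
    // Same pattern as the loop: daily archives such as
    // powerlog_2018-10-07_7F9FC438.PLSQL.gz
    static readonly Regex ArchivePattern =
        new Regex(@"powerlog_[\w\W]*?\.PLSQL\.gz");

    // Hypothetical helper (not part of GKZipLib): returns true for an
    // archived PowerLog, or for the live database and its -shm / -wal files.
    public static bool IsPowerLogArtifact(string shortName)
    {
        if (ArchivePattern.IsMatch(shortName))
        {
            return true;
        }

        // Case-insensitive substring match, so "CurrentPowerlog.PLSQL-wal"
        // is caught even though its casing differs from the search term.
        // Assumption: ordinal comparison is swapped in for the
        // culture-sensitive one used in the loop.
        return shortName.IndexOf("CurrentPowerLog.PLSQL",
            StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

With this in place, the body of the loop could shrink to a single check on PowerLogFilter.IsPowerLogArtifact(file.ShortName) followed by the extraction call.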
The final thing we will do in part 1 is decompress our GZ files in place so that they are accessible for querying. To simplify things, I wrote a function to do this which simply removes the GZ extension to determine the output filename. This logic could certainly be flawed for a generic GZ decompression routine, but in this case we can rely on the fact that our GZipped files will have the .gz extension.
// GZipStream lives in the System.IO.Compression namespace
void GZExtract(string inputFile)
{
    // Strip only the trailing extension (.gz) to get the output filename
    var outputFile = Path.ChangeExtension(inputFile, null);

    using (var fs = new FileStream(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read))
    using (var gzstr = new GZipStream(fs, CompressionMode.Decompress))
    using (var uncompressedData = new FileStream(outputFile, FileMode.Create))
    {
        const int buffSize = 4096;
        byte[] buffer = new byte[buffSize];
        int bytesRead;

        // Copy the decompressed stream to disk one buffer at a time
        do
        {
            bytesRead = gzstr.Read(buffer, 0, buffSize);
            if (bytesRead > 0)
            {
                uncompressedData.Write(buffer, 0, bytesRead);
            }
        } while (bytesRead > 0);
    }
}
And finally, invoke the function for each of our GZ powerlogs:
// Decompress all of our gzipped archives
foreach (var arc in outputPath.GetFiles("*.gz"))
{
    GZExtract(arc.FullName);
}
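For anyone wanting to reuse the decompression step outside of this script, the caveat about generic GZ files can be handled with a slightly more defensive variant. What follows is a rough sketch rather than a drop-in replacement: GzHelper, GetOutputPath and Decompress are illustrative names, and the ".out" fallback suffix is purely an assumption of mine. It strips only a trailing .gz (in any casing), refuses to write over the input when that extension is missing, and lets Stream.CopyTo handle the buffering.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class GzHelper
{
    // Hypothetical helper: strips a trailing ".gz" (any casing). If the
    // extension is missing, appends ".out" (an arbitrary choice) so the
    // output path can never be the same as the input path.
    public static string GetOutputPath(string inputFile)
    {
        return inputFile.EndsWith(".gz", StringComparison.OrdinalIgnoreCase)
            ? inputFile.Substring(0, inputFile.Length - 3)
            : inputFile + ".out";
    }

    // Decompresses inputFile next to itself and returns the output path.
    public static string Decompress(string inputFile)
    {
        var outputFile = GetOutputPath(inputFile);

        using (var input = new FileStream(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new FileStream(outputFile, FileMode.Create))
        {
            // CopyTo reads until the end of the stream using its own buffer
            gzip.CopyTo(output);
        }

        return outputFile;
    }
}
```

The loop over outputPath.GetFiles("*.gz") would then call GzHelper.Decompress(arc.FullName) in place of GZExtract.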
That’s where we will end off today. Here’s what we’ve completed so far:
- Iterated the iOS filesystem archive, located the files of interest (CurrentPowerLog.PLSQL with its associated SQLite artifacts, plus any GZipped archives), and extracted them to the local machine.
- Decompressed all GZipped archives in place.
In part 2, we will look at several different options for amalgamating this data in preparation for running Sarah’s PowerLog scripts against the entire dataset instead of having to do this manually for each one.