Background

My mobile phone has SMS messages that go as far back as my first smart phone (around 2009). Each time that I upgraded, I found ways to transport them to my new phone.

Why? Well, why not? It’s always good to look back through my old messages and reminisce over stupid conversations.

In the case of a few threads, these include folks I’ve lost contact with, and a bunch with a friend who has passed away.

The Issue

Shortly after Christmas something happened with my current phone which caused me to have to reflash the currently running ROM. This meant that I’d lost everything.

Luckily, I have a bunch of apps (including this one) that run a bunch of backups EVERY DAY at midday. The SMS backup app that I use takes a full backup of my SMS messages, stores them in an XML file and uploads that file to my Dropbox account.

Yes, I still use SMS to communicate with some friends.

Since this happens every day, all I had to do was pull down the previous day’s message backup and restore it onto my phone.

This is where it went a little wobbly.

What Happened

For some reason during the restoration process, my phone crashed and rebooted. This happened a few times (I was doing too many restorations of different things [WhatsApp, Hangouts, etc.] at once, which caused a kernel panic).

A few days later, I noticed that the restoration process had restored around 3 to 5 copies of each of my SMS messages. This was no good, as I ended up with lots of superfluous messages to trawl through whenever I wanted to find an older one.

Since it had been a few days (a few backups had been taken, and a load of messages received), I couldn’t restore the backup from when my phone crashed because my message threads would be out of date.

Possible Solution

I COULD have manually deleted each of the repeated messages, but I did a total count of the messages and it came out at 25,000.

I think that the second word out of my mouth was “that”

I even took to Reddit to ask if anyone knew of an app that I could use to bulk remove the duplicate messages, but there were no suggestions.

Being a supremely lazy person (as, apparently, all good programmers are), I thought that there must be a better way to do it.

Program

Why not write a program to go through the latest backup file and remove all duplicate entries? It wouldn’t have to be beautiful, or have a user interface. It also wouldn’t need to have any command line interface. Just dump the XML file in the same folder as the binary, run it, and ‘let her rip’.

Half an hour later I’d written the program, tested it, generated a new XML file and replaced all the SMS messages on my phone.
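For a sense of how little code that takes, here is a minimal sketch of the idea rather than the program itself. It assumes the backup is a root `smses` element containing `sms` elements, and that two messages are duplicates when their `address`, `date` and `body` attributes all match. Those element and attribute names, and the `SmsDeduplicator` class, are assumptions made for illustration, so adjust them to whatever your backup file actually contains.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class SmsDeduplicator
{
    // Builds a new <smses> document containing one copy of each message.
    // Assumption: address + date + body together identify a message.
    public static XDocument RemoveDuplicates(XDocument backup)
    {
        var unique = backup.Root
            .Elements("sms")
            .GroupBy(sms => (
                Address: (string)sms.Attribute("address"),
                Date: (string)sms.Attribute("date"),
                Body: (string)sms.Attribute("body")))
            .Select(group => group.First())   // keep the first copy of each
            .ToList();

        // The elements still belong to the old document, so XElement
        // clones them when they are added to the new root.
        var root = new XElement("smses",
            new XAttribute("count", unique.Count),
            unique);
        return new XDocument(root);
    }
}
```

With the XML file sitting next to the binary, usage would be along the lines of `SmsDeduplicator.RemoveDuplicates(XDocument.Load("sms.xml")).Save("sms-deduplicated.xml")`, where both file names are placeholders.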

Listing

I thought I’d post it up here. It’s not that elegant, but it’s good enough to share and to use for a bit of a post mortem.

Anyway, here’s the code (hosted with my gists):

https://gist.github.com/GaProgMan/aede6a109356a7d0e69f

What I Could Have Done Better

RemoveDuplicates returns an array, which is fine, but that means that the final line of RemoveDuplicates needs to convert from an IList<Sms> (which implements IEnumerable) to an array before returning. This isn’t a huge issue, but it does increase the memory footprint of the code, because the array is allocated alongside the list it was built from.

It also means that the code which orders the de-duplicated messages has to convert the array that was returned from RemoveDuplicates into an IOrderedEnumerable (to do the ordering), then back into an array (to replace the contents of the Smses object’s Sms array).
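As a rough sketch of the shape of that round trip (this is not the gist’s code): the `Sms` and `Smses` classes below are simplified stand-ins with assumed properties, and the `RoundTrip` class and its `Deduplicate` method are hypothetical names used only to show where each conversion happens.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the gist's classes (assumed shapes).
public class Sms
{
    public string Address { get; set; }
    public long Date { get; set; }
    public string Body { get; set; }
}

public class Smses
{
    public Sms[] Sms { get; set; }
}

public static class RoundTrip
{
    // Returns an array, so the list built inside has to be copied into one.
    public static Sms[] RemoveDuplicates(Sms[] messages)
    {
        IList<Sms> unique = messages
            .GroupBy(m => (m.Address, m.Date, m.Body))
            .Select(g => g.First())
            .ToList();
        return unique.ToArray();      // list -> array (extra copy)
    }

    public static void Deduplicate(Smses backup)
    {
        Sms[] unique = RemoveDuplicates(backup.Sms);
        backup.Sms = unique
            .OrderBy(m => m.Date)     // array -> IOrderedEnumerable<Sms>
            .ToArray();               // ordered sequence -> array (another copy)
    }
}
```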

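I think of this as boxing and unboxing, although strictly speaking that term in .NET means wrapping a value type in an object; what actually happens here is that the messages are copied from one collection type to another and back again. That’s not a big problem for my particular app because I’m running on a very modern, powerful PC. But if this were being run on an embedded system, or one with low memory, then the extra copies could cause memory issues.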

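Also, there’s a chance that an exception could be thrown during any of these conversion steps: if RemoveDuplicates returns a null array, then the ordering call will throw (an ArgumentNullException, as the LINQ operators check their source) and the final ToArray will never be reached. The LINQ operators themselves don’t return null, so the value coming back from RemoveDuplicates is the one to guard.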

Plus, there would be a speed boost from skipping the extra conversions. Each ToArray call walks the whole sequence and allocates a brand new array, which is slow and, with this many messages, produces a (frankly massive) object. Again, the code was run on a modern, powerful PC, so this wasn’t a huge problem, but it’s something to consider for later revisions.
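One possible shape for such a revision, offered as a sketch under assumptions rather than as the gist’s code: keep everything as an `IEnumerable<Sms>` until the very end and call `ToArray` once. The `Sms` class is a simplified stand-in with assumed properties, and `SinglePass` and `DeduplicateAndSort` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the gist's message class (assumed shape).
public class Sms
{
    public string Address { get; set; }
    public long Date { get; set; }
    public string Body { get; set; }
}

public static class SinglePass
{
    // Returning IEnumerable<Sms> defers the work: no list or array is
    // built here, the query only runs when something enumerates it.
    public static IEnumerable<Sms> RemoveDuplicates(IEnumerable<Sms> messages)
    {
        // Treat a missing collection as empty instead of letting LINQ throw.
        return (messages ?? Enumerable.Empty<Sms>())
            .GroupBy(m => (m.Address, m.Date, m.Body))
            .Select(g => g.First());
    }

    public static Sms[] DeduplicateAndSort(IEnumerable<Sms> messages)
    {
        return RemoveDuplicates(messages)
            .OrderBy(m => m.Date)
            .ToArray();   // the only point where the result becomes an array
    }
}
```

GroupBy and OrderBy still buffer their input internally, so this removes the intermediate list and array copies rather than every allocation.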

There are other things that I could have done better, but I’m pretty happy with it as it stands.

My Own Columbo Moment

One last thing:

  • Before I ran my SMS XML file through this console app, it was 8 MB and had over 25,000 messages.
  • Afterwards, the generated SMS XML file was around 2 MB and had just under 10,000 messages.

I call that a win.

Jamie is a .NET developer specialising in ASP.NET MVC websites and services, with a background in WinForms and Games Development. When not programming using .NET, he is either learning about .NET Core (and usually building something cross platform with it), speaking Japanese to anyone who'll listen, learning about languages, writing for this blog, or writing for a blog about Retro Gaming (which he runs with his brother).