My mobile phone has SMS messages that go as far back as my first smart phone (around 2009). Each time that I upgraded, I found ways to transport them to my new phone.
Why? Well, why not? It’s always good to look back through my old messages and reminisce over stupid conversations.
In the case of a few threads, these include folks I’ve lost contact with, and a bunch with a friend who has passed away.
Luckily, I have a bunch of apps (including this one) that run a bunch of backups EVERY DAY at midday. The SMS backup app that I use takes a full backup of my SMS messages (yes, I still use SMS to communicate with some friends), stores them in an XML file and uploads that file to my Dropbox account.
Since this happens every day, all I had to do was pull down the previous day’s message backup and restore it onto my phone.
This is where it went a little wobbly.
For some reason during the restoration process, my phone crashed and rebooted. This happened a few times (I was doing too many restorations of different things [WhatsApp, Hangouts, etc.] and caused a kernel panic).
A few days later, I noticed that the restoration process had restored around 3-5 copies of each of my SMS messages. This was no good, as I ended up having lots of superfluous messages to trawl through to find an older one.
Since it had been a few days (a few backups had been taken, and a load of messages received), I couldn’t restore the backup from when my phone crashed because my message threads would be out of date.
I COULD have manually deleted each of the repeated messages, but I did a total count of the messages and it came out at 25,000.
I think that the second word out of my mouth was “that”.
I even took to Reddit to ask if anyone knew of an app that I could use to bulk remove the duplicate messages, but there were no suggestions.
Being a supremely lazy person (as, apparently, all good programmers are), I thought that there must be a better way to do it.
Why not write a program to go through the latest backup file and remove all duplicate entries? It wouldn’t have to be beautiful, or have a user interface. It also wouldn’t need to have any command line interface. Just dump the XML file in the same folder as the binary, run it, and ‘let her rip’.
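To give a rough idea of the approach, here is a minimal sketch. It is not the gist itself: it assumes that each message in the backup is an `<sms>` element directly under the root, and that two messages count as duplicates when their `address`, `date` and `body` attributes all match. The `body` attribute and the `SmsDeduplicator` name are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SmsDeduplicator
{
    // Assumption: every message is an <sms> element directly under the root,
    // and a duplicate is any later message with the same address, date and body.
    public static XDocument RemoveDuplicateMessages(XDocument backup)
    {
        var seen = new HashSet<(string Address, string Date, string Body)>();
        var duplicates = new List<XElement>();

        foreach (var sms in backup.Root.Elements("sms"))
        {
            // Casting a missing attribute to string gives null rather than throwing.
            var key = ((string)sms.Attribute("address"),
                       (string)sms.Attribute("date"),
                       (string)sms.Attribute("body"));

            // HashSet.Add returns false when the key has been seen before.
            if (!seen.Add(key))
            {
                duplicates.Add(sms);
            }
        }

        // Remove only after enumerating, so the tree isn't modified mid-loop.
        foreach (var duplicate in duplicates)
        {
            duplicate.Remove();
        }

        return backup;
    }
}
```

Running it would then be a case of loading the file with `XDocument.Load`, passing it through `RemoveDuplicateMessages` and calling `Save` on the result, with whatever file names the backup app happens to use.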
Half an hour later I’d written the program, tested it, generated an XML and replaced all the SMSs on my phone.
I thought I’d post it up here. It’s not that elegant, but it’s good enough to share and for a bit of a post mortem.
Anyway, here’s the code (hosted with my gists):
What I Could Have Done Better
RemoveDuplicates returns an array, which is fine, but that means that the final line of RemoveDuplicates needs to convert from an IList<Sms> (which implements IEnumerable) to an array before returning. This isn’t a huge issue, but it does increase the memory footprint of the code.
It also means that the following code:
    var nonDuplicated =
        RemoveDuplicates(smses.Sms.GroupBy(sm => sm.address))
            .OrderBy(mes => mes.date).ToArray();
    // lines 28 - 31
has to take the array that was returned from RemoveDuplicates, turn it into an IOrderedEnumerable (to do the ordering), then copy it back into an array (to replace the contents of the Smses object’s Sms array).
This is a lot of converting back and forth between collection types, which is not that big a problem for my particular app because I’m running on a very modern, powerful PC. But if this was being run on an embedded system or one with low memory, then the extra copies could cause memory issues.
Also, there’s a chance that a null-related exception could be thrown during any of these steps (if smses.Sms is null, then the GroupBy will break; or if RemoveDuplicates returns a null array, then the OrderBy will break, since LINQ throws an ArgumentNullException when handed a null source).
Plus, there would be a speed boost from not converting back and forth. Each ToArray call copies every element into a brand new array, which is slow and produces a (frankly massive) object. Again, the code was run on a modern, powerful PC, so this wasn’t a huge problem; but it’s something to consider for later revisions.
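One way a later revision could cut down on the conversions is sketched below. This is an illustration under assumptions rather than the code from the gist: the `Sms` class is a stand-in with only the `address` and `date` members mentioned above plus an assumed `body`, the `date` type is guessed to be `long`, and `SmsCleaner` and `Clean` are hypothetical names. The idea is to have RemoveDuplicates return a lazy `IEnumerable<Sms>` so that only one array is built, right at the end.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the real class; body and the long date type are assumptions.
public class Sms
{
    public string address { get; set; }
    public long date { get; set; }
    public string body { get; set; }
}

public static class SmsCleaner
{
    // Yields the first message for each distinct (date, body) pair within an
    // address group. No intermediate array is built inside this method.
    public static IEnumerable<Sms> RemoveDuplicates(
        IEnumerable<IGrouping<string, Sms>> groups)
    {
        foreach (var group in groups)
        {
            var seen = new HashSet<(long, string)>();
            foreach (var sms in group)
            {
                if (seen.Add((sms.date, sms.body)))
                {
                    yield return sms;
                }
            }
        }
    }

    public static Sms[] Clean(IEnumerable<Sms> messages)
    {
        // Guard once up front rather than letting LINQ throw on a null source.
        if (messages == null) return new Sms[0];

        // ToArray is called once, at the very end. OrderBy still buffers
        // internally in order to sort, so this is not allocation-free.
        return RemoveDuplicates(messages.GroupBy(sm => sm.address))
            .OrderBy(mes => mes.date)
            .ToArray();
    }
}
```

The final array is still needed if the result has to be assigned back to an array property, but the extra array that RemoveDuplicates used to build on its way out is gone.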
There are other things that I could have done better, but I’m pretty happy with it as it stands.
My Own Columbo Moment
One last thing:
- Before I ran my SMS XML through this console app, it was 8 MB and had over 25,000 messages.
- Afterwards, the generated SMS XML was around 2 MB and had just under 10,000 messages.
I call that a win.