I created a quick and dirty Python script to automate the extraction of media files from a Skype exported archive.
Skype lets you download your entire conversation history along with the files that you shared or that were shared with you. Detailed instructions for exporting your data and extracting the .tar file are available on this support page. Alternatively, you can go directly to the Skype export page and submit a request there.
The message archive is in JSON format, and Skype provides a parser that displays the messages in a more readable form. I still have a few problems with the archive:
- The message viewer does not display media files. It shows media (pictures, videos, etc.) as XML-like markup that is not very user-friendly. You still have to find the actual files in the media folder included in the exported archive.
- These media files are renamed with a random string of characters and the only way to link them to the conversation is by looking at the XML-like code in the message content.
- Worse still, some of the files do not retain their original extension. For example, all the video files (which should have the extension mp4) are renamed with the extension .1. There’s also another file with the same filename and the extension .2, which is the thumbnail picture of the video.
- The file creation/modification times of these files are not set correctly, so there’s no way to sort the media files by the date and time they were shared.
- There’s no easy way to sort these files by the person who shared them.
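The lost extensions are usually recoverable, because most media formats begin with a recognizable byte signature. The following is a minimal sketch of that idea, not the script’s actual code: `guess_extension` and the `SIGNATURES` table are hypothetical names for illustration, and the table covers only a handful of formats.

```python
# Leading-byte signatures for a few common image formats (not exhaustive).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
]


def guess_extension(path):
    """Return a likely extension for the file at path, or None if unrecognized."""
    with open(path, "rb") as f:
        header = f.read(12)
    for magic, ext in SIGNATURES:
        if header.startswith(magic):
            return ext
    # ISO media files carry "ftyp" at byte offset 4. This also matches
    # related containers such as .mov or .m4a, so ".mp4" is only a guess.
    if header[4:8] == b"ftyp":
        return ".mp4"
    return None
```

A file that matches none of the signatures returns `None`, in which case keeping the existing name is the safer choice.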
My main purpose for downloading the archive was to save all the media files that have been shared with me. Finding all the media files and renaming them by hand would be tedious, since I have more than 3GB of data. So I created a Python script to automate this task.
The Python script provided via the download link near the end of this article will do the following:
- Save all pictures, videos and other media files from the original media folder in the archive into a new exported folder.
- Create subfolders within the exported folder for each Skype conversation partner, and save the media files belonging to that partner inside.
- Rename the media files to the original name of the file if possible.
- E.g. instead of 0-wae-d6-039fbbc6d5cac93b09467d30b8c0086e.1.PNG, it will rename it to mypicture.PNG if the original filename shared was mypicture.PNG.
- Sometimes the original filename is not available. In this case, the script will save it with the existing filename and change the extension according to the file type.
- Change each media file’s modification date to the timestamp from the conversation. This way you can sort the files by the date and time they were shared.
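The last step in the list can be sketched with `os.utime`. This is an illustration under assumptions rather than the script’s own code: `set_shared_time` is a hypothetical helper, the timestamp is assumed to be an ISO 8601 string, and the standard library’s `datetime.fromisoformat` stands in for `dateutil.parser` so the sketch has no third-party dependency.

```python
import os
from datetime import datetime


def set_shared_time(path, timestamp):
    """Set the access and modification times of path from an ISO 8601 string.

    Returns the time that was applied, as seconds since the epoch.
    """
    # Older Python versions reject a trailing "Z" in fromisoformat(),
    # so rewrite it as an explicit UTC offset first.
    dt = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    # A timestamp without an offset is interpreted as local time here.
    epoch = dt.timestamp()
    os.utime(path, (epoch, epoch))
    return epoch
```

Calling `set_shared_time("mypicture.PNG", "2019-03-05T10:15:30Z")` would make the file sort as if it had been last modified at that moment. The file creation time is not changed by `os.utime`.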
Some of the files referenced in the conversation are not included in the media folder. These files are no longer available because Skype has deleted them from its servers, which is why they are missing from your exported archive. See this Skype support page about file sharing limits.
How to use this Python script?
- You need Python 3. Visit www.python.org to download and install it if you don’t already have it.
- The script uses the dateutil.parser module to parse dates and times. Install it with pip install python-dateutil if you don’t already have it.
- Put the Python script file (skype-media-saver.py) in the same folder as the file messages.json from the archive you downloaded from Skype. Make sure the accompanying media folder is also there.
- Run the script as you would any other Python script.
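If you want to confirm the folder layout before running anything, a small check along these lines may help. It is only a sketch: `missing_inputs` is a hypothetical helper that is not part of the script, and it assumes the media folder is literally named media.

```python
from pathlib import Path


def missing_inputs(folder):
    """Return the names of the expected archive items not found in folder."""
    folder = Path(folder)
    missing = []
    if not (folder / "messages.json").is_file():
        missing.append("messages.json")
    if not (folder / "media").is_dir():
        missing.append("media")
    return missing


if __name__ == "__main__":
    # Check the current working directory, where the script would be run.
    problems = missing_inputs(".")
    if problems:
        print("Missing from this folder:", ", ".join(problems))
    else:
        print("messages.json and the media folder are both present.")
```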
Note: The script prints informative messages during execution when it encounters files with the same name. It tells you whether it is skipping the file because the existing file is identical, or saving it as a new version because it differs from the one that already exists.
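The skip-or-save-as-new-version behaviour can be illustrated with `filecmp` from the standard library. This is a sketch of the general technique, not the script’s implementation: `save_without_overwrite` and the " (2)" naming scheme for new versions are assumptions made for the example.

```python
import filecmp
import shutil
from pathlib import Path


def save_without_overwrite(src, dest_dir, name):
    """Copy src into dest_dir as name without clobbering a different file.

    Returns the path written, or None if an identical copy already exists.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / name
    stem, suffix = target.stem, target.suffix
    version = 1
    while target.exists():
        # shallow=False compares file contents, not just size and mtime.
        if filecmp.cmp(src, target, shallow=False):
            print(f"Skipping {name}: identical to existing {target.name}")
            return None
        version += 1
        target = dest_dir / f"{stem} ({version}){suffix}"
    if version > 1:
        print(f"Saving {name} as new version {target.name}")
    # copy2 also copies file metadata such as the modification time.
    shutil.copy2(src, target)
    return target
```

Comparing contents rather than names means the same picture shared twice is stored once, while two different files that happen to share a name are both kept.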
License: GNU GPLv3
Although this script works pretty well for me, it might not work in every situation. For example, if your Skype conversation includes a file format or content type that the script does not cover, it may not be able to save those files. I hope this script is useful to others. Let me know in the comments if you have any suggestions to improve it.