Dump WeChat Messages from Android

Export WeChat message history from Android

WeChat, as the most popular mobile IM app in China, doesn't provide any methods to export structured message history.

We reverse-engineered the storage protocol of WeChat messages and provide this tool to decrypt and parse WeChat messages on a rooted Android phone. It can also render the messages into self-contained HTML files that include voice messages, images, emojis, videos, etc.

The tool was last verified to work with the latest version of WeChat on 2025/01/01. If the tool works for you, please take a moment to add your phone/OS to the wiki.

How to use:

Dependencies:

  • adb and a rooted Android phone connected to a Linux, macOS, or Win10+Bash machine.
  • Python >= 3.8
  • sqlcipher >= 4.1
  • sox (command line tools)
  • Silk audio decoder (included; build it with ./third-party/compile_silk.sh)
  • Other python dependencies: pip install -r requirements.txt.
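A quick pre-flight check for the list above can be sketched in Python. This is a minimal sketch, not part of the tool: the executable names `adb`, `sqlcipher`, and `sox` are assumptions about how the dependencies are installed on your system, and the helper names are hypothetical. It only checks that the tools are present, not their versions (for example sqlcipher >= 4.1).

```python
import shutil
import sys

# ASSUMPTION: these are the executable names on your system.
REQUIRED_TOOLS = ["adb", "sqlcipher", "sox"]
MIN_PYTHON = (3, 8)


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(version=None):
    """Check an interpreter version against the Python >= 3.8 requirement."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    print("Python >= 3.8:", python_ok())
    print("Missing tools:", ", ".join(missing_tools()) or "none")
```

A tool reported as missing may simply be installed under another name, so treat the output as a hint rather than a verdict.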

Get Necessary Data:

  1. Pull the database file and (for older WeChat versions) the avatar index:
    • Automatic: ./android-interact.sh db. It may use an incorrect userid.
    • Manual:
      • Figure out your ${userid} by inspecting the contents of /data/data/com.tencent.mm/MicroMsg on the root filesystem of the device. It should be a 32-character name consisting of hexadecimal digits.
      • Get /data/data/com.tencent.mm/MicroMsg/${userid}/EnMicroMsg.db from the device.
  2. Decrypt database file:

    • Automatic: ./decrypt-db.py decrypt --input EnMicroMsg.db
    • Manual:
      • Get WeChat uin (an integer). Possible ways:
        • ./decrypt-db.py uin, which looks for uin in /data/data/com.tencent.mm/shared_prefs/
        • Log in to web WeChat and get wxuin=1234567 from document.cookie
      • Get your device id (a positive integer). Possible ways:
        • ./decrypt-db.py imei implements some ways to find the device id.
        • Dial *#06# on your phone
        • Find the IMEI in system settings
      • Decrypt the database with the combination of uin and device id:

        ./decrypt-db.py decrypt --input EnMicroMsg.db --imei <device id> --uin <uin>
      

    NOTE: You may need to try different ways of getting the device id until you find one that decrypts the database. Some phones have multiple IMEIs; you may need to try them all. See #33. The command writes the decrypted database to EnMicroMsg.db.decrypted.
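For the manual route of pulling the database, the ${userid} directory can be picked out by the shape described above: a 32-character name made of hexadecimal digits. The following is a minimal sketch; `find_userids` is a hypothetical helper, and the directory listing in the example is made up for illustration.

```python
import re

# The userid directory name is described as 32 hexadecimal characters.
USERID_RE = re.compile(r"[0-9a-fA-F]{32}")


def find_userids(names):
    """Return the entries of `names` that look like a userid directory."""
    return [name for name in names if USERID_RE.fullmatch(name)]


if __name__ == "__main__":
    # Hypothetical listing of /data/data/com.tencent.mm/MicroMsg
    listing = ["0123456789abcdef0123456789abcdef", "crash", "logs"]
    print(find_userids(listing))
```

If more than one name matches, the right directory is not obvious from the name alone, which may be why the automatic script can pick an incorrect userid; in that case try each candidate.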

If the above decryption doesn't work, you can also try the password cracker to brute-force the key. The encryption key is not very strong.
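To illustrate why several device ids may be worth trying and why brute force is practical, here is a sketch under a loudly stated assumption: that the key is the first 7 hexadecimal characters of MD5(device id + uin), a scheme commonly reported for EnMicroMsg.db. This README does not spell out the derivation, so treat ./decrypt-db.py as the authority; the function names below are hypothetical.

```python
import hashlib

# ASSUMPTION: the key is the first 7 hex characters of an MD5 digest.
KEY_LENGTH = 7


def derive_key(device_id: str, uin: str) -> str:
    """ASSUMPTION: key = MD5(device_id + uin), truncated to KEY_LENGTH chars."""
    digest = hashlib.md5((device_id + uin).encode("utf-8")).hexdigest()
    return digest[:KEY_LENGTH]


def candidate_keys(device_ids, uin):
    """Map every candidate device id to the key it would yield for this uin."""
    return {dev: derive_key(dev, str(uin)) for dev in device_ids}


def keyspace_size(length: int = KEY_LENGTH) -> int:
    """Number of distinct keys made of `length` hexadecimal characters."""
    return 16 ** length


if __name__ == "__main__":
    # Placeholder identifiers, not real ones.
    keys = candidate_keys(["123456789012345", "543210987654321"], 1234567)
    for dev, key in keys.items():
        print(dev, "->", key)
    print("keys to try in the worst case:", keyspace_size())
```

Under that assumption the whole key space is 16^7 = 268,435,456 candidates, which is small enough to enumerate and consistent with the remark that the key is not very strong.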

  3. Copy the WeChat user resource directory /data/data/com.tencent.mm/MicroMsg/${userid}/{avatar,emoji,image2,sfs,video,voice2} from the phone to the resource directory:

    • ./android-interact.sh res
    • Change RES_DIR in the script if these directories are located elsewhere on your phone. For older versions of WeChat, the directory may be /mnt/sdcard/tencent/MicroMsg/
    • This can take a while. It can be faster to first archive the directories with tar, with or without compression, and then copy the archive. busybox tar is recommended, as the Android system's tar may choke on long paths.
    • In the end, we need a resource directory with the following subdirectories: avatar, emoji, image2, sfs, video, voice2.
  4. (Optional) Download the emoji cache from here and decompress it under wechat-dump. This will avoid downloading too many emojis during rendering.

    wget -c https://github.com/ppwwyyxx/wechat-dump/releases/download/0.1/emoji.cache.tar.bz2
    tar xf emoji.cache.tar.bz2
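Before rendering, it can help to confirm that the copied resource directory contains every subdirectory named above. This is a minimal sketch; `missing_subdirs` is a hypothetical helper, and the default directory name `resource` is an assumption, so pass your own path if it differs.

```python
import sys
from pathlib import Path

# Subdirectories the resource directory is expected to contain.
REQUIRED_SUBDIRS = ("avatar", "emoji", "image2", "sfs", "video", "voice2")


def missing_subdirs(res_dir):
    """Return the required subdirectories that do not exist under `res_dir`."""
    root = Path(res_dir)
    return [name for name in REQUIRED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # ASSUMPTION: the resource directory is ./resource unless a path is given.
    res_dir = sys.argv[1] if len(sys.argv) > 1 else "resource"
    missing = missing_subdirs(res_dir)
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only means the directories exist; it says nothing about whether the copy of their contents completed.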
    

Run:

  • Parse and dump text messages of every chat (requires decrypted database):

    ./dump-msg.py decrypted.db output_dir
    
  • List all chats (requires decrypted database):

    ./list-chats.py decrypted.db
    
  • Generate statistics report on text messages (requires output_dir from ./dump-msg.py):

    ./count-message.sh output_dir
    
  • Dump messages of one contact to HTML, containing voice messages, emojis, and images (requires decrypted database and resource):

    ./dump-html.py "<contact_display_name>"
    

    The output file is output.html.

    Check ./dump-html.py -h to use different paths.
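The commands above can be chained from Python when you want to run them in one go. This is only a sketch: `build_commands` and `run_all` are hypothetical helpers, the arguments mirror the invocations shown above and nothing more, and the working directory is assumed to be the repository root.

```python
import subprocess


def build_commands(db, out_dir, contact):
    """Return the argv lists for the commands listed above, in order."""
    return [
        ["./list-chats.py", db],
        ["./dump-msg.py", db, out_dir],
        ["./count-message.sh", out_dir],
        ["./dump-html.py", contact],
    ]


def run_all(db, out_dir, contact):
    """Run each command in turn and stop at the first failure."""
    for cmd in build_commands(db, out_dir, contact):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)


# Example call (not executed here):
# run_all("decrypted.db", "output_dir", "<contact_display_name>")
```

Passing argv lists instead of a shell string keeps contact names with spaces or quotes intact.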

Examples:

Screenshots of generated HTML:

[screenshot: byvoid]

See here for an example HTML file.

TODO List (help needed!)

  • IMPORTANT: Some emojis and chat images are stored in a proprietary "wxgf" format. We don't yet know how to decode this format.
  • Fix rare unhandled message types: > 10000 and < 0
  • Better user experience; see grep 'TODO' wechat -R
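For anyone picking up the message-type item, a first step could be to count how many messages in a decrypted database fall into those ranges. The sketch below rests on assumptions that are not stated in this README: that the decrypted file opens as a plain SQLite database, and that it has a table named `message` with an integer `type` column. `count_unhandled_types` is a hypothetical helper.

```python
import sqlite3


def count_unhandled_types(conn):
    """Count messages per type for types > 10000 or < 0.

    ASSUMPTION: a table named `message` with an integer `type` column.
    """
    rows = conn.execute(
        "SELECT type, COUNT(*) FROM message "
        "WHERE type > 10000 OR type < 0 "
        "GROUP BY type ORDER BY type"
    )
    return dict(rows.fetchall())


# Example call (not executed here):
# conn = sqlite3.connect("decrypted.db")
# print(count_unhandled_types(conn))
```

The result maps each out-of-range type value to its message count, which gives a rough idea of which types are worth handling first.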

Donate!

[paypal]