[Solved] Help needed with extracting the audio from an iOS app

quandoquando@slrpnk.net · edit-2 1 year ago

HiDefMusic@feddit.uk · 1 year ago

If the content is being downloaded then you could do the following:

1. Set up a proxy like Fiddler2 on a PC on the same network your phone is connected to. You’ll need to configure Fiddler2 to decrypt HTTPS, then open the proxy certificate on your iPhone and add it as a trusted certificate. You’ll need to Google how to do this with Fiddler2, but it’s not hard.
2. Configure your iPhone to connect through the Fiddler2 proxy by modifying your WiFi settings. If your PC is allowing connections to the Fiddler2 proxy port (e.g. 8888) and you’ve trusted the cert on your iPhone, then connecting to websites in Safari should work.
3. Open the naturespace app on your iPhone and delete the sounds you’re interested in. ONLY do this if you’re sure you can still re-download them. If not, this whole approach won’t work, so just stop here.
4. Download the sounds again.

You should then be able to see the web requests that were made to download the sounds in Fiddler2 on your PC. As long as there’s no fancy encryption, you should be able to just save the content out of Fiddler2.

quandoquando@slrpnk.net · 1 year ago

Ah, thank you! See, it didn’t even occur to me to just intercept the audio in transit. That’s really helpful, I will try this, thank you very much :)

HiDefMusic@feddit.uk · 1 year ago

So I’ve reverse-engineered the naturespace Android APK and the files it downloads are definitely encrypted. They’re zip files (named as .nzp) that are XOR obfuscated with a rotating key every X amount of bytes. I haven’t quite worked out how the key rotates itself but I’m close. If I get it working I’ll put the details here and I can give you a Python script to grab whatever sounds you need.
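
For anyone wondering what "XOR obfuscated" means in practice, here is a minimal sketch of XOR with a repeating key. The key below is made up for illustration and the app's actual rotation scheme is not modelled here; the point is only that applying the same XOR twice gives back the original bytes, which is why one routine can both scramble and unscramble.

```python
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hypothetical key, NOT the one the app uses.
demo_key = b"not-the-real-key"

plain = b"PK\x03\x04 pretend zip payload"
scrambled = xor_with_key(plain, demo_key)
restored = xor_with_key(scrambled, demo_key)

print(scrambled != plain)  # the scrambled bytes no longer look like a zip
print(restored == plain)   # XORing again with the same key undoes it
```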

quandoquando@slrpnk.net · edit-2 1 year ago

Ha, I was just writing an update when your comment came.

I followed your advice and installed mitmproxy (basically Fiddler2 but open source), which was easy enough, and managed to find that the app just sends GET requests to the homepage, which return a 302 Moved Temporarily that redirects to a public S3 folder.

The GET request includes some “ID”, which I’m not sure I should post publicly, maybe it might identify me? It’s like:

GET http://www.naturespace.com/ns5ios/?command=download&path=%2Fmedia%2Fmodules%2Fcom.HolographicAudioTheater.Naturespace.Aegir&lang=en&id=REDACTED&bvrs=5.15&sysv=16.5&model=iPhone&bid=com.HolographicAudioTheater.Naturespace&sys=iOS&loc=en_DE HTTP/1.1
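
A small sketch of how that request can be taken apart with Python's standard library, in case it helps to see what is being sent. It only splits and URL-decodes the query string from the request above; what each parameter means to the server is a guess, apart from path, which visibly names the module being downloaded.

```python
from urllib.parse import urlsplit, parse_qs

url = (
    "http://www.naturespace.com/ns5ios/?command=download"
    "&path=%2Fmedia%2Fmodules%2Fcom.HolographicAudioTheater.Naturespace.Aegir"
    "&lang=en&id=REDACTED&bvrs=5.15&sysv=16.5&model=iPhone"
    "&bid=com.HolographicAudioTheater.Naturespace&sys=iOS&loc=en_DE"
)

# parse_qs returns a list per parameter; each one appears once here.
params = {name: values[0] for name, values in parse_qs(urlsplit(url).query).items()}

module_path = params["path"]                # %2F is decoded to "/"
pack_name = module_path.rsplit("/", 1)[-1]  # last path component

print(module_path)
print(pack_name)
```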

But yes, it seems the files are encrypted. I couldn’t find anything to open them, and no file identifier knows what it is. If you manage to get somewhere, that’d be awesome, my tech knowledge definitely ends here lol.
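
As a rough illustration of why file identifiers come up empty: they mostly look at the first few bytes of a file for a known signature, and obfuscated data has none. The sketch below checks a handful of well-known signatures; the table is deliberately tiny and the sniff helper is a made-up name, not part of any tool mentioned in this thread.

```python
# A few well-known file signatures ("magic bytes").
SIGNATURES = {
    b"PK\x03\x04": "zip archive",
    b"OggS": "Ogg container",
    b"RIFF": "RIFF container (e.g. WAV)",
    b"ID3": "MP3 with ID3 tag",
}


def sniff(data: bytes) -> str:
    """Return a label if data starts with a known signature, else 'unknown'."""
    for magic, label in SIGNATURES.items():
        if data.startswith(magic):
            return label
    return "unknown"


print(sniff(b"PK\x03\x04..."))     # a real zip would start like this
print(sniff(b"\x9a\x11\x5c\x07"))  # scrambled bytes match nothing
```

Once a file has been decoded correctly, the same check should start returning something sensible, which makes it a handy way to tell whether a decoding attempt worked.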

I guess it’s not actually illegal to post this, since it really is just a public folder, so if anyone else wants to look at it, here’s a file.

HiDefMusic@feddit.uk · edit-2 1 year ago

Interesting, looks like they might be using a completely different file format for iOS versus Android. In any case, I’ve knocked up a script which will extract the track.ogg file from any pack of your choosing. Pasting directly here to see if it works (haven’t tried sharing code on Lemmy).

You can browse available packs using the below URL. If you want to find out a pack name, just copy the banner image URL for it and you’ll see the “com.whatever” name in the URL itself.

http://www.naturespace.com/android/v3/store/?live=true&udid=0

Code:

import sys
import requests
import hashlib
import io
import zipfile

ns_baseurl = "https://s3.amazonaws.com/naturespace/kindle_catalog/"

# Encryption key
key = b'DE2#We@(# sffsFSHfssfhsSFHSs_+&Gaa s,W.Z./lSFGSDF! NOWG!fjasdflasdkfjSADFKJASdflskgj fdkaG8HS42dncuFFSe=-56a'

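def decryptNS(content):
	# The first 1025 bytes are skipped, as is every byte whose offset is a
	# multiple of 1024; everything else is XORed with the repeating key.
	dec = bytearray()
	y = 0  # position in the key stream, advanced only for decoded bytes
	for i in range(1025, len(content)):
		if i % 1024 != 0:
			dec.append(content[i] ^ key[y % len(key)])
			y += 1

	return dec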

if __name__ == '__main__':
	if len(sys.argv) < 2:
		print("Please provide a pack/module name (e.g. 'com.HolographicAudioTheater.Naturespace.TheImaginarium')")
		sys.exit(1)

	pack = sys.argv[1]
	json_url = ns_baseurl + pack + "/data.json"
	size = requests.get(json_url).json()["packageSize"]
	print(size)

	# str() keeps this working whether the JSON gives packageSize as a string or a number
	hashval = hashlib.sha1((pack + "8DvJx25sXwNwq2p" + str(size)).encode()).hexdigest()
	dlurl = ns_baseurl + pack + "/" + hashval + "/" + pack + ".nzp"
	print(dlurl)

	content = decryptNS(requests.get(dlurl).content)
	# Uncomment to keep the decrypted zip on disk:
	# with open(pack + ".zip", "wb") as f:
	# 	f.write(content)
	zipf = io.BytesIO(content)
	archive = zipfile.ZipFile(zipf, 'r')  # separate name so the zipfile module isn't overwritten
	track_nsp = archive.read('track.nsp')

	track_ogg = decryptNS(track_nsp)
	with open(pack + "_track.ogg", "wb") as f:
		f.write(track_ogg)
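
If you want to sanity-check the decoding logic without downloading anything, here is a rough round-trip sketch. deobfuscate follows the same skipping rule as decryptNS above but takes the key as a parameter; obfuscate is a hypothetical inverse written purely for testing, and the zero bytes it writes into the skipped positions are placeholders, since what the real files keep there is unknown. The key is also made up.

```python
def deobfuscate(content: bytes, key: bytes) -> bytearray:
    """Same rule as decryptNS: skip the first 1025 bytes and every offset
    that is a multiple of 1024, XOR the rest with the repeating key."""
    out = bytearray()
    y = 0
    for i in range(1025, len(content)):
        if i % 1024 != 0:
            out.append(content[i] ^ key[y % len(key)])
            y += 1
    return out


def obfuscate(plain: bytes, key: bytes) -> bytearray:
    """Hypothetical inverse, for testing only."""
    out = bytearray(1025)  # placeholder header the decoder never reads
    for y, b in enumerate(plain):
        if len(out) % 1024 == 0:
            out.append(0)  # placeholder at an offset the decoder skips
        out.append(b ^ key[y % len(key)])
    return out


demo_key = b"example-key"         # hypothetical, not the real key
payload = bytes(range(256)) * 20  # 5120 bytes, crosses several 1024 boundaries

assert deobfuscate(obfuscate(payload, demo_key), demo_key) == payload
print("round trip ok")
```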

quandoquando@slrpnk.net · 1 year ago

:O Wow.

That is just simply amazing. Can confirm it works.

The file names are always the same, prefix and the sounds name, e.g. if the sound is called “The Electric Forest” it’s com.HolographicAudioTheater.Naturespace.TheElectricForest.
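
A minimal sketch of that naming rule, assuming it really is just the prefix plus the title with the spaces removed. Titles with punctuation or other special characters are untested, and pack_name is only an illustrative helper name.

```python
PREFIX = "com.HolographicAudioTheater.Naturespace."


def pack_name(title: str) -> str:
    """Build a pack name from a sound title by dropping all whitespace."""
    return PREFIX + "".join(title.split())


print(pack_name("The Electric Forest"))
```

The result can be passed straight to the script above as its command-line argument.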

I guess I could write a scraper and parse the whole collection.

Well, thank you kind stranger on the internet.

sharpiemarker@feddit.de · 1 year ago

Damn, that’s genuinely impressive. Well done!