#+title: how i pirate audiobooks
#+date: 2024-11-02

* why

don't want to pay for Audible dot com, or any similar service. reading on a cheap e-ink tablet is probably better health-wise, but i like to multitask. pirated audiobooks are the way to go, and as we'll soon see, they're remarkably easy to obtain :3

* how

*we do not hack Amazon*. somebody already did the piracy part, we're basically just piggybacking off of their work. Everything I'm about to show is generalizable to lots of audiobooks, and to see which exist you'll just have to find them yourself.

1. Search

there are a myriad of sites which host ripped Audible mp3s. they come and go, and their names are not important. by the time you're reading this, the one i select may have already been supplanted. but /it doesn't matter, because once you have it you have it/.

- Go to your preferred illicit search engine - i use my own searxng because it exists, and that's public if you want - but yandex works just as well or better.

- the book I'll be downloading for this exercise is the excellent novelization of _Star Wars - Revenge of the Sith_ written by Matthew Stover. So I'll just put =revenge of the sith audiobook matthew stover= in the search bar.

- as of writing, the first result is Audible, the second is Amazon, and the third is Apple Books. the fourth is a domain called "staraudiobook.net" which has embedded audio players. a few seconds' listen to the first track confirms it's in English and the correct property. watch out! especially if you found one of these sites thru yandex, they may be Russian-language. which is great if you want to learn Russian by listening to audiobooks, not so much for an english-main speaker to casually listen.

2. Scrape

open your shell. I'm using zsh, but it should be identical in any posix-superset shell, and trivially translatable to any other environment. a quick =curl -s URL | grep mp3= is a pretty simple start and pretty much shows each file we're going to privateer:

#+begin_example

<audio class="wp-audio-shortcode" id="audio-63-1" preload="none" style="width: 100%;" controls="controls"><source type="audio/mpeg" src="https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/0.mp3?_=1" /><a href="https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/0.mp3">https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/0.mp3</a></audio>

<audio class="wp-audio-shortcode" id="audio-63-2" preload="none" style="width: 100%;" controls="controls"><source type="audio/mpeg" src="https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/1.mp3?_=2" /><a href="https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/1.mp3">https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/1.mp3</a></audio>

#+end_example

(this is but a subset, to show the pattern).

from here, it's a regex game. Using =grep='s -E (extended regexp) and -o (only the matching parts of each line), we can devise something like

#+begin_src bash

curl -s 'https://staraudiobook.net/revenge-of-the-sith-audiobook/' | grep -Eo '>https.*mp3<'

#+end_src

#+RESULTS:
: >https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/0.mp3<
: >https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/1.mp3<
: >https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/2.mp3<
: >https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/3.mp3<
: >https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/4.mp3<
: >https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/5.mp3<
: >https://ipaudio5.com/wp-content/uploads/STARR/star/Revenge%20Of%20The%20Sith/6.mp3<

from which we only need to pipe thru =tr -d '><'= to strip off those angle brackets - and we have a nice list of audio files.
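here's a minimal sketch of that extract-and-strip step run on a local string, so nothing gets fetched. the =example.com= markup and the =extract_links= helper name are made up for illustration. one assumption worth flagging: the greedy =.*= above only works because the page puts each audio tag on its own line. if two links ever share a line, =.*= swallows everything between the first =>https= and the last =mp3<=, so this sketch swaps in =[^<]*= to stop at the first closing bracket instead.

#+begin_src bash
# Made-up sample markup: two links on ONE line, to show the difference.
sample='<a href="https://example.com/0.mp3">https://example.com/0.mp3</a><a href="https://example.com/1.mp3">https://example.com/1.mp3</a>'

# extract_links (hypothetical helper): reads HTML on stdin and
# prints one bare URL per line.
extract_links() {
  # [^<]* cannot cross a "<", so each link text is matched separately;
  # tr -d then deletes the surrounding angle brackets.
  grep -Eo '>https[^<]*mp3<' | tr -d '><'
}

printf '%s\n' "$sample" | extract_links
#+end_src

on this sample the greedy pattern would return a single merged match, while the =[^<]*= version returns the two links separately.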

finally, wrap that whole pipeline in =$()= as an argument to wget like so:

#+begin_src bash

wget $(curl -s 'https://staraudiobook.net/revenge-of-the-sith-audiobook/' | grep -Eo '>https.*mp3<' | tr -d '><')

#+end_src

and it should download them all one after another, in the order they appear on the page.

this process is generalizable to any audiobook afaik - let me know if it doesn't work for obscure stuff tho! can reach me on the fediverse at @mir@talk.marq42.xyz