[SoundCloud] Fix soundcloud regression of expiring HLS streams after 5mins #12418
Improve logging in InfoCache.java and elsewhere Log state name in some methods Minor refactoring
…non YouTube streams
…rate building Hls and Progressive streams Add HlsAudioStream to facilitate refreshing expired Hls playlists Implement refreshing expired hls playlists in RefreshableHlsHttpDataSource
Remove read override as not needed anymore Remove some TODOs
I just had a quick look and you may want to address the TODO comments before handing this over to review ;)
There is also a lot of stuff inside there that does not clearly belong to the problem, like logging, which should be removed.
@litetex Those TODOs are intentionally left there because I want answers to those questions. If you know the answer, please answer it and I will remove the TODOs once they are resolved. I do not agree that the logging should be removed: I added logging statements in a lot of places to help debug and fix this issue in the first place, and logging doesn't really add/remove functionality, so I don't see any point in removing it and having to manually copy-paste those lines into another PR when they can just be part of this PR. On top of that, unless you think the extra stuff requires a lot of effort to review, I don't really think it's worth the time extracting it out into another PR.
What is it?
Description of the changes in your PR
Fixes the following issue(s)
Relies on the following changes (PLEASE REVIEW THIS FIRST)
APK testing
The APK can be found by going to the "Checks" tab below the title. On the left pane, click on "CI", scroll down to "artifacts" and click "app" to download the zip file which contains the debug APK of this PR. You can find more info and a video demonstration on this wiki page.
Due diligence
Testing
Until we add tests, the simplest way to test this is to:
If you get no error then it means it's working. You can also check Logcat and Ctrl+F for "refreshPlaylist" which should show up on step 4.
General rundown of the fix
As detailed in #12109, the bug is caused by the HLS CDN URLs expiring after 5 minutes.
Previously we only retrieved Progressive MP3 streams, which simply download the whole track outright, so nothing had a chance to expire: the full track is buffered in far less than 5 minutes.
The way we extract SoundCloud tracks in `SoundcloudStreamExtractor` is we call the API to get the JSON information for a `track`. The part of the `track` that has information for the actual audio track is `transcodings`.
Here's an example JSON object for the track https://soundcloud.com/jaronsteele/as-the-world-gets-smaller:
Example track JSON
`transcodings` is found in `track.media.transcodings`.
Here is the `transcodings` array for this `track`:
Example transcodings JSON
Each `transcoding` in the array has a `url`, and we call that endpoint to get the direct CDN stream URL we need to get the actual binary data which will be played by ExoPlayer.
The CDN URLs for each transcoding all have a different format for the base path of the URL, but they share some query parameters.
Example Progressive MP3 URL
https://cf-media.sndcdn.com/jPI4kiRuqQ8N.128.mp3?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiKjovL2NmLW1lZGlhLnNuZGNkbi5jb20valBJNGtpUnVxUThOLjEyOC5tcDMqIiwiQ29uZGl0aW9uIjp7IkRhdGVMZXNzVGhhbiI6eyJBV1M6RXBvY2hUaW1lIjoxNzUwODk1ODU5fX19XX0_&Signature=dhm-WTxVYck7kCeqQpYS6LHblPqtwseUkzZ4hKjUnRnOQGwe~6hbgLMW~TE2l8QEiGlUFBMS4LsuDTH8sBCZzyjJ0vdOdHucphg9se-P8ZCUCjxVOfI16DLAMlb3KfSAkyeqZUWpuRf0Zq9AmdNRhBBKiexycruaQCYGMK~Qe9HaCCXWTYZamHSogitnif~r5ga5jeZs23FU30al6RzeKN64pwMnYdi1pnVixEykaQ5b4Zg6hLRZp~a7gqFqqyBX8PESzH2hJkV1rphNtnHIl0C1vgfxj9hltkrNhQDri~OD4e4a~nqTmnAkUFqSuCkVXWKyfIgICoMMX2~jdpoquQ__&Key-Pair-Id=APKAI6TU7MMXM5DG6EPQ
Example HLS MP3 URL
https://cf-hls-media.sndcdn.com/playlist/jPI4kiRuqQ8N.128.mp3/playlist.m3u8?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiKjovL2NmLWhscy1tZWRpYS5zbmRjZG4uY29tL3BsYXlsaXN0L2pQSTRraVJ1cVE4Ti4xMjgubXAzL3BsYXlsaXN0Lm0zdTgqIiwiQ29uZGl0aW9uIjp7IkRhdGVMZXNzVGhhbiI6eyJBV1M6RXBvY2hUaW1lIjoxNzUwODk0OTE4fX19XX0_&Signature=BRwoetXgqu0HndAziOtz48aKzwH1uc07ruDeAECzSKGHN5XK2s20E4hQaLR4PzCix5AZn-GKsU5Myqz~wp2uhIWHKXfyhn4aQNTBxIIREfwR9wGTNKVcSA5IGjtmjJF37uVuAkxSwPSEg54I9MB6MvftSy4P5twTLEj~x3xfX9k6mxIaqoBqMP5TuuLFRqRnIBa~PEK~IfY~SKWH5swv9ZSQgbKSlm1bznb66SI189wMes1n1Z1UxaVEXn2-PCFHBPkaH--yFd-8U4QltBbAcyLllZzpp3kIxq9BNd2ff-k8p6pilnntTerMFuzg37PQkTw8-SKcApiqoDsr-Xhmng__&Key-Pair-Id=APKAI6TU7MMXM5DG6EPQ
Example HLS AAC URL
https://playback.media-streaming.soundcloud.cloud/jPI4kiRuqQ8N/aac_160k/67cae8d9-7363-4aff-8003-2e905b315f86/playlist.m3u8?expires=1750894776&Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiaHR0cHM6Ly9wbGF5YmFjay5tZWRpYS1zdHJlYW1pbmcuc291bmRjbG91ZC5jbG91ZC9qUEk0a2lSdXFROE4vYWFjXzE2MGsvNjdjYWU4ZDktNzM2My00YWZmLTgwMDMtMmU5MDViMzE1Zjg2L3BsYXlsaXN0Lm0zdTg~ZXhwaXJlcz0xNzUwODk0Nzc2IiwiQ29uZGl0aW9uIjp7IkRhdGVMZXNzVGhhbiI6eyJBV1M6RXBvY2hUaW1lIjoxNzUwODk0Nzc2fX19XX0_&Signature=hYW9T6gng6ixxLnbhmGRpc-A2DDn3E~hamnzcOL893tkmRmlbirQElg9M40ylxJzIWGggLRMNadjBJPuvKU5mL59sWPzsgSKmS1WlPBjrs4ZCblSZBH2Y~XOPCJfZE2DH03O1DzNqY48PaTC2CO~n5I2h4mj0gTRDPRFMZz4xVYOqnOdAVCYdT3gXlSLVoUfR4WR6EiCNVC247uSb5Qed3C1f8FHbtrYzJ03EMoN6LZ0rW8yHMgsEIRvWAH408iRb-isjOs3hPHJMyiz8nF4ImnmxTrIY3DlKuoPbMjuEdkFlD5eoRT36oxRV8HOBOulsw~rtvidItq09kzxxoTDGA__&Key-Pair-Id=K34606QXLEIRF3
Example HLS OPUS URL
https://cf-hls-opus-media.sndcdn.com/playlist/d9110040-fe52-4adb-84f0-7d672d7ab077.64.opus/playlist.m3u8?Policy=eyJTdGF0ZW1lbnQiOlt7IlJlc291cmNlIjoiKjovL2NmLWhscy1vcHVzLW1lZGlhLnNuZGNkbi5jb20vcGxheWxpc3QvZDkxMTAwNDAtZmU1Mi00YWRiLTg0ZjAtN2Q2NzJkN2FiMDc3LjY0Lm9wdXMvcGxheWxpc3QubTN1OCoiLCJDb25kaXRpb24iOnsiRGF0ZUxlc3NUaGFuIjp7IkFXUzpFcG9jaFRpbWUiOjE3NTA5MTY3MTB9fX1dfQ__&Signature=clc5bSYK96w2VqMHraWqRtjOPEfMMwRJA1U1vCDRbfcwkOdrlMcoxCnW32uShymd0XsSmy0xdcgIASqE1xqjlGfa-aD~~3YEpoFYsiOj3rKPCQf2muowIe4YEj~yUzYm~8ktGA7epSQwLN~oZ5ER6H8vH4vVZP-CNQprNMMZGu0vNQj8TQH2-y0wyKQaN0GgKe1sGOmdyNpWnKAAtqIQvCaC7AoSwSqqMI0HP1rlCUCDtlLEPaXSjVih9SLCjAgNtAA1QTgNhRIfuuJn1AG4iMD4af6mJr5Lskrx90s6OKWcVoJY8Q9KpZo2lCwCK4AR4C7pynz5CqlIThTCBihaCw__&Key-Pair-Id=APKAI6TU7MMXM5DG6EPQ
If you look closely you can see they all share three query parameters:
Policy
This is a base64-encoded JSON object which contains information about the access rules for the resource. Here is one decoded:
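A sketch of the decoding in plain Java, applied to the `Policy` of the Progressive MP3 URL above. `SignedUrlPolicy` and its methods are hypothetical helpers, not NewPipe code, and the character substitution is an assumption based on the `_` and `~` characters visible in the example URLs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper; not part of NewPipe or NewPipeExtractor. */
final class SignedUrlPolicy {
    private static final Pattern POLICY_PARAM = Pattern.compile("[?&]Policy=([^&]+)");
    private static final Pattern EPOCH = Pattern.compile("\"AWS:EpochTime\":(\\d+)");

    /** Returns the decoded Policy JSON of a signed CDN URL. */
    static String decodePolicy(String url) {
        Matcher m = POLICY_PARAM.matcher(url);
        if (!m.find()) {
            throw new IllegalArgumentException("URL has no Policy parameter");
        }
        // Assumption: '-', '_' and '~' stand in for '+', '=' and '/'
        // so that the Base64 value is safe inside a query string.
        String base64 = m.group(1)
                .replace('-', '+')
                .replace('_', '=')
                .replace('~', '/');
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    /** Returns the DateLessThan expiry as Unix epoch seconds. */
    static long expiryEpochSeconds(String url) {
        Matcher m = EPOCH.matcher(decodePolicy(url));
        if (!m.find()) {
            throw new IllegalArgumentException("Policy has no AWS:EpochTime");
        }
        return Long.parseLong(m.group(1));
    }

    /** True once the given time has reached the expiry in the Policy. */
    static boolean isExpired(String url, long nowEpochSeconds) {
        return nowEpochSeconds >= expiryEpochSeconds(url);
    }
}
```

For that URL, `decodePolicy` returns `{"Statement":[{"Resource":"*://cf-media.sndcdn.com/jPI4kiRuqQ8N.128.mp3*","Condition":{"DateLessThan":{"AWS:EpochTime":1750895859}}}]}` and `expiryEpochSeconds` returns `1750895859`.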
`DateLessThan` is the expiry date as a Unix epoch timestamp. This is always 5 minutes ahead of when the url is called.
In the case of AAC HLS streams, there is also an `expires` query parameter in the URL, and it is identical to the value in the `Policy` JSON.
Signature
This is a cryptographic string used to validate the authenticity of the request. It’s essentially an HMAC or RSA signature applied to the Policy and/or the request URI using a private key. When the SoundCloud CDN receives the request, it uses the corresponding public key to verify the signature hasn’t been tampered with and that the request is within the allowed policy bounds.
If the `Signature` is incorrect, expired, or doesn’t match the policy, the CDN will reject the request with a 403 Forbidden.
Key-Pair-Id
This is an identifier for the key pair used to sign the `Policy`. It tells the CDN which public key to use when verifying the signature. In SoundCloud’s case, the ID corresponds to one of their internal key sets.
The problem
Let `apiStreamUrl` be the URL in `transcoding.url`, and let `contentUrl` be the actual CDN stream URL returned from calling `apiStreamUrl`.
Every time you call `apiStreamUrl`, it returns the same base `contentUrl`, but with a different `Policy`, `Signature`, and (for AAC) a different `expires` parameter.
For HLS streams, `contentUrl` is the m3u8 playlist.
Currently our code only calls `apiStreamUrl` once to get the m3u8 playlist for HLS streams, and that stays the same for the lifetime of the `AudioStream` object. The way m3u8 playlists work is that you fetch the m3u8 playlist from `contentUrl`, which has the URLs for each chunk of the playlist.
Here is an example:
Example M3U8 playlist
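For illustration only, the sketch below parses a made-up media playlist and collects the chunk URLs in the order a player would request them. The host, paths and query values are placeholders rather than real SoundCloud output, and `MediaPlaylistSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; ExoPlayer ships its own HLS playlist parser. */
final class MediaPlaylistSketch {
    // Made-up playlist: host, paths and query values are placeholders.
    static final String EXAMPLE = String.join("\n",
            "#EXTM3U",
            "#EXT-X-VERSION:3",
            "#EXT-X-TARGETDURATION:10",
            "#EXTINF:9.98,",
            "https://cdn.example.invalid/media/0/track.mp3?Policy=AAA&Signature=BBB&Key-Pair-Id=CCC",
            "#EXTINF:9.98,",
            "https://cdn.example.invalid/media/1/track.mp3?Policy=AAA&Signature=BBB&Key-Pair-Id=CCC",
            "#EXT-X-ENDLIST");

    /** Collects the chunk URIs of a media playlist, in playback order. */
    static List<String> segmentUrls(String playlist) {
        List<String> urls = new ArrayList<>();
        for (String line : playlist.split("\\R")) {
            String trimmed = line.trim();
            // Lines starting with '#' are tags or comments;
            // every other non-blank line is a chunk URI.
            if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                urls.add(trimmed);
            }
        }
        return urls;
    }
}
```

Each chunk URL carries its own signed query parameters, which is why a playlist that is fetched only once goes stale together with its chunks.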
We pass `contentUrl` to ExoPlayer, and it fetches this playlist, parses it, and sequentially gets each chunk for playback (it will usually buffer like 10 chunks ahead of the current position).
When NewPipe extracts a SoundCloud stream, it stores it in a `StreamInfo`. It will also extract the track before and after it for seamless playback once the current song ends. Every time something is extracted it gets put into a cache, and SoundCloud streams have a cache expiry of 5 minutes.
The problem is we only get this playlist once, so the chunk URLs are fixed and expire after 5 minutes. This means if you have tracks A, B and C, and you play track B, then depending on the length of the tracks, or even if you just pause for more than 5 minutes, ExoPlayer will try to fetch a chunk after it has expired, get a 403 Forbidden, and playback stops and skips to the next track.
Our error handling does not retry the track in this case; it just skips to the next song. (Reference: `onPlayerError` in Player.java.)
ExoPlayer has no built-in support for refreshing expired HLS playlists. Calling `apiStreamUrl` returns a new `contentUrl` (i.e., a playlist URL) every time, and even calling `contentUrl` will return a fresh m3u8 playlist with updated URLs for each playlist segment; but `contentUrl` itself expires after 5 minutes.
There's no way to tell ExoPlayer to call `apiStreamUrl` and then call `contentUrl`, parse the playlist and update its internal `HlsMediaPlaylist` after it is initially created. Even though we pass `contentUrl` to ExoPlayer for it to parse initially, it only calls it once and never calls it again, so the playlist never gets updated and expires after 5 minutes because the `Policy` and `Signature` are no longer valid.
The Solution
I initially tried to find a solution within ExoPlayer that wouldn’t require changing much NewPipe code, but ExoPlayer exposes no API to refresh playlists once they're created. The only way to do it would be to modify the ExoPlayer source code and use our own custom package, but that's too much (and would still be too complicated as well).
So I used a custom `HttpDataSource`: `RefreshableHlsHttpDataSource`.
How it works
If a chunk request returns `403`, it means the `contentUrl` has expired, and all the chunk URLs in the playlist have also expired.
We then call `apiStreamUrl` to get a brand new `contentUrl`, which points to a fresh playlist with valid chunk URLs and updated `Policy`, `Signature`, etc.
Other solutions
I spent a lot of time looking for the simplest solution and investigated several approaches.
The ideal solution I was looking for was to replicate browser behaviour: when retrieving a chunk for a track returns `403 Forbidden`, reload the entire HLS playlist, get the same chunk, and continue playing the track as normal.
As stated already, ExoPlayer has no mechanism for this out of the box, and it doesn't expose any way to refresh its internal `HlsMediaPlaylist` once it gets created. Since this problem comes from missing functionality in ExoPlayer, and is not necessarily related to Player code, I wanted to find a solution that requires minimal changes to Player code and architecture (because the solution wouldn't need to do that if ExoPlayer had an API for refreshing expired HLS playlists).
The main 3 approaches I investigated were:
`DefaultHttpDataSource`, `DefaultHlsPlaylistTracker`, `HlsMediaSource` etc., kinda like we have for `YoutubeHttpDataSource` which is a custom version of `DefaultHttpDataSource`.
`HttpDataSource`, then implement solution at that level using a custom `HttpDataSource`
Why Solution 1 is infeasible
The internal ExoPlayer code has the initialization and parsing of the HLS playlist spread out through several classes. There's no simple central place I could inject the code to be able to easily refresh the playlist and continue playback as normal. Probably the most practical way to do it would be with reflection, but that is too hacky a solution and I wanted something cleaner, simpler and more stable.
Regardless, the only way to know when the playlist expires is from within `HttpDataSource#open`, so there would still need to be some wiring from within a `HttpDataSource` via a callback or similar to trigger some code somewhere else that would refresh the playlist, which would introduce coupling with Player code that I'd much rather avoid.
Why Solution 3 is infeasible
Since we can only know when the playlist has expired from within `HttpDataSource#open`, any code that wants to react to that happening needs to be triggered from within that method.
This is because ExoPlayer buffers chunks ahead of time, so it requests expired chunks while it is playing the chunks it has already buffered. It will continue getting 403s for expired chunks up until the playback position reaches the timestamp of that expired chunk, and then it will throw an error and call `Player#onPlayerError`.
If we reloaded the playlist in reaction to `onPlayerError` instead of `HttpDataSource#open`, then playback would be stopped for the entire time needed to fetch, parse and load the new playlist, on top of the pause caused by throwing the error. So it makes sense to do it beforehand at the first instance we get a 403 (which is what the browser does).
Therefore, a solution that would load a new media source into ExoPlayer would need to be triggered from within `HttpDataSource#open`.
A top-level solution at the NewPipe Player level would look like this:
`open` gets a 403, which triggers a callback
We would want to ensure all buffered content gets played because: 1. We don't want to waste that data, and 2. We want to fetch the new HLS playlist while content is playing; otherwise there will be a gap in playback (which the browser doesn't have).
However, ExoPlayer has no way to "hot swap" a `MediaSource`, and especially not with gapless playback. Loading a new media source also discards the buffer. The only way to do this would be to poll `player.getCurrentPosition()` and compare it with `player.getBufferedPosition()`, somehow stop playback before it reaches the end of the buffer (because it will throw an error otherwise) and then load in the new media source; but that would prevent seamless playback because we wouldn't play until the end of the buffer, and we also wouldn't be loading while the media is playing.
The more you delve into it the more complicated it gets, and it would also require changing Player code, because the code currently maps the index of streams in `MediaSourceManager.playQueue` to the index of the respective `MediaSource` in the internal `ConcatenatingMediaSource` playlist in `MediaSourceManager.playlist`. So adding a new media source to the playlist would invalidate a lot of logic in the code (e.g. 3 != 4 because we'd want `playQueue` index 3 to == 3, 4, and even == 5 in `playlist`, if that makes sense) and so a lot of code would need to be changed, and the only way to do it in a clean way would be to change the architecture and refactor a bunch of stuff, which would be convoluted and non-trivial.
So for these reasons, I abandoned this idea.
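The index mismatch described above can be illustrated with plain lists. This is a toy model, not NewPipe code: `QueueIndexDrift` and `withRefreshedSource` are hypothetical names, and strings stand in for queue items and media sources.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of playQueue vs. playlist indices; not NewPipe code. */
final class QueueIndexDrift {
    /**
     * Returns a copy of the playlist with a refreshed source inserted
     * directly behind the entry at the given index.
     */
    static List<String> withRefreshedSource(List<String> playlist, int index) {
        List<String> result = new ArrayList<>(playlist);
        result.add(index + 1, playlist.get(index) + "-refreshed");
        return result;
    }
}
```

Starting from a queue `[A, B, C, D]` whose playlist has the same four entries, `withRefreshedSource(playlist, 1)` gives `[A, B, B-refreshed, C, D]`: item `C` is still at queue index 2 but now sits at playlist index 3, and item `B` maps to two playlist entries. Every piece of code that assumes both indices are equal would have to change.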
Why Solution 2 makes the most sense
This problem is inherently a network issue and so it should be solved at the network level where the error is occurring. If ExoPlayer handled refreshing expired playlists, there wouldn't be any need to change any of our own code: we would just use whatever API was available (like providing a callback to get a new playlist URL or something).
Given this, it made the most sense to implement the solution that maps closest to this ideal scenario and requires minimal changes to Player code.
Due to how ExoPlayer is structured, the most appropriate place to do this is within `HttpDataSource#open`, as that is what requests the chunks.
Although there's no way to replace ExoPlayer's internal HLS playlist, we can fetch a new playlist from within the `HttpDataSource` when the internal one expires, and from then on open chunks from the new playlist. This replicates browser behaviour, and has the benefit that the only code we need to add/change is code that is concerned with data extraction.
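The idea in the paragraph above can be sketched without ExoPlayer. This is a minimal illustration under stated assumptions, not the PR's actual `RefreshableHlsHttpDataSource`: `RefreshingChunkOpener`, `Response`, `playlistLoader` and `http` are hypothetical names, and matching a chunk from the old playlist to the same index in the refreshed playlist is an assumption about how the two playlists line up.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical sketch of "refresh on 403" at the data-source level. */
final class RefreshingChunkOpener {
    static final int HTTP_FORBIDDEN = 403;

    /** Minimal stand-in for an HTTP response. */
    static final class Response {
        final int code;
        final String body;

        Response(int code, String body) {
            this.code = code;
            this.body = body;
        }
    }

    // Chunk URLs the player parsed once and keeps requesting.
    private final List<String> initialChunks;
    // Assumed to call apiStreamUrl, fetch contentUrl and parse the chunk URLs.
    private final Supplier<List<String>> playlistLoader;
    // Performs a single GET request.
    private final Function<String, Response> http;
    // Chunk URLs from the most recently loaded playlist.
    private List<String> currentChunks;

    RefreshingChunkOpener(List<String> initialChunks,
                          Supplier<List<String>> playlistLoader,
                          Function<String, Response> http) {
        this.initialChunks = initialChunks;
        this.playlistLoader = playlistLoader;
        this.http = http;
        this.currentChunks = initialChunks;
    }

    /** Opens the chunk the player asked for, refreshing the playlist on a 403. */
    Response open(String requestedUrl) {
        int index = initialChunks.indexOf(requestedUrl);
        if (index < 0) {
            return http.apply(requestedUrl); // not a known chunk, pass through
        }
        Response response = http.apply(currentChunks.get(index));
        if (response.code == HTTP_FORBIDDEN) {
            // The signed URLs have expired: load a fresh playlist
            // and retry the same chunk once.
            List<String> refreshed = playlistLoader.get();
            if (index >= refreshed.size()) {
                throw new IllegalStateException("Refreshed playlist has no chunk " + index);
            }
            currentChunks = refreshed;
            response = http.apply(refreshed.get(index));
        }
        return response;
    }
}
```

The player keeps asking for the URLs it parsed originally, and the opener translates each request to the matching URL in the newest playlist. After one refresh, later chunks are served from the refreshed playlist without another round trip until those URLs expire as well.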