
Coolstory: Development of an interactive upgrade for Mikulski_Radio (Liquidsoap 2.2.x)

For the past month I have been actively working on adding a remotely connected microphone to the Mikulski_Radio broadcast, so I can talk with the chat live. Although I have hit a dead end several times, there has been real progress towards the goal. The idea does look achievable, but on an open-ended schedule: it depends on certain bugs being fixed in the pre-release version of Liquidsoap.

Let me remind you that the Mikulski_Radio stream is not broadcast from a physical PC running OBS, but from a remote Linux virtual machine (VPS).
The VPS has no audio or video card, and the whole system is controlled from the command line (terminal).
My "OBS" is a script written in Liquidsoap, a programming language designed for building radio-like broadcasts.
The current stable version is 2.1.4; the pre-release version is 2.2.x.
P.S. I am not a programmer and not a Linux person. I picked everything up superficially, reading manuals and other people's scripts while solving my own problems.

Liquidsoap offers a way to connect a remote audio source via the input.harbor operator. Roughly speaking, it runs its own mini-server, which you can connect to over an Icecast-compatible protocol by specifying the stream name, port and password:

mic = input.harbor("live", port=8000, password="hackme")

To connect, you can use any application (usually a DJ tool) that can send a stream to Icecast, for example Mixxx.
While searching, I came across a VST plugin, ShoutVST, which is far more convenient for me: I can control the sound of the microphone, and what goes into input.harbor, directly from the DAW.

The plugin goes on the DAW's master bus, or on the microphone track after all effects processing.
Hostname is the address of the VPS.
Username is the Icecast one, source by default.
Next come the password and the stream name (Mountpoint), as set in the input.harbor arguments.
Bitrate, sample rate and format are up to your taste; everything else can be skipped, since you won't need it in this case.

All that remains is to lay input.harbor over the playlist in the script, using the add operator:

videos = mksafe(playlist("~/mp4"))
mic = mksafe(input.harbor("live", port=8000, password="hackme"))
stream = add([mic, videos])

But then came the first setback. With video files in the playlist, connecting to input.harbor produces an endless buffer error, and the microphone audio never reaches the stream, not even with glitches. Raising the buffer to absurd values did not give the expected result, and neither did the buffer.adaptive operator or the other workarounds I tried.
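One of those attempts, as a rough sketch (not the exact script I ran):

# wrap the harbor input in an adaptive buffer instead of relying on
# input.harbor's own buffer; this did not help either
mic = buffer.adaptive(input.harbor("live", port=8000, password="hackme"))
videos = mksafe(playlist("~/mp4"))
stream = add([mksafe(mic), videos])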
Meanwhile, I confirmed that everything works fine (even with the smallest possible buffer) when an audio playlist is mixed with a single video file (e.g. a looped animation):

music = mksafe(playlist("~/mp3"))
background = mksafe(single("~/background.mp4"))
mic = mksafe(input.harbor("mount", port=8000, password="hackme", buffer=1.5, max=3.0))
music = add([mic, music])
stream = mux_video(video=background, music)

I started looking for a way to split the video source into two separate sources, video and audio, mix input.harbor with the audio, and then put everything back together.
But countless attempts led either to non-working setups with synchronization conflicts (the same source cannot belong to two different internal clocks), or to a stream that started with video but whose only audio was input.harbor.
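A typical attempt of that kind looked roughly like this (a reconstructed sketch of the failing pattern, not working code):

videos = mksafe(playlist("~/mp4"))
mic = mksafe(input.harbor("live", port=8000, password="hackme"))

# the same videos source feeds two processing chains at once, which is
# exactly the clock conflict described above
stream = mux_video(video=drop_audio(videos), add([mic, drop_video(videos)]))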
Curiously, if you use input.rtmp instead of input.harbor (connecting from home via OBS and taking only the audio from that stream), everything worked as it should, except that the broadcast froze solid once the stream from OBS stopped.
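That variant, sketched with the same input.rtmp call used further below:

# take a stream published from home via OBS and keep only its audio;
# this mixed fine, but the broadcast froze once OBS stopped
obs = input.rtmp(id="obs", listen=false, "rtmp://IP/stream/live")
mic = drop_video(obs)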
That gave rise to the idea of duplicating the source and taking audio from one copy and video from the other. But a playlist would not work here: even with a deliberately fixed play order instead of a random one, the first request would throw everything out of sync.
So I decided to try running an additional instance of Liquidsoap and pulling the stream in twice via input.rtmp:

stream_video = mksafe(input.rtmp(id="video", listen=false, "rtmp://IP/stream/live"))
stream_audio = mksafe(input.rtmp(id="audio", listen=false, "rtmp://IP/stream/live"))

mic = mksafe(input.harbor("mount", port=8000, password="hackme", buffer=1.5, max=3.0))
host = add(normalize=false, [mic, stream_audio])

# audio comes from one copy of the stream, video from the other
radio = mux_audio(audio=drop_video(host), drop_audio(stream_video))

And it works! However, only some time later did I notice that this setup produces very noticeable desynchronization between video and sound.

I did all of this on the latest stable version of Liquidsoap, 2.1.4. It is also the last release in the 2.1.x line; 2.2.0 comes next.
2.2.x introduces the concept of "multitrack": the ability to pull individual tracks out of an audio-video source (for example, when a video file carries several audio tracks with different voiceovers), manipulate them, and put them back together.
When I hit the impasse with the two rtmp sources, I turned to the Liquidsoap community on GitHub. They told me to try the pre-release 2.2.x, since no more changes would be made to the 2.1.x branch: all effort now goes into catching bugs and finishing the new version. They also showed me how to write the multitrack code correctly:

videos = playlist("path/to/mp4")

# pull the audio track out of the video source
let {audio = video_audio, ...tracks} = source.tracks(videos)

mic = input.harbor("mount", port=8000, password="hackme")
let {audio = mic_audio} = source.tracks(mic)

# mix the microphone into the playlist audio
audio = track.audio.add([mic_audio, video_audio])

# reassemble the remaining tracks with the mixed audio
stream = source(tracks.{
  audio = audio
})

Having mastered installing rolling releases from GitHub via opam (+ pin add), and creating OCaml switches to keep a separate environment (so that, if anything goes wrong, I can instantly switch back to the stable version), I ran the script, and everything worked!
The only caveat: images and text must be added to the original video source, and only then is it split into a multitrack. (The developers did add a track.video.add_image operator that works on an already separated video track, but there is no track.video.add_text yet, so for now that route makes little sense; a sketch of it follows the script below.)

videos = playlist("/path/to/mp4")
videos = video.add_image(x=0, y=0, width=1280, height=720, file="~/overlay.png", videos)
videos = video.add_text(color=0xFFFFFF,  speed=0, x=0, y=0, size=30, "Example Text", videos)

let {audio = video_audio, ...tracks} = source.tracks(videos)

mic = mksafe(input.harbor("live", port=8000, password="hackme", buffer=1., max=2.0))

let {audio = mic_audio} = source.tracks(mic)
audio = track.audio.add(normalize=false, [mic_audio, videos_audio])

stream = source(tracks.{
  audio = audio
})
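
For completeness, here is roughly how the track-level image operator mentioned above would be used on a separated video track. I am assuming its arguments mirror video.add_image, which I have not verified:

let {audio = a, video = v, ...rest} = source.tracks(videos)
# assumed signature, mirroring video.add_image
v = track.video.add_image(x=0, y=0, width=1280, height=720, file="~/overlay.png", v)
stream = source(rest.{audio = a, video = v})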

Afterword with bugs

However, the joy of success did not last long: the first bug popped up with overlaying images and text on the video (the video.add_image and video.add_text operators). The final stream lagged badly, and the overlays were not applied at all.
I decided this was not a big problem, since I could go back to the idea of an additional Liquidsoap instance: run the main stream, with all the graphics, into the second instance via input.rtmp and apply the multitrack there.
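In sketch form, the auxiliary 2.2.x script would look something like this (the RTMP URL and harbor settings are reused from the earlier examples):

# pull the finished 2.1.4 stream (graphics already applied) over RTMP
feed = mksafe(input.rtmp(listen=false, "rtmp://IP/stream/live"))
let {audio = feed_audio, ...tracks} = source.tracks(feed)

# mix the microphone into the audio track only
mic = mksafe(input.harbor("live", port=8000, password="hackme"))
let {audio = mic_audio} = source.tracks(mic)
audio = track.audio.add(normalize=false, [mic_audio, feed_audio])

stream = source(tracks.{audio = audio})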
To avoid version conflicts (the main stream in this setup runs on 2.1.4 and the auxiliary one on 2.2.x), I also learned to create a daemon for Liquidsoap: a system service that runs and manages the script as a background process. That turned out to be easy, because there is a ready-made script for it on GitHub, liquidsoap-daemon.
The only requirement was to make sure the scripts could run independently from different OCaml switches/environments.
The procedure is as follows: switch to the switch with 2.1.4 installed (opam switch), create the daemon, and it gets bound to that environment. You can then switch to the 2.2.x switch, and when you start the 2.1.4 script via the daemon (sudo systemctl start 214-liquidsoap), it runs with the correct version.
However, a bug lurked here too. In 2.2.x, input.rtmp is recognized by the decoder, but it cannot be connected to the output stream. The problem is video-only: if you output just the audio, everything is fine. 2.1.4 does not have this bug.

I passed all this information on to the developers, and after a while they patched the video.add operators.
But then new problems surfaced: first, the additional text/image layers began to load the CPU very heavily; second, track metadata was captured incorrectly (with stray system characters), which could crash the script if that data was drawn on screen via the gd library.
And again I had to contact the developers directly.
The metadata bug was fixed very quickly.
So I decided I could cut down the amount of on-screen text to reduce the CPU load, and go live. Only after 10 hours of stable operation did I discover a RAM leak: memory fills up at 30-40 MB per hour…

All in all, the idea proved quite workable.
Now we just have to wait for the bugs I found to be fixed; how long that will take, nobody knows.