Creating an Interactive Service

Interactive Services

Interactive was a technology that allowed the viewers of a stream to interact with the streamer in unique ways. Previously, you could only interact with the streamer via chat. Interactive allowed an overlay to be built over the stream; streamers could customize the overlay and upload their own content, and the overlay was a full webpage with custom functionality. Both Twitch and Mixer (formerly Beam) offered this feature.

Here is a great example of the capabilities of interactive.

Overview

Twitch

Twitch implements interactive through extensions. Streamers can install several at once. Some extensions are located on the streamer’s profile page while others live on the stream layer. Most extensions are rather basic: playing sound effects, GIFs, and special messages are common.

The most advanced interactive creation I could find was games with interactive support, Hyper Scape being the prime example. The game can be played without a Twitch account, but you have the option to link your channel to the game. Viewers can vote on the next in-game event via interactive, which has a real effect on the final outcome of the match.

On the downside, extensions have been used to excess. Many streamers run several at the same time, which causes lag, and the stream gets covered with visual noise that harms the viewing experience. Extensions can also be monetized, and many use practices designed to extract as much revenue as possible.

Earning revenue off of interactive is not a bad thing in itself; it just needs to be implemented correctly.

The development experience is not very good either. All extensions must be manually approved before they can be used, the documentation is confusing, and many live features are missing. Some events in the API are delayed or don’t go through at all.

This is likely due to the normal API, not a result of interactive.

The best part of the development experience is the dev rig, which lets you create extensions and simulate events from Twitch. This is very useful during development.

Mixer (Beam)

Mixer implemented interactive through a service called Mixplay. Users could have one active interactive session at a time. Mixplay could live on the stream or anywhere on the page, which allowed all elements of the page to be moved around; however, that was rarely used. Like Twitch extensions, Mixplay was mostly used for soundboards and GIFs. “Button Boards” were created under the stream: each button would play a sound or show a GIF. Bots such as Mixitup and Firebot helped create these. The stream was usually moved up and squished to make room for the buttons, which harmed the viewing experience.

Since Mixer was owned by Microsoft, many Xbox games had interactive support; however, it was rarely used. Story games such as The Walking Dead allowed viewers to vote on the dialogue options. Minecraft had many mods created for linking to interactive: viewers could spawn mobs and start various events. Viewers could also spend a virtual currency called “Sparks” to add more weight to certain triggers. Some channels ran 24/7 streams where viewers could send inputs to a game (Youplay/Smashbets), playing through Pokemon or other simple games. Streamers SorryAboutYourCats and DaddyRobot also made unique interactive experiences, linking interactive to robotics, voice recognition, minigames, and other integrations.

The downside was that nearly all streamers used Mixplay in exactly the same way. This led to the belief that interactive could only be used for Button Boards, which prevented new, unique sessions from being created. The development experience was better than Twitch’s, but few improvements were made after season 2.

The docs for Mixplay can be found here. I suggest looking at the introduction and the custom controls.

Mixplay had the most potential, but it went largely unnoticed.

How it Worked

Both Twitch and Mixer implemented the interactive service in a similar way. Devs upload a web frontend (HTML/CSS/JS) for the viewing experience, and Twitch/Mixer stores the files. When a session is started, the page is activated. Both services allowed communication between the frontend, the service (Mixer/Twitch), and the streamer’s server, which is what enabled most integrations into games and other programs. Interactive is just an iframe over the stream layer. Both services provide helper functions for communicating with the service API to get user stats and state.
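
To make those moving parts concrete, here is a minimal sketch of what a view could do over a websocket. The endpoint, element ID, and event names are placeholders, not a real API:

// View-side sketch (TypeScript): runs inside the overlay iframe.
// "wss://interactive.example/session" is a placeholder endpoint.
const socket = new WebSocket("wss://interactive.example/session?id=1");

socket.addEventListener("open", () => {
  // Report a button press; the service relays it to the streamer's server.
  document.getElementById("pressMe")?.addEventListener("click", () => {
    socket.send(JSON.stringify({ type: "event", eventName: "buttonPress" }));
  });
});

socket.addEventListener("message", (msg) => {
  // Events relayed back from the streamer's server arrive here.
  console.log("event from service:", JSON.parse(msg.data));
});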

How We Can Build It

Terminology

  • Interactive: The container for interactive sessions. The bridge between Glimesh and a user’s server.
  • Session: An interactive app. Each project is its own session. This is stored in the database along with the view.
  • View: The frontend of the session that is visible to users (HTML/CSS/JS)

API Support

Interactive Sessions

The Glimesh API must be able to return a user’s uploaded sessions and any active sessions. General info should be publicly accessible. An interactive session could contain the following fields (sketched in code after the list):

  • Name (selected by the creator)
  • ID (auto generated unique identifier)
  • Files (URL to the uploaded files)
  • UpdatedAt/CreatedAt (standard datetime fields)
  • CreatorID (The userId of the creator)
  • isActive (Is the session running?)
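
As a rough sketch, the session record could look something like this; the field names and types are illustrative, not a final schema:

// Hypothetical shape of an interactive session record.
interface InteractiveSession {
  name: string;       // selected by the creator
  id: string;         // auto-generated unique identifier
  files: string;      // URL to the uploaded files
  createdAt: Date;    // standard datetime fields
  updatedAt: Date;
  creatorId: string;  // the userId of the creator
  isActive: boolean;  // is the session running?
}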

I don’t believe the interactive session requires its own API; the current GraphQL subscription API is more than enough. It will need to be updated to display the current users in interactive sessions and to emit events, though.

Events (sending/receiving custom data)

We will need to be able to send custom data to and from Glimesh through interactive. This could be a button click, user data, game state, etc. The session frontend will likely send the data to Glimesh, Glimesh will forward it to the streamer’s server, and the server can do whatever it wants with the data before sending it back to Glimesh and on to all clients.

Glimesh will never store data (state); it will simply pass the data along to all clients. The streamer’s server will store the state and any other data.

An event will be a JSON packet with the following structure:

{
    "type": "event",
    "eventName": "customEventName",
    "timestamp": "TIMESTAMP",
    "customField1": 123,
    "customField2": {},
    "customField3": "xyz"
}

Devs can attach any properties they need. Communication between Glimesh and the frontend will be via websocket.

Glimesh will also need to send events via subscriptions. We will need support for a user join/leave and a session start/close.

Communication Example

The session will be an app that stores user points. Points are earned whenever a user presses a button in the overlay, and a leaderboard displays everyone’s points in real time. Whenever a user presses a button, their username and current count are sent to Glimesh, and then to your server. Your server adds the tally and returns the updated values of all users to Glimesh, which forwards them to all clients. The clients then update the leaderboard display.
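
A sketch of the streamer’s server for that example might look like the following. The relay URL and event shapes are assumptions for illustration only:

// Streamer-side server sketch: tally button presses and broadcast the
// whole leaderboard back through Glimesh to every client.
import WebSocket from "ws";

const points = new Map<string, number>();
const glimesh = new WebSocket("wss://glimesh.example/interactive?session=1"); // placeholder URL

glimesh.on("message", (raw) => {
  const event = JSON.parse(raw.toString());
  if (event.eventName !== "buttonPress") return;

  // Add the tally for the user who pressed the button...
  points.set(event.username, (points.get(event.username) ?? 0) + 1);

  // ...then return the updated values of all users for every client to render.
  glimesh.send(JSON.stringify({
    type: "event",
    eventName: "leaderboardUpdate",
    leaderboard: Object.fromEntries(points),
  }));
});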

Security

We have to sandbox the iframe to prevent it from interacting with the Glimesh site.
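
Embedding could look roughly like this, assuming standard iframe sandboxing (the element ID and variable names are made up):

// Embed the session view with an opaque origin so it cannot touch Glimesh.
const sessionFilesUrl = "https://cdn.example/sessions/1/"; // placeholder URL
const view = document.createElement("iframe");
view.src = sessionFilesUrl; // URL of the uploaded session files
view.sandbox.add("allow-scripts"); // the view may run its own JS...
// ...but we deliberately omit "allow-same-origin", so the frame cannot
// read Glimesh cookies, storage, or the parent DOM.
document.getElementById("stream-layer")?.appendChild(view);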

We should also have the session request any required scopes before the user is allowed to join. If the user denies the requested scopes, they are not allowed into the session. Once a user has approved the session’s scopes, an access token should be returned. Requesting public user properties can be done through the API without authenticating through a session; the server will already have an access token or client ID to use for those fields.

We could just implement the full OAuth flow for sessions.

We may need to sanitize the uploaded files and scan them for common vulnerabilities. The TOS will need to be updated to explain what is valid. While some review may be necessary, we do not want to get in the way of creators. Personally, I think we can allow users to create and use others’ sessions without manual review. Once a user wants to share their creation, it would be manually reviewed before being released to Glimesh. The GCT would handle this.

Reviewing interactive creations is difficult because the streamer must be running their server to test the full product. Some (if not all) sessions require a server to work.

Distribution

I have thought of 2 possible ways to distribute interactive sessions.

The first is to have the dev manually send the files and have the streamer upload them themselves. This would work, but it would require more storage space for Glimesh, since many popular sessions would be duplicates of one another.

The second is to make an online “store” where devs can upload interactive content. Users can “install” interactive sessions. This would allow the database to point towards an already existing session instead of storing multiple copies of the same session. The store should be optional. If a streamer wants to make a private session they must be able to. This would also open up the option for monetization.

Given the nature of client-side apps, the frontend would be visible regardless; any sensitive code would need to live behind a server.

Creating Interactive Sessions

I believe the best approach is to make a low-code GUI for creating sessions. We can ease the user into it and slowly introduce the needed concepts. This would probably be a desktop app, as integrations with the server and the user’s file system are much easier. Mixer released the source code of their CDK for creating Mixplay boards; while it had some problems, we may be able to reuse some of its code. Alternatively, we can build something from scratch.

I personally prefer Electron. Since it is already a browser environment, creating frontend sessions will be much easier. It has full access to the DOM and can optionally use NodeJS modules.

We will need to create a few example sessions. A simple server and frontend will be required.

Breaking the Button Board Ceiling

Interactive must be an improvement over other services. We need to make it extremely clear what interactive can do, and we need very precise documentation about how to make sessions. Currently, making interactive sessions is exclusive to developers, but we should be targeting the general population, not just devs. While some programming skills are required, the real blocker is people not trying: people believe interactive is too complex to create. We must not let that become reality!

I’m aware this is not a priority. We need to focus on growth both in site traffic and finances. However, I know this is something that a lot of people want. If it is done correctly, it can bring more people in and allow for unlimited creativity. If we show that this is something we want to create and present a potential plan, we can likely pull in some volunteers to help build it. I’ll volunteer to help with whatever I can. My Elixir skills are not very good, but I have a lot of experience with Electron, web dev, and writing documentation :slight_smile:

All of the above is open for discussion. This is just one potential way that it could work. I’m very interested to hear what the community has to say about this!


As I understand it, Mixer/Beam’s interactive was not an HTML page in an iframe, but a canvas element that could be drawn to, plus event handlers for mouse up/down/over, key up/down, gamepad events, etc. Even in an iframe I don’t think we should let people run arbitrary scripts.

I’m fairly certain it was an iframe. It was sandboxed so it had minimal interaction with the main page. If we go with this approach we would have to do the same. You were able to run JS scripts and talk to the interactive server via websockets, which can’t be done with a canvas.


I’m so very fucking glad you wrote all of this Mytho! :heart_eyes_cat: Glimesh having easy interactive controls integration [on top of FTL] would greatly differentiate itself from other platforms, bringing all that unlimited creativity from Mixer and then some.

What differentiated Mixer/Beam from the other platforms was FTL and interactive controls as well, but one of Mixer’s downfalls, besides investing in strange ways & that racist lady, was not having more partners take advantage of Mixer’s unique features [as you stated too].

I did have hope in E3 2017, when Microsoft helped build and set up an arena for “Interactive Robot Soccer”, one of the many ideas pitched. Two robots, controlled by the people at E3 and the viewers [peak ~8,000 on Mixer], had to work together and hit a ball :joy_cat:. Most of the time it was just chaos, but once in a while they scored. There were also lights to be controlled to support your team.

Sorry for the babble. :smile_cat: I just deeply believe interactive done right adds another layer of human connection and gives people new and unique experiences; allowing anyone to create these experiences for others with no programming background at all would be amazing for everyone involved.

Other interactive examples:
Shooting dart + slap = dart coffee

Tips and Tricks - balloon popping:
youtu.be/EgjfRtDB9hg

Level Up Cast - nail polish applying:
youtu.be/0Gf4yTG751c

[Excuse the broken links due to new user limitation! :upside_down_face:]

I have high hopes for unique interactive content on Glimesh in the near future. :smile_cat::+1:


I’ve been thinking about how the server side can be built and would like some input on the below areas.

Server Location

The server can live within the Elixir project or as its own microservice.

If interactive is built within the Elixir codebase it will be easy to test. It can also perform database and authentication actions without much extra work, and since it is within the same project, adding the necessary data to the API will be simple. The downside is bringing in volunteers. While Elixir powers the site well, it is not an easy language to learn; many people want to help build Glimesh but leave after a few days of trying. That is a bigger problem that deserves its own thread.

Alternatively, interactive can be built as its own service. We could change the programming language and potentially bring in more help as a result. Mixplay was written in Go, but we can use any language. While performance and scalability aren’t an immediate concern, we do need to make sure it will be sustainable. At this stage we could likely run every session on one server, but that won’t be the case forever.

I’m assuming Clone will want it to stay in the Elixir codebase, but it’s important to consider all options.

Data Transport

This only applies if we build interactive as a microservice. If Elixir is chosen, it will use standard websockets for communication.

Another thing to consider is the method of communication between Glimesh and the view. Realtime communication is a must-have, which limits us to three options.

Websockets

This is the most common method and the most well known. Websockets are supported everywhere and are quite fast. The client side (view) would use the browser’s built-in support, while the server would likely need a basic third-party library. In NodeJS this would likely be ws, but each language has its own packages to choose from.
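
A bare-bones relay with ws could look like this. The pass-through behavior matches the event model described earlier; the port and everything else are assumptions:

// Minimal websocket relay sketch using the "ws" package.
import { WebSocketServer, WebSocket } from "ws";

const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (socket) => {
  socket.on("message", (raw) => {
    // Glimesh stores no state, so every event is simply relayed to every client.
    for (const client of wss.clients) {
      if (client.readyState === WebSocket.OPEN) client.send(raw.toString());
    }
  });
});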

Socket.IO

Socket.IO is a library built on top of websockets. It supports the same functionality as plain websockets but adds features such as rooms and broadcasting, which we would otherwise have to build ourselves. It is lightweight but not as performant as standard websockets. The client Socket.IO library would have to be included in the view, and the server would need the language’s library to host the server. It has several options for linking to Redis to handle load distribution and ensure every client gets the data it is subscribed to.
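
The same relay sketched with Socket.IO is shorter because rooms handle the per-session broadcasting (the event and query parameter names are assumed):

// Per-session relay sketch using Socket.IO rooms.
import { Server } from "socket.io";

const io = new Server(8080); // standalone Socket.IO server

io.on("connection", (socket) => {
  // One room per interactive session.
  const sessionId = String(socket.handshake.query.sessionId);
  socket.join(sessionId);

  socket.on("interactiveEvent", (event) => {
    io.to(sessionId).emit("interactiveEvent", event); // broadcast to the room
  });
});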

WebRTC

WebRTC can be used to transfer more than just video streams. The latency is minimal and may be lower than standard websockets! However, this is the option I know the least about; it is on the list because of a very old Discord conversation.

I did some work on this and managed to get a very basic session running. The backend is a NodeJS server running ws and express: the first library handles the websocket connection and the second serves pages to the user. The streamer’s channel has an iframe over the video layer which connects to the express server.
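
The rough shape of it (a simplified sketch; the folder name and port are placeholders):

// Proof-of-concept sketch: express serves the view, ws handles realtime data.
import express from "express";
import { createServer } from "http";
import { WebSocketServer } from "ws";

const app = express();
app.use(express.static("view")); // serves index.html + assets to the iframe

const server = createServer(app); // share one HTTP server between both libraries
const wss = new WebSocketServer({ server });

wss.on("connection", (socket) => {
  socket.on("message", (raw) => {
    // Pass every message straight through to all connected clients.
    for (const client of wss.clients) client.send(raw.toString());
  });
});

server.listen(8080);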

This was mostly a proof-of-concept test. If I am allowed to run an interactive server separate from the Elixir server, it will likely be written with NestJS. It’s in TypeScript and great for writing scalable server-side applications. This would only be the interactive code for managing sessions and passing data to and from clients; it would not have any communication with the Glimesh DB. Any interaction with Glimesh will need to go through the API, which will require some work on the Elixir side.

I will need some input from @clone1018 to determine if the server can be a separate microservice as described above, or if it all needs to be in the Elixir project.


Honestly, for a prototype it doesn’t really matter. However, when it comes to Glimesh core services I’d prefer any work we do be focused around Elixir & the ecosystem we already have set up. It’s one thing to be able to develop a new service, but it’s another burden entirely to maintain & support that service over time. I don’t believe there’s any value in using another language or framework for this particular purpose; the only benefit I see is that you already know JS / NestJS. Elixir is already extremely capable of being used for scalable server-side applications.

As far as the point goes about Elixir being hard to learn, that’s debatable, as the language is quite small and concise. Whereas whenever you jump into a new JS project, you have to learn the framework of the moment, along with the language of the moment (JS, CoffeeScript, TypeScript, Dart, etc). I haven’t seen any evidence that switching languages would bring in any additional support. We do already have repositories in TypeScript, Go, C++, and Dart, and none of them have had above normal outside contributions.

WebRTC is an interesting thought, as the Data Channels support some pretty cool things: Data Communication | WebRTC for the Curious


Thanks for the input.

I’ll play with WebRTC and see if I can get some data sent across. I found some examples so I should be able to make that work. The latency will be lower which is always a good thing.

I’ll keep testing this in node but I’ll try to make the real thing in elixir.

Pion is a really neat framework for WebRTC in Go. They have several examples here: webrtc/examples at master · pion/webrtc · GitHub

WebRTC findings:

P2P wouldn’t work well for this, since every client would need to be connected to every other client. If we use WebRTC we will need the server approach (likely an MCU). The server would have to combine all of the incoming streams into one outgoing stream. This is supposed to be an expensive operation, but seeing as we would only be transferring JSON instead of audio/video, I don’t think that will be an issue.

Should we build our own MCU or use an existing one?
Pion should have something we can use. Membrane is a WebRTC library for Elixir; however, it is solely focused on audio/video. We may be able to make a plugin for data channels, though. I got their example Phoenix project running and it seemed alright. The only problem is that they expect the connection to be made over Phoenix channels, which is not an option for this: all interactive projects have to be within a sandboxed iframe, which prevents communication between Phoenix and the project. We would have to make some API calls or set up a separate websocket endpoint to send the SDP data to all clients and get everyone connected.

I don’t know how much latency adding a server between clients will introduce. The only reason I’m considering WebRTC is the slightly lower latency; if a relay server erases that advantage, there isn’t much point in using it. It also has more steps to connect, and most people are familiar with websockets rather than WebRTC.

I’ve been learning how our API works and managed to add an early prototype of interactive to it. Currently this is just an extension of our websocket API (not WebRTC, see the above post). Lots of hard-coded stuff atm, but I wanted to share some progress and get some feedback.

API Additions

I added a subscription, interactive, which requires a session ID. This will likely be the channel ID, since every channel will have its own session. I also added a mutation, sendInteractiveMessage, which sends custom data across the connection. It takes a sessionId, eventName, and message.

subscription {
	interactive(session: 1) {
		data,
		eventName,
		type
	}
}
mutation {
	sendInteractiveMessage(sessionId: 1, eventName: "yourCustomName", message: "This will need to be JSON at some point :)") {
		data,
		eventName,
		type
	}
}

This allows for easy communication between clients.
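
For instance, a dev could fire the mutation over plain GraphQL HTTP. The endpoint URL and auth header below are assumptions, not confirmed API details:

// Sketch: sending an interactive message from any GraphQL-capable client.
const query = `
  mutation {
    sendInteractiveMessage(sessionId: 1, eventName: "yourCustomName", message: "hello") {
      data, eventName, type
    }
  }`;

await fetch("https://glimesh.tv/api", { // assumed endpoint
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Client-ID YOUR_CLIENT_ID", // assumed header format
  },
  body: JSON.stringify({ query }),
});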

With this approach anyone can send a message to any session. There are good and bad things with that.

It’s very easy for devs to make sessions, and it prevents you from having to go through the OAuth process for every client (viewer). It also allows a third-party dev to act as the server for many streamers at the same time; I could see third-party services being created for this.

Once the core functionality is done we will have to look into a good way to share/distribute sessions.

The downside is anyone could spam the API with meaningless requests in an attempt to break a streamer’s session. I believe it is the dev/streamer’s responsibility to parse the data correctly, but we will need a way to check whether a request was sent from an “official” source.

Every client (viewer) needs to be able to send data back to the server. This is very hard to validate since it is sent client-side, and I don’t think there is much we can do about that. It’s likely impossible to tell if the viewer triggered an intended function or just opened the console and hit their keyboard :slight_smile:

We do need a method to know if a message was sent from the streamer’s server (an official source). This would likely be a property named isOfficial or similar. I think we can allow any user with a clientID to send messages (every viewer), but require the streamer’s server to use an accessToken if they want to send a message with isOfficial set. We can use the token to make sure they are the streamer (or acting on behalf of the streamer). I’m not sure this is the best approach, but I think it would work if no other ideas are presented.

Did some more work on this over the past few days. Here is some interactive confetti :confetti_ball:

Authentication

I think I have a solution for validating whether a request is “official” or not. Any user with a client ID (basically every dev; the view will likely contain the streamer’s clientID for each viewer) can send a request to any session. This will set the is_server prop to false. This is the easiest to set up and won’t require a bunch of OAuth requests for every user in the session.

is_server is just a placeholder name for an authorized request


Any user with an access token containing the new interactive scope will have the is_server property set to true. Clients subscribed to the API can check the property to know whether the request was “official”. This is up to the streamer to implement and is entirely optional.


This should be enough validation for us. The streamer can optionally implement additional safeguards if they choose to do so. This would be between their server and the clients.

Sending data

We can now send data with the data prop in the send_interactive_message mutation. It must be sent as a string containing the JSON data we want to send to Glimesh/clients. This should allow any data to be sent across the connection.
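
Since hand-escaping quotes inside a query string gets messy fast, stringifying on the client side is easier. A sketch, assuming standard GraphQL variable support (the variable types are guesses):

// Build the payload as real JSON, then stringify it for the data argument.
const data = JSON.stringify({ confetti: true, for: "Cykotiq" });

const mutation = `
  mutation Send($sessionId: ID!, $eventName: String!, $data: String!) {
    sendInteractiveMessage(sessionId: $sessionId, eventName: $eventName, data: $data) {
      data, eventName
    }
  }`;

const variables = { sessionId: 1, eventName: "newSub", data };
// POST { query: mutation, variables } exactly like the fetch sketch earlier in the thread.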


It would be great to have a way to resize or reposition the video player. This would allow you to have an interactive sidebar menu (I think beepbot did that), or make room for a buttons panel below the video player, rather than forcing it to be overlaid on the player.

Overlaying content on the player for a specific user only (such as confetti) seems like a super-cool user-experience option though, e.g. a timer could announce “It is your turn to control the game [Click Here]”, or you could even use it for follow/sub/dono alerts.

I always thought requiring the streamer to add an OBS overlay was a bit of a limitation, especially if the interactive app is not on the PC/device running the stream. On Mixer I remember console streamers being confused when their interactive bot wouldn’t work with their Xbox stream (the buttons would appear, but they had no effect on the stream).

Imagine an interactive experience with audio-playing capabilities, where viewers could mute sound effects or adjust music volume just for themselves. These experiences [via interactive] would leave the OBS recording clean for use on YouTube or for clips; you may not want follower/sub/dono alerts popping up constantly if you plan to turn the stream into an edited YouTube version.


I thought about resizing the player. It can be done but there are a few issues.

  1. The main page would have to be changed to utilize more phx event listeners. The iframe cannot be allowed direct access, so the request would be sent over the API. Not really a problem though, it’s just not a direct approach.
  2. I don’t want to encourage squishing the video to bring back button boards. Interactive is meant to bridge the gap between streamer and viewer, and making the video smaller isn’t visually appealing. In my opinion, button boards were a problem because nobody ever did anything more than buttons; if people see this, they will just create more button boards. Mixplay had mixed (heh) reactions. People already think we are related to Mixer, and this wouldn’t help matters. We would likely see exactly what Mixplay had before.

We can always overlay the buttons on the stream. We should also encourage users to add a hide button so nothing has to cover the stream: when you want to play a sound effect, open the menu, do so, and close it when you are finished. While that can be done within the session, all streams will have a per-user interactive toggle button. Nobody should be forced to use it if they don’t want to.


Currently we don’t know who a user is when they connect unless it’s via the streamer’s access token. The user would need to send their username or ID to your server. When you want to send something to a specific viewer, you would add their name as in the query below. The confetti was from an NPM library; when I write the tutorials for this I can include it in some examples. It’s not part of the API (maybe it should be, confetti is fun :slight_smile: ).

mutation {
	sendInteractiveMessage(sessionId: 1, eventName: "newSub", data: "{\"confetti\": true, \"for\": \"Cykotiq\"}") {
		data,
		eventName
	}
}

Removing the OBS overlay is an interesting thought. With my approach you wouldn’t need an overlay at all: you could include the sounds in your interactive project and play them locally per viewer. We may want to suggest that users offload their media to a CDN so we don’t have to host gigabytes of sound files.

I’m not sure if our sandboxing will allow an iframe in an iframe (browser alert box), but the user could always have their server listen for the event and broadcast it to all clients. A little more work, but the streamer could allow custom positioning, volume, even removing alerts altogether if the viewer wants. The streamer would have to code this, but it can definitely be done.


We need to decide how to store/distribute interactive creations.

Previous Implementations

Mixer would store the files for you. To start your session you had to have a server open a connection. I believe they did have a way to publish Mixplay boards, but I never saw anyone use it.

Twitch also stores the files. I don’t think an EBS (extension backend service, i.e. a server) is required. Twitch is very strict with interactive and requires all extensions to be manually reviewed. They have the ability to publish/distribute extensions on their website, and some extensions let you add them to your own stream from another stream.

The sharing process is pretty good but the creation process is not.

Glimesh Implementations

I see a few possible choices.

Option 1

The first is to allow the user to upload their files. This is likely the best option, and sessions can be manually reviewed. The downside is that hosting the files has an added cost. We can encourage users to upload all non-code files to a CDN or hosting service, but we will still have to store some files; I’m not sure how much that will cost.

Option 2

The other option is to let the user enter a GitHub Pages URL for their project. It is free for individuals, so there would be no cost to Glimesh or the streamer. Streamers could easily fork the repo to make their own version or “install” the session for themselves. GitHub probably has filters to prevent the really bad stuff from being uploaded. I don’t want to get into moderation just yet, but that will need to be considered.

Option 3

The last option is to let the user choose any URL. This has obvious problems and probably shouldn’t be considered: it gives full control to the streamer, but even with a URL blocklist it isn’t very safe.

I’d like to get some input on this.

What if the client (the streamer’s end) just sends the files upon session start and they are destroyed when the session ends?


That’s a really good idea! We will probably need another endpoint for uploading files. As for moderation, we can use the same approach as moderating streams: if someone sees a problem, it is reported and dealt with. When we get to distributing sessions we may have to change that, but for now it should be fine.

I’ll have to look into how we do file uploads; we have a lib called waffle, but I haven’t used it before. I’m thinking we ask the user to upload a folder with however many files their session requires. The only requirements would be an index.html page to serve and a size limit. We can ask users to upload non-code files to a CDN if they are above a certain size. Although storage costs won’t be a problem, the smaller the upload the faster the load times will be.

Loving this whole thread.

If I may, I would like to say that in my opinion, a retractable overlay is much better than any resizing of the stream video. Personally I kind of rely on the stream video remaining consistent in size and shape for my livehosting command, so I do have a bias. For all I know I may be missing the point of that part of the discussion.

My main focus is to get this finished as an overlay with an API layer for communication. The video size/position will not be part of the feature. If we can find a good way to justify it I would consider it, but right now I want the core features done. I wrote a little about that in a message above.

The overlay can be turned off via a button next to the follow button. We can likely let the user opt in or out of automatically loading interactive sessions via a setting.

Another interesting use-case for a user-specific (browser-side) overlay: shouting out streamers only to viewers that are not already following the streamer. Inspired by this:

Why should it be built into the platform, when you can have significantly more flexibility with a third party interactive app!
