The case for client-side encoding

in #viewly, 7 years ago

Today everyone has a small supercomputer in their pocket. Some of those devices also ship with 4k cameras, and soon enough, real-time AR (augmented reality) capabilities.

YouTube was conceived in a completely different era, when personal computers were relatively slow and flip phones were the latest hotness. For a long time, encoding videos on a centralized server farm was the most logical solution.

In this post, we will explore the possibilities of moving the encoding to client devices, and the benefits this offers over the cloud-based solutions of today.

Modern browsers and WebAssembly

WebAssembly is an emerging web standard for building high-performance web applications. It is in the process of being adopted by all major browser vendors, which will allow developers to port their high-performance C and C++ applications to the browser.

Here is an example of high-fidelity 3D rendering in the Firefox web browser, powered by WebAssembly and WebGL2.

It is not unreasonable to expect that we could not only perform efficient encoding and transcoding in a client-side browser app (using tools such as the highly optimized FFmpeg), but also develop a responsive video editor.

Having free, easy-to-use video editing software right in the browser would empower amateur creators, and would be a highly convenient option for shorter clips.

Encoding on mobile

Mobile devices like the iPhone have been capable of video rendering for a while (iMovie for iPhone was introduced in 2010). CPUs and GPUs in mobile devices have been improving at a rapid rate. Here is an example of improvements to the graphics fidelity in games.

It is safe to assume that modern mobile devices are perfectly capable of rendering and transcoding videos. Furthermore, modern frameworks such as Apple's Metal could enable us to leverage the parallel compute capabilities of mobile GPUs.

Reducing bandwidth costs and upload times

While computing capabilities have been improving exponentially, global internet speeds and available bandwidth have not kept pace. To make matters worse, many monopolistic telco providers impose outrageously small monthly bandwidth allowances on their customers.

We can decrease the bandwidth costs, as well as the amount of time it takes for videos to upload, by encoding videos on the client devices.
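As a rough sketch of what a client-side encoding step might run, the snippet below builds FFmpeg command lines for a set of H.264 renditions. The options used (-i, -vf scale, -c:v libx264, -crf, -preset, -c:a aac) are standard FFmpeg options, but the list of resolutions, the CRF value, the preset and the output file naming are illustrative assumptions, not Viewly's actual settings.

```python
from pathlib import Path

# Hypothetical rendition ladder (target heights in pixels); not Viewly's real settings.
LADDER = [2160, 1080, 720, 480]


def ffmpeg_command(source: str, height: int, crf: int = 23, preset: str = "medium") -> list[str]:
    """Build (but do not run) an FFmpeg command for one H.264 rendition."""
    output = f"{Path(source).stem}_{height}p.mp4"  # assumed naming scheme
    return [
        "ffmpeg",
        "-i", source,
        # scale=-2:H keeps the aspect ratio and rounds the width to an even
        # number, which H.264 with 4:2:0 chroma subsampling needs.
        "-vf", f"scale=-2:{height}",
        "-c:v", "libx264",
        "-crf", str(crf),      # constant-quality mode; lower means better quality
        "-preset", preset,     # speed vs. compression trade-off
        "-c:a", "aac",
        output,
    ]


def ladder_commands(source: str) -> list[list[str]]:
    """One command per rendition in the ladder."""
    return [ffmpeg_command(source, height) for height in LADDER]


if __name__ == "__main__":
    for command in ladder_commands("clip.mov"):
        print(" ".join(command))
```

On a desktop client the resulting lists could be passed to subprocess.run; in a browser, the same arguments would be handed to whatever FFmpeg build has been compiled to WebAssembly.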

H.264

A 1-minute 4k video in its original format, as shot by the iPhone 7, takes 357 MB of space. With H.264 (MPEG-4 AVC, avc1) encoding, the file size is reduced to 177 MB. Transcoded into multiple resolutions (4k, 1080p, 720p and 480p), the aggregate size is 245 MB, a bandwidth saving of about 31% compared to uploading the original.

Resolution        Size
Original 4k       357 MB
Encoded 4k        177 MB
Encoded 1080p     38 MB
Encoded 720p      20 MB
Encoded 480p      10 MB
Encoded (all)     245 MB
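The arithmetic behind the table can be checked in a few lines. The file sizes are the ones listed above; the 10 Mbit/s uplink used for the upload-time estimate is an assumed figure for illustration only.

```python
# File sizes in MB, taken from the table above.
ORIGINAL_MB = 357
ENCODED_MB = {"4k": 177, "1080p": 38, "720p": 20, "480p": 10}


def total_encoded_mb(sizes: dict[str, int]) -> int:
    """Aggregate size of all encoded renditions."""
    return sum(sizes.values())


def saving_percent(original_mb: float, uploaded_mb: float) -> float:
    """Bandwidth saved by uploading the encoded files instead of the original."""
    return 100.0 * (original_mb - uploaded_mb) / original_mb


def upload_seconds(size_mb: float, uplink_mbps: float) -> float:
    """Upload time for size_mb megabytes over an uplink of uplink_mbps megabits/s."""
    return size_mb * 8 / uplink_mbps


if __name__ == "__main__":
    total = total_encoded_mb(ENCODED_MB)
    print(total)                                          # 245
    print(round(saving_percent(ORIGINAL_MB, total), 1))   # 31.4
    # Assumed 10 Mbit/s uplink, ignoring protocol overhead.
    print(upload_seconds(ORIGINAL_MB, 10))                # 285.6
    print(upload_seconds(total, 10))                      # 196.0
```

Under that assumed uplink, uploading all four encoded renditions takes about a minute and a half less than uploading the original file once.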

VP9

VP9 is an open, royalty-free video codec created by Google as an alternative to the patent-encumbered H.265 (HEVC). Compared to H.264, it offers a further reduction in encoded file size while retaining or improving visual fidelity.

Towards server-less architecture

Moving the video encoding step to the client will allow us to remove the dependence on centralized upload servers. After the encoding step, client applications can publish the videos through a thin JSON-RPC wrapper talking to one or more full blockchain nodes. Thanks to the recent addition of WebRTC support in js-ipfs, clients could seed their videos to other nodes on the network, right from the browser.
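To make the "thin JSON-RPC wrapper" idea more concrete, here is a minimal sketch of how a client might serialize a publish request, following the JSON-RPC 2.0 request format. The method name publish_video and its parameters are hypothetical placeholders; the post does not specify the actual node API.

```python
import json
from itertools import count

# Monotonically increasing request ids, so responses can be matched to requests.
_request_ids = count(1)


def jsonrpc_request(method: str, params: list) -> str:
    """Serialize a JSON-RPC 2.0 request object to a JSON string."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    })


if __name__ == "__main__":
    # "publish_video" and the fields below are hypothetical, for illustration only.
    payload = jsonrpc_request(
        "publish_video",
        [{"title": "My clip", "ipfs_hash": "<content hash>"}],
    )
    print(payload)
```

The resulting string would be sent to a full node over HTTP or WebSocket; the transport is left out here because it depends on how the node exposes its API.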


Join us on Telegram

If you're passionate about empowering creative people and their fans, now is the time to join the discussion.

Telegram Group: https://t.me/viewly
Sign up for updates on https://view.ly


Great project... i am glad to see your name @furion on core team list of Viewly.... wish you all the success with this revolutionizing project.

This does sound brilliant, but would you not require many transcodings for the different browser formats and adaptive streaming, and thus significantly increase bandwidth requirements between the Viewly node and the network when uploading (essentially each video would need to be uploaded several times)? If so, from an average end user's perspective, that may not go down so well.

Still there has to be some price for freedom, and I'm happy with slightly longer uploads!

Is x264 not under any threat due to H.264 patents? Would that also need to be a consideration for you guys if you were developing an encoder?

You are exactly right. However, it turns out that even with transcoding into multiple formats, the total upload size is often smaller than the original, due to the efficiency of the encoding itself.

x264 is quite common, and we do use it (implicitly). I am really looking forward to royalty-free alternatives to H.265 (VP9 et al).

Oh yes I see, I hadn't thought of that!

So should ffmpeg and x264 compile into WebAssembly too?

I hope to join the discussion on Telegram when I've some more time.

This is very interesting progress from a technical point of view... but what about legal challenges? The owners of the nodes, and viewly as a whole will be held accountable for the content that they allow to be broadcasted, going from copyright infringement to much darker content... Are you already working on preventing this kind of content or will you wait till legislators say something?

Excellent point. I am preparing a writeup to address this topic head-on.

In short, a transparent, opt-in system will be put in place to allow for crowdsourced self-regulation. Compliant nodes will have the ability to drop the infringing and questionable content.

Hi @furion - I have read your post, but coming from the older generation, a lot of it is Greek to me. However, I am beyond excited about a replacement for YouTube and look forward to uploading many videos in the future.

Have upvoted and resteemed

I have tried to upload to viewly but there wasn't any way to connect it with my Steemit account or create a Viewly account.

Account support is coming. It's a bit tricky, since we are not yet sure how to handle it.
1.) Should we have a centralized account system like every other website, and overlay a loose blockchain integration for payments?
2.) Should we go with the Steemit model, and handle accounts and their private keys in the web app itself?

I can't really say what would be better. But I am very interested in further Steem revenue streams. That would be awesome! I love your project and I want to support it. It does remind me of LBRY though. Have you heard about it?

Yup, LBRY is cool.

Something you failed to address was battery life. If clients are doing the encoding themselves, their batteries are not going to last as long. This could possibly even prevent them from watching the video.

As far as encoding/uploading goes, I assume people usually do it when they are connected to wifi, which means they are also probably indoors and have access to power.

For video playback, we plan to integrate DASH, so mobile devices would stream lower resolution/bitrate video, which consumes less bandwidth, uses the antenna less and needs less CPU time for decoding.
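To illustrate the idea, a DASH-style player picks a rendition based on the throughput it measures. The sketch below is a simplified throughput-based heuristic, not the logic of any particular player; the bitrates and the 0.8 safety factor are assumed values for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    bitrate_kbps: int  # assumed average bitrate; illustrative values only


RENDITIONS = [
    Rendition("480p", 1_000),
    Rendition("720p", 2_500),
    Rendition("1080p", 5_000),
    Rendition("4k", 20_000),
]


def pick_rendition(measured_kbps: float, safety: float = 0.8) -> Rendition:
    """Pick the highest-bitrate rendition that fits the measured throughput.

    Only a fraction (safety) of the measured throughput is budgeted, to leave
    headroom for fluctuations. If nothing fits, fall back to the lowest bitrate.
    """
    budget = measured_kbps * safety
    fitting = [r for r in RENDITIONS if r.bitrate_kbps <= budget]
    if not fitting:
        return min(RENDITIONS, key=lambda r: r.bitrate_kbps)
    return max(fitting, key=lambda r: r.bitrate_kbps)


if __name__ == "__main__":
    print(pick_rendition(4_000).name)    # 720p (budget is 3200 kbit/s)
    print(pick_rendition(500).name)      # 480p (fallback, nothing fits)
```

A phone on a slow mobile connection would end up on the 480p stream, which is where the bandwidth and decoding savings come from.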

Have you looked into how this could impact the battery life of mobile devices? I would assume that transcoding locally is more CPU/GPU intensive than just uploading.

Answered in reply to @marcusorlyius.

Really rooting for Viewly, and this is a creative way to approach one of its biggest outstanding problems. Long term this will probably be a good way to go, but in the nearish term it feels like a bit of wishful thinking for a few reasons.

  • Not all clients can handle the load. It would crush battery life of mobile devices. You'd have to re-encode many times to support all the formats users might want to play. It's just so cumbersome
  • It blocks your future proofing as far as any formats you want to support in the future. Sometimes centralized video services do a full pass over all their content to support something new or optimize things. You can't go back in time and ask clients to re-encode everything to something new

Neither of these things is a deal breaker in and of itself, but together they're just worse than existing platforms. Usually you can't compete against an incumbent unless you're 10x better. This feels like 2x worse.

Luckily, I do think there are ways to solve this with decentralized infrastructure. I think the eventual path for this will really be using a completely different decentralized system that the user is never aware of in order to do this processing, but it would be paid for by a fraction of the steem value given to the content. Whatever agrees to do the processing before the content has views is taking on the liability of the cost up-front though which creates a huge hurdle.

Can't wait to see what you come up with, or if doing this client side will actually just work out for now (would be cool if that's the case).

In a decentralized system, where immutability guarantees are in place, re-encoding is not possible regardless of clients' willingness to re-encode.

As far as battery life is concerned, it's like playing a 3D game. During encoding, a fair amount of battery power will be consumed, so you're probably going to want to plug your device in for long clips. Typically, you will also want to upload long clips on wifi, likely near a power source, so I don't see where the problem is here.

Most streaming/video apps already do partial, if not full encoding on the device.

Really interesting article. I would like to see what WA can do for a new generation of front-end web applications. Thinking of how it compares to React.js. Essentially we could code in any language (JS, C) and compile down to machine code.

#GameChanger
