The case for client-side encoding
Today everyone has a small supercomputer in their pocket. Some of those devices also ship with 4k cameras, and soon enough, real-time AR (augmented reality) capabilities.
YouTube was conceived in a completely different era, when personal computers were relatively slow and flip phones were the latest hotness. For a long time, encoding videos on a centralized server farm was the most logical solution.
In this post, we will explore the possibility of moving the encoding to client devices, and the benefits it offers over the cloud-based solutions of today.
Modern browsers and WebAssembly
WebAssembly is an emerging web standard for building high-performance web applications. It is in the process of being adopted by all major browser vendors, and once it is, developers will be able to port their high-performance C and C++ applications to the browser.
Here is an example of high-fidelity 3D rendering in the Firefox web browser, powered by WebAssembly and WebGL2.
It is not unreasonable to expect that we could not only perform efficient encoding and transcoding in a client-side browser app (using tools such as the highly optimized FFmpeg), but also develop a responsive video editor.
Having free, easy-to-use video editing software right in the browser would empower amateur creators, and would be a highly convenient option for shorter clips.
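To make the transcoding idea a little more concrete, here is a minimal sketch of how a browser app might prepare one FFmpeg job per output resolution. It only builds the argument lists; it does not run an encoder. The flags themselves (`-c:v libx264`, `-crf`, `-preset`, `-vf scale=...`, `-c:a aac`) are standard FFmpeg options, but the `Rendition` type, the file names, the CRF value and the assumption that a WebAssembly build of FFmpeg would accept the same command-line style arguments and include libx264 are ours, not something this post has verified.

```typescript
// Hypothetical description of one output rendition (illustrative only).
interface Rendition {
  label: string;  // e.g. "1080p"
  height: number; // target frame height in pixels
  crf: number;    // x264 constant rate factor (lower means higher quality)
}

// Build the FFmpeg argument list for one H.264 rendition.
// Assumes an FFmpeg build that includes libx264 and the AAC encoder.
function buildTranscodeArgs(input: string, r: Rendition): string[] {
  return [
    "-i", input,
    "-c:v", "libx264",
    "-crf", String(r.crf),
    "-preset", "medium",
    // scale=-2:H keeps the aspect ratio and rounds the width to an even number
    "-vf", `scale=-2:${r.height}`,
    "-c:a", "aac",
    `out_${r.label}.mp4`,
  ];
}

// Assumed rendition ladder; the CRF of 23 is a placeholder, not a tuned value.
const ladder: Rendition[] = [
  { label: "1080p", height: 1080, crf: 23 },
  { label: "720p", height: 720, crf: 23 },
  { label: "480p", height: 480, crf: 23 },
];

const jobs: string[][] = ladder.map((r) => buildTranscodeArgs("input.mov", r));
console.log(jobs.map((args) => "ffmpeg " + args.join(" ")).join("\n"));
```

Keeping the job description separate from the encoder call means the same ladder could be handed to whichever encoder backend ends up being available on the device.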
Encoding on mobile
Mobile devices like the iPhone have been capable of video rendering for a while (iMovie for iPhone was introduced in 2010). CPUs and GPUs in mobile devices have been improving at a rapid rate. Here is an example of improvements to the graphics fidelity in games.
It is safe to assume that modern mobile devices are perfectly capable of rendering and transcoding videos. Furthermore, modern frameworks such as Apple's Metal could enable us to leverage the parallel compute capabilities of mobile GPUs.
Reducing bandwidth costs and upload times
While computing capabilities have been improving exponentially, global internet speeds and available bandwidth have not. To make matters worse, many monopolistic telco providers impose outrageously small monthly bandwidth rations on their customers.
By encoding videos on the client devices, we can decrease both the bandwidth costs and the time it takes for videos to upload.
H.264
A 1-minute 4k video in its original format, as shot by the iPhone 7, takes 357 MB of space. With MPEG-4 avc1 (H.264) encoding, the file size is reduced to 177 MB. When the video is transcoded into multiple resolutions, the aggregate size is 245 MB, a bandwidth saving of roughly 31% over uploading the original.
Resolution | Size |
---|---|
Original 4k | 357 MB |
Encoded 4k | 177 MB |
Encoded 1080p | 38 MB |
Encoded 720p | 20 MB |
Encoded 480p | 10 MB |
Encoded All | 245 MB |
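The arithmetic behind the table can be checked with a few lines of code. This is just a worked calculation using the sizes listed above; the variable names are ours.

```typescript
// Sizes in MB, copied from the table above.
const originalMB = 357;
const encoded: { label: string; mb: number }[] = [
  { label: "4k", mb: 177 },
  { label: "1080p", mb: 38 },
  { label: "720p", mb: 20 },
  { label: "480p", mb: 10 },
];

// Total upload size when every rendition is encoded on the client.
const totalMB = encoded.reduce((sum, r) => sum + r.mb, 0);

// Fraction of the original upload size that is saved.
const saving = (originalMB - totalMB) / originalMB;

console.log(`total: ${totalMB} MB, saving: ${(saving * 100).toFixed(1)}%`);
// prints "total: 245 MB, saving: 31.4%"
```

Note that the 4k rendition alone accounts for 177 of the 245 MB, so the lower resolutions add comparatively little to the upload.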
VP9
VP9 is a royalty-free codec, created by Google as an alternative to the patent-encumbered H.265. It offers a further reduction in encoded file size while retaining or improving visual fidelity.
Towards server-less architecture
Moving the video encoding step to the client will allow us to remove the dependence on centralized upload servers. After the encoding step, client applications can publish the videos through a thin JSON-RPC wrapper talking to one or more full blockchain nodes. Thanks to the recent addition of WebRTC support in js-ipfs, clients could seed their videos to the other nodes on the network, right from the browser.
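As a rough illustration of what such a thin wrapper might send, here is a sketch that builds a JSON-RPC 2.0 request body. The envelope fields (`jsonrpc`, `method`, `params`, `id`) come from the JSON-RPC 2.0 specification; the `publish_video` method name, the `PublishParams` shape and the placeholder content hashes are hypothetical and do not describe an actual node API.

```typescript
// Envelope of a JSON-RPC 2.0 request object.
interface JsonRpcRequest<P> {
  jsonrpc: "2.0";
  method: string;
  params: P;
  id: number;
}

// Hypothetical parameters for publishing a video; not a real node API.
interface PublishParams {
  title: string;
  renditions: { label: string; contentHash: string }[];
}

let nextId = 1;

// Wrap a method name and its parameters in a JSON-RPC 2.0 envelope.
function makeRequest<P>(method: string, params: P): JsonRpcRequest<P> {
  return { jsonrpc: "2.0", method, params, id: nextId++ };
}

// "publish_video" and the hash placeholders are made up for this example.
const request = makeRequest<PublishParams>("publish_video", {
  title: "My first clip",
  renditions: [
    { label: "1080p", contentHash: "<hash-of-1080p-file>" },
    { label: "720p", contentHash: "<hash-of-720p-file>" },
  ],
});

// The serialized body a client would send to a node.
const body: string = JSON.stringify(request);
console.log(body);
```

In this sketch only the content hashes travel through the node; the video files themselves would be seeded separately, which is what keeps the wrapper thin.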
Join us on Telegram
If you're passionate about empowering creative people and their fans, now is the time to join the discussion.
Telegram Group: https://t.me/viewly
Sign up for updates on https://view.ly
Great project... i am glad to see your name @furion on core team list of Viewly.... wish you all the success with this revolutionizing project.
This does sound brilliant, but would you not require many transcodings for the different browser formats and adaptive streaming, and thus significantly increase bandwidth requirements between the Viewly node and the network when uploading (essentially each video would need to be uploaded several times)? If so, from an average end-user's perspective, that may not go down so well.
Still there has to be some price for freedom, and I'm happy with slightly longer uploads!
Is x264 not under any threat due to H.264 patents? Would that also need to be a consideration for you guys if you were developing an encoder?
You are exactly right. However, it turns out that even with transcoding into multiple formats, the total upload size is often smaller than the original, due to the efficiency of the encoding itself.
x264 is quite common, and we do use it (implicitly). I am really looking forward to royalty free H265 adaptations (VP9 et al).
Oh yes I see, I hadn't thought of that!
So should ffmpeg and x264 compile into WebAssembly too?
I hope to join the discussion on Telegram when I've some more time.
This is very interesting progress from a technical point of view... but what about legal challenges? The owners of the nodes, and viewly as a whole will be held accountable for the content that they allow to be broadcasted, going from copyright infringement to much darker content... Are you already working on preventing this kind of content or will you wait till legislators say something?
Excellent point. I am preparing a writeup to address this topic head-on.
In short, a transparent, opt-in system will be put in place to allow for crowdsourced self-regulation. Compliant nodes will have the ability to drop the infringing and questionable content.
Hi @furion - I have read your post but coming from the older generation - a lot of it is greek to me. However, I am beyond excited about a replacement for YouTube and look forward to uploading many videos in the future.
Have upvoted and resteemed
I have tried to upload to viewly but there wasn't any way to connect it with my Steemit account or create a Viewly account.
Account support is coming. It's a bit tricky, since we are not yet sure how to handle it.
1.) Should we have a centralized account system like every other website, and overlay a loose blockchain integration for payments?
2.) Should we go with the steemit model, and handle accounts and their private keys in the web app itself?
I can't really say what would be better. But I am very interested in further Steem revenue streams. That would be awesome! I love your project and I want to support it. It does remind me of LBRY though. Have you heard about it?
Yup, LBRY is cool.
Something you failed to address was battery life. If clients are doing the encoding themselves, their batteries are not going to last as long. This could possibly even prevent them from watching the video.
As far as encoding/uploading goes, I assume people usually do it when they are connected to wifi, which means they are also probably indoors and have access to power.
For video playback, we plan to integrate DASH, so mobile devices would stream lower resolution/bitrate video, which consumes less bandwidth, needs less antenna usage and takes less CPU time to decode.
Have you looked into how this could impact the battery life of mobile devices? I would assume that transcoding locally is more CPU/GPU intensive than just uploading.
Answered in reply to @marcusorlyius.
Thanks.
Really rooting for Viewly, and this is a creative way to approach one of its biggest outstanding problems. Long term this will probably be a good way to go, but in the nearish term it feels like a bit of wishful thinking for a few reasons.
Neither of these things are deal breaker size in and of themselves, but together they're just worse than existing platforms. Usually you can't compete against an incumbent unless you're 10x better. This feels like 2x worse.
Luckily, I do think there are ways to solve this with decentralized infrastructure. I think the eventual path for this will really be using a completely different decentralized system that the user is never aware of in order to do this processing, but it would be paid for by a fraction of the steem value given to the content. Whatever agrees to do the processing before the content has views is taking on the liability of the cost up-front though which creates a huge hurdle.
Can't wait to see what you come up with, or if doing this client side will actually just work out for now (would be cool if that's the case).
In a decentralized system, where immutability guarantees are in place, re-encoding is not possible regardless of clients' willingness to re-encode.
As far as battery life is concerned, it's like playing a 3D game. During encoding, a fair amount of battery power will be consumed, so you're probably going to want to plug your device in for long clips. Typically, you will also want to upload long clips on wifi, likely near a power source, so I don't see where the problem is here.
Most streaming/video apps already do partial, if not full encoding on the device.
Really interesting article. I would like to see what WA can do for a new generation of front-end web applications. Thinking of how it compares to Reactjs. Essentially we could code in any language: js, c and compile down to machine code.
#GameChanger