Refreshing the Blender VSE

Hey,
The user base of the VSE isn't particularly big, even among Blender users - which I think is a pity, because I see so much potential for it to become an extremely powerful tool for editors, and I'd love to see it succeed!

There are some really awesome projects going on, for example the power-sequencer add-on from @nathan_gdquest, the VSE transform tools from @doakey3, or the Blender Velvets.

However, I believe that to make the VSE a truly powerful feature, we'd have to come up with some completely new concepts and mechanics and bring them into Blender itself.
Over the past week I've been thinking about this, and I came up with the idea of Compositions: using a node-based editor for most operations, such as color grading, transformations, and compositing.

I made a post explaining my idea on rightclickselect. I’d love to know what you think!

Thanks for the ping! There are some solid ideas there, but they're hard to discuss at the moment, as the VSE still lacks a maintainer or active developers working on it.

There were plans for "everything nodes" in Blender, right? That would also make sense for the Sequencer. At the same time, the Sequencer's playback engine is a bottleneck at the moment: with proxies larger than 480×270 you often can't get smooth playback. Between that, how easy it is to mess up a project's frame rate, and the lack of playback speed control, I don't see many editors and VFX people willing to work with Blender's Sequencer, especially with a free beast like Resolve in front of it.

Thanks, @nathan_gdquest!
The question then is: how do we get developers to work on the VSE? One possibility is to get users so hooked on the ideas that there is a bigger demand for it. Blender will have to offer features that programs like Resolve don't have (which is pretty hard, tbh), because "open source" is not really enough for most users.
One of them could be a truly tight integration with the 3D workspace. I was thinking about the possibility of loading compositions as input nodes for materials. For example, you could easily green-screen video clips and add them as planes in your 3D scene. Or create TV videos without rendering them first, or any other kind of complex video texture. Or link trackers from the VSE to 3D objects in the scene, so that you could move the scene strip and the video strip while keeping the track in place.
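
For reference, part of this already works today: a movie file can drive a material through an Image Texture node. Here is a rough bpy sketch (the file path and names are just examples, and it assumes the default Principled BSDF material from 2.8):

```python
import bpy

# Rough sketch: play a video file on a material via an Image Texture node.
# The path and names below are examples only.
mat = bpy.data.materials.new("VideoScreen")
mat.use_nodes = True
nodes, links = mat.node_tree.nodes, mat.node_tree.links

tex = nodes.new("ShaderNodeTexImage")
tex.image = bpy.data.images.load("//footage/clip.mp4")  # example path
tex.image.source = 'MOVIE'
tex.image_user.frame_duration = 250      # frames to play back
tex.image_user.use_auto_refresh = True   # update the texture on frame change

# Wire the video into the default Principled BSDF (2.8-style material).
links.new(tex.outputs["Color"], nodes["Principled BSDF"].inputs["Base Color"])
```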

I guess I'll create some more mock-ups and share them sometime soon, so people can see how the VSE could become a really handy tool for 3D artists, with video editors benefiting along with them.

In my experience working in open source, it's continuous development, improvements, fixing core issues, and communication around the tools that "hooks" users: seeing that the program fits their needs and seeing their issues get fixed over time. There are already good ideas on right click select, but no one to work on them.

Don’t get me wrong: I work with the sequencer daily and I’d love to use the compositor. I just feel there are foundational issues that’ll need to be addressed before you get users hooked.

Absolutely! The real-time compositor would be the biggest "obstacle" in my opinion. But this feature wouldn't only benefit VSE users, but also VFX artists, game devs, and regular 3D artists who could comp in different parts of the image with real-time feedback from the 3D viewport. I think this could really "complete" the idea of Eevee.

Thank you!

Recently, I did some very simple editing of 4K video in the Blender VSE. One of the biggest problems I observed was also playback speed. Even on a really fast computer (AMD Ryzen 7 1800X, NVMe SSD …), there is no way to get real-time playback as soon as you add a 'strip modifier'. You have to use lower-resolution MJPEG proxies, which in my case ruined the color of my clips (most probably because my original footage was 10-bit and MJPEG only supports 8-bit, as far as I know).

I also really support the idea of having something like video clip nodes. I would suggest a two-step solution for this: first, instead of the 'strip modifiers', use nodes to modify each single clip; and second, a node setup for the whole sequence of clips. In a two-step color workflow, you could use the per-clip nodes for color correction (correcting each clip to match the others), or you could apply video stabilization to a clip. The sequence nodes can then be used to apply a general look to a scene or movie (e.g. color grading).

Some other ideas for improving Blender for video work:

  • Use CineForm as an intermediate codec instead of MJPEG for proxies (see the rough sketch after this list).
  • Add some general video playback functionality to Blender instead of having it only in the VSE. Maybe integrate it into the 'UV/Image editor', where we already have proper scopes.
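
As a stopgap, such proxies can already be generated outside Blender. A rough sketch driving ffmpeg from Python (ProRes is only an example of a 10-bit intermediate codec here, and the resolution and paths are illustrative):

```python
import subprocess
from pathlib import Path

def make_proxy(src: Path, out_dir: Path, height: int = 540) -> Path:
    """Transcode a clip to a small 10-bit ProRes proxy (example settings)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    dst = out_dir / (src.stem + "_proxy.mov")
    subprocess.run([
        "ffmpeg", "-y", "-i", str(src),
        "-vf", f"scale=-2:{height}",             # downscale, keep aspect ratio
        "-c:v", "prores_ks", "-profile:v", "1",  # ProRes LT, 10-bit
        "-c:a", "copy",
        str(dst),
    ], check=True)
    return dst

# e.g. make_proxy(Path("footage/clip01.mov"), Path("proxies"))
```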

As I am a developer, I did look into the VSE code, but I have to say that it is not easy to understand and looks very difficult to work with. I am not sure whether it would be better to rewrite the whole sequencer and have it match and integrate better with the rest of Blender's code and functionality. I really would like to improve video editing inside Blender, but I do not have much time for this, and something like rewriting the VSE would require weeks of full-time work. To do it properly, someone who knows the details of Blender's internal architecture would need to be involved.

The VSE is orphaned right now: it's in "maintenance only" mode, there are no developers assigned to it, and AFAIK nobody wants to touch the code with a ten-foot pole and rubber gloves either. The VSE was developed mostly by one coder (Peter Schlaile), who has been inactive for quite a few years and developed it on a "features he needed" basis.

A full rewrite will be required, but the only person with knowledge of its internals nowadays (Sergey Sharybin) is quite busy with Blender's internals and will remain so for the foreseeable future.

A GSoC is not a good time frame either. Realistically, fixing or rewriting the VSE will probably take at least 1-2 years with a full-time developer, since it touches many parts of Blender (the tracking system, video editor, 3D geometry, FFmpeg, compositor, Python, plugins, color management, etc.).

And I don't think Patreon or donations will be enough to pay for it (seeing the disappointing results of some developers trying to raise money for more widely used features).

The best bet is to make a formal, full document explaining the entire VSE situation, what is needed and what needs to be changed/optimized, and ask the BF how much money is required to do that (and let Ton & Co. find a suitable developer, or lure one with the power of money). A forum is not a good place to do it (even though they READ the forums).

Don't expect any open source developer to do this without money and plenty of time. The code is quite complex, it involves lots of parts of Blender, and doing it right will be HARD.

All of the above, IMHO only.

My idea is to use "compositions" instead of modifiers. That is, you basically assign compositions to video strips. The strips appear in the node editor as nodes, and you can start grading/masking/stabilizing them. Primary/secondary grading would then be quite simple: you create a composition for each strip in your scene and then one grade that is applied to all of your strips. This way, you can very easily change the grade of a scene without having to copy any grades to the strips.
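
To make that concrete, here is a purely hypothetical sketch of how this could look from Python. None of the per-strip attributes below exist in bpy today; it's only meant to illustrate the proposed workflow:

```python
import bpy

# Purely hypothetical sketch -- "composition" and "append_group" on strips
# do not exist in bpy today; this only illustrates the proposed workflow.
seq = bpy.context.scene.sequence_editor
shared_grade = bpy.data.node_groups.new("SceneGrade", "CompositorNodeTree")

for strip in seq.sequences_all:
    if strip.type == 'MOVIE':
        comp = strip.composition          # hypothetical per-strip node tree
        comp.append_group(shared_grade)   # hypothetical shared grade reuse
```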

I see, thank you for your insight.
This would be quite a complex project to pull off, as it wouldn't only require the Sequencer to be rewritten; the compositor would also have to be changed at a fundamental level, as would many features like masking and tracking. Proper audio editing is also something that barely exists in Blender yet. But I do believe many users could greatly benefit from this: it'd make VFX work so much easier, animators could create their whole films within the same program, and colorists and editors would even get a serious FOSS alternative.

Anyway, how should I continue? How do I make that formal document? And do you think there is interest in something like this within the BF?

Yes, I think you are right. The compositor basically already has most of the things you need for video work. You just need a way to create multiple composites and use them in the VSE, and of course speed up compositing or at least allow caching of rendered frames. Basically, that is what I meant by 'per-clip' nodes.
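
For what it's worth, something close to this can already be approximated today by giving each clip its own scene (each scene gets its own compositor tree) and adding Scene strips to the sequencer. A minimal sketch (scene and strip names are just examples):

```python
import bpy

# One scene per clip, each with its own compositor node tree,
# then a Scene strip in the editing scene's sequencer.
edit_scene = bpy.context.scene
edit_scene.sequence_editor_create()

shot = bpy.data.scenes.new("Shot_010")   # example name
shot.use_nodes = True                    # enable that scene's compositor tree
# ... build the per-clip correction in shot.node_tree here ...

edit_scene.sequence_editor.sequences.new_scene(
    name="Shot_010", scene=shot, channel=1, frame_start=1)
```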

First, thanks for the details on the VSE. I do agree that reworking the VSE is not a small project, but if done right, I think many parts of Blender would benefit from it. Blender already has some really nice features for video and VFX work, so why not make it feature-complete for video work? Maybe it is possible to split this into several sub-projects, for example: improve compositing, add some proper video playback functionality to Blender, etc.

As cool as the proposal sounds, I think the best solution for now is to deprecate/remove the VSE altogether, like the old BGE. The current VSE is of course functional, but pretty basic. It also has many idiosyncrasies that make it not exactly user-friendly, especially if you are used to other video editors. Blender is not primarily a video editor, and IMO the focus should be on its 3D core.

I really agree with that. Yes, it is true that Blender's focus is to be a 3D environment. But that also includes 3D VFX, and that could be drastically improved with a new compositor and sequencer working in conjunction. Blender is currently confusing when it comes to keyframes with video files: you often need to set offsets in multiple places in your project to keep everything in sync.

The VSE is still the best free video editor on Linux … as far as I know. Shotcut is good on Windows, but it's really hard to install on Linux for some reason, so I can't really recommend it. For this reason, I'm glad the VSE is getting some love. The idea of connecting it to nodes better would be great - it would make green-screening in the editor possible, for a start.

You might be interested in the free version of DaVinci Resolve, which I think goes up to HD and also includes Fusion node compositing. It runs on Linux. :)

Upgrading to the Studio version (4K and more features) seems pretty cheap.

IMHO, getting this kind of capability into the Blender VSE would take years and a lot of resources, even if everyone were screaming for it and they decided to give it some love.

From my perspective, having done a lot of post-production work, post-production editing (and in particular sound) is really in its own class. There are plenty of inexpensive alternatives out there, like Audacity (free) and Reaper (basically free if you choose).

You can put together some decent post-production solutions these days for free.

Additionally, Lightworks has a free version and runs on Linux. It has serious limitations in my opinion, but it is an option for some.

Yes, there is DaVinci Resolve and it would do the job, but it has some points I do not like, one of them being that it is not open source. If I want to add some minor feature to Blender, I can do so; with a closed-source application you have to live with what it can do.

But maybe it really would be best to just remove the old VSE from Blender and then think about a better solution, like they did with the game engine.

Integrating nodes into the VSE is an idea that has been around for a long, long, long time. Ton has been against it forever because, frankly, he is absolutely right to believe it is dumb as hell. I can't find the original discussions in the tracker, but the ideas have been around since before 2007 at least.

For those of you that aren’t aware just how dumb it is, it is wise to think in terms of a “ground truth”. That is, focus on what an NLE should do and what the needs are. To avoid hypotheticals, we could look at something like the Institute’s Agent project as a reference point.

The point of an NLE could vary over such a project. At first, it could be used for previz with still images to block out an animation. Later on, it could be used to bring in updated pre-renders to test the timing and pacing. Towards the end, it could be used to assemble (conform) the final project. So what are those needs?

In the end, an NLE or shot/strip view is absolutely essential to deliver a time-based presentation of the work. That means performance is, above all, the most important thing. With timing being the greatest need for blocking things in, moving them around, testing edits, and evaluating rough ideas, trade-offs will need to be made. Some of those might be:

  • Spatial density. At one time, a 2K frame buffer was an issue. Now it might be 4K. Or 8K.
  • Bit depth. Remember that Agent was rendered using EXRs. That's 32 bits per channel per pixel. That's a large amount of data, so some concession here to hit an accurate frame rate might be required (see the rough numbers after this list).
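
To put rough numbers on that second point (back-of-the-envelope only, assuming uncompressed full-float RGBA frames):

```python
# Back-of-the-envelope data rate for uncompressed 32-bit float RGBA frames.
width, height = 3840, 2160          # UHD "4K"
channels, bytes_per_channel = 4, 4  # RGBA, 32-bit float
fps = 24

frame_bytes = width * height * channels * bytes_per_channel
print(frame_bytes / 2**20)        # ~126.6 MiB per frame
print(frame_bytes * fps / 2**30)  # ~3.0 GiB/s just to play back in real time
```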

Given the need to meet the most important design consideration, we have trade-offs being made. But what about quality? We would expect that a hasty adjustment made in the temporary proxy (historically known as an "offline") would need to reflect the final output (historically the "online") closely enough to permit creative decisions that translate to the final output. How can that be done when most things, like the colour manipulations on Agent, need to be done in a 32-bit float environment, featuring potentially hundreds of manipulations / adjustments / etc.?

The answer is that it can't, and hardware will never "catch up". This is where the concept of nodes in an NLE falls apart. The idea that you can have both the most important design goal, performance, and full quality falls apart. It just can't happen without a rendering phase in the chain, or without a good number of carefully calculated and balanced trade-offs between the offline and online versions.

There is a balancing act required between the offline representation and the online. Hopefully the offline adjustments one wishes to make as a blueprint translate over into the other tools. The short version is that this problem with NLEs was solved long ago via the offline and online approach. What everyone is seeing with Eevee and Cycles is essentially the same design pattern: a rendering engine for an offline where speed is paramount for creative decision-making, and an online rendering engine where, after the critical decisions have been made, the highest-quality output can be generated.

The TL;DR is that any “next generation” VSE should mirror that design, as it has worked tremendously well for a long, long while.

Personally, I have used Vegas Pro since version 2, before it had video. So that would be 18 years.

As an audio program it was initially on a par with Pro Tools. Well, not exactly, but it got the job done, and at a much more affordable price. Additionally, it had innovative features and a unique approach to NLE work. They brought this same functionality to video, and at the time these approaches were not widely used. Since then, almost all NLE products have adopted these features or some version of them.

But what none of these video solutions has ever had is a robust, capable, and very good-sounding audio program integrated.

You had to edit your video with limited audio, and to really get into serious audio mixing and effects you had to use something like Pro Tools.

Vegas changed that. At least for those who used it. It took a lot of years and upgrades for Vegas to get the attention of professional editors. But it eventually happened.

There is an old maxim about indie film production: sadly, audio is the last consideration. It starts on the set with poor sound recording techniques and equipment, no regard for prepping the space for recording sound, and no budget for ADR when the inevitable on-set issues render the sound unusable.

It is interesting that people want to claim Blender as an end-to-end solution. Without sound, this is impossible.

Secondly, the VSE is light years behind even Sony Vegas Video 3 from 2001. It's not even on the map.

But purely as an animation tool, even if you just splice your shots together in the VSE and use Audacity, you have an end-to-end solution. That's fine.

But in the video editing world this would best be described as a joke, considering the free or cheap alternatives.

But in an animation pipeline, the VSE is an integral part of the process, from animatic and lip-sync to final cut, even if you eventually edit the video in an external app.

So the VSE cannot be removed without another immediate solution to replace it, even if the replacement is equally primitive in its first version.

So let's assume that the VSE is riddled with issues even for an animation pipeline. I have not used it in a large production yet, but I am about to, so it will be interesting to see.

I would say that in this case a simpler but extensible base module should be coded to replace the VSE. With this new base in place and functional for an animation pipeline, future development could lead to more modern features.

Thanks a lot for your insight!
I agree that performance is vital for a usable video editing workflow.
What I don't completely understand, though, is why including nodes in the VSE is an inherently bad idea. If there is a quick and intuitive way to manage proxies, what is the issue? Say you can render a proxy of any clip, even after it has been composed in the node editor.
I don't think it'd be a good idea to merge the compositor into the VSE the way they both are now, but if designed well, I believe it could turn out to be a very useful and unique way of editing and compositing video.

This was very interesting to read, I learned a lot! It was hard for me to imagine what such an implementation would look like in practice. So the VSE itself should stay basic and bare-bones in order to preserve real-time playback. Then how would you send your edited clips to color grading or compositing? There seems to be a lot of wisdom in your proposals, but it is hard for me to actually picture it :)