Premiere Pro CS5 with NVidia CUDA

While all of the Adobe applications have been updated in CS5, Premiere Pro is clearly the centerpiece of this release.  Adobe has been touting the Mercury Playback Engine for months, with new 64-bit code and additional GPU acceleration through NVidia's CUDA technology.  This acceleration allows highly compressed formats like AVCHD and H.264 to be played back seamlessly in the timeline and intercut with other formats without transcoding intermediate files or rendering previews.  Premiere Pro now supports native editing of a stunning number of acquisition formats, including HDV, AVCHD, XDCAM HD and XDCAM EX, DVCProHD and AVC-Intra files from P2 cards, RED R3D files, and my favorite: Canon H.264 DSLR footage.  It can also edit DNxHD and ProRes footage, for greater compatibility with Avid and Final Cut Pro.  DPX sequences are another significant addition to the natively supported formats, for both import and export.  This greatly enhances Premiere Pro's usefulness as a DI conform tool, especially when combined with native support for so many other source formats.

The greater memory footprint made available by 64-bit code allows larger projects to be loaded without swapping data into virtual memory, which causes a major performance hit.  This allows longer, more complex sequences and, more importantly, greater numbers of source clips to be imported without any noticeable decrease in system performance.  The one point where large projects still incur a penalty is load time, since regardless of how much RAM you have, more data has to be loaded into memory.  Even my projects with over 500 clips usually load within a minute, which is a vast improvement over previous versions.  Premiere has loaded media in the background, once the UI is available to the user, ever since the 4.2 update.  While it is nice to see your sequence on screen during that time, I wouldn't recommend trying to do any real work until all of the media is loaded, because you will usually see a significant decrease in both performance and stability while the system is busy linking to all of your media files.  Certain files load faster than others during this process, so load times may vary with the format of your source footage, regardless of your project's complexity.  Specifically, I have noticed that DSLR MOV files take longer to load when opening a project.

Speaking of DSLR files, Adobe has totally reinvented the way they are handled in CS5.  Most applications, including the CS3 and CS4 versions of both Premiere and After Effects, use QuickTime importers to access the content of Canon DSLR files.  This makes sense, since they are stored in an MOV wrapper, but it leads to two issues.  The first, specific to Adobe, is that on a PC, QuickTime files go through a few extra steps before the application can access them, so there is a performance hit, and with lots of files accessed at once, there are usually stability issues as well.  The other issue affects all applications that use QuickTime to access DSLR files: ever since QuickTime 7.6.2 was released, Canon DSLR files have been decoded into a much flatter, more washed-out color space than the one they were designed to be viewed in.  Prior to version 7.6.2, they were decoded in a way that clipped the highlights and shadows, which was even worse.  In CS5, Adobe worked with MainConcept to create an importer that reads the DSLR source files without involving QuickTime at all.  This alleviates both the performance hit on PC systems and the color space issues of QuickTime's default decoding.  A lot of work was put into getting the decode matrix and color space exactly right, based on the processing that Canon's hardware does to the file in the camera.  This should allow CS5 to decode the files more correctly than any other application that I am aware of, and give more options for color processing at later stages in the workflow pipeline, since more of the original color data is preserved.
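
To make the decode-matrix point concrete, here is a minimal CUDA sketch of the kind of YCbCr-to-RGB conversion that happens when H.264 frames are decoded.  The full-range BT.601 coefficients, the pre-upsampled 4:4:4 planar layout, and the frame size are my own illustrative assumptions, not Adobe's or Canon's actual internals; the point is simply that picking a different matrix or range shifts contrast and saturation in exactly the way described above.

```cuda
// Illustrative sketch only: decodes full-range BT.601 YCbCr to RGB on the GPU.
// The coefficients are a common textbook matrix chosen to show why the decode
// matrix matters; they are not Adobe's or Canon's actual internal values.
// Assumes chroma has already been upsampled to 4:4:4 for simplicity.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

__global__ void ycbcr_to_rgb(const unsigned char* y, const unsigned char* cb,
                             const unsigned char* cr, unsigned char* rgb,
                             int width, int height)
{
    int px = blockIdx.x * blockDim.x + threadIdx.x;
    int py = blockIdx.y * blockDim.y + threadIdx.y;
    if (px >= width || py >= height) return;

    int i = py * width + px;
    float Y  = (float)y[i];              // full range: 0-255, no 16-235 scaling
    float Cb = (float)cb[i] - 128.0f;
    float Cr = (float)cr[i] - 128.0f;

    float r = Y + 1.402f    * Cr;        // full-range BT.601 decode matrix
    float g = Y - 0.344136f * Cb - 0.714136f * Cr;
    float b = Y + 1.772f    * Cb;

    rgb[3 * i + 0] = (unsigned char)fminf(fmaxf(r, 0.0f), 255.0f);
    rgb[3 * i + 1] = (unsigned char)fminf(fmaxf(g, 0.0f), 255.0f);
    rgb[3 * i + 2] = (unsigned char)fminf(fmaxf(b, 0.0f), 255.0f);
}

int main()
{
    const int w = 1920, h = 1080;
    std::vector<unsigned char> hostY(w * h, 128), hostCb(w * h, 128),
                               hostCr(w * h, 128), hostRgb(w * h * 3);

    unsigned char *dY, *dCb, *dCr, *dRgb;
    cudaMalloc(&dY, w * h);  cudaMalloc(&dCb, w * h);
    cudaMalloc(&dCr, w * h); cudaMalloc(&dRgb, w * h * 3);
    cudaMemcpy(dY,  hostY.data(),  w * h, cudaMemcpyHostToDevice);
    cudaMemcpy(dCb, hostCb.data(), w * h, cudaMemcpyHostToDevice);
    cudaMemcpy(dCr, hostCr.data(), w * h, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);
    ycbcr_to_rgb<<<grid, block>>>(dY, dCb, dCr, dRgb, w, h);
    cudaMemcpy(hostRgb.data(), dRgb, w * h * 3, cudaMemcpyDeviceToHost);

    printf("first pixel RGB: %d %d %d\n", hostRgb[0], hostRgb[1], hostRgb[2]);
    cudaFree(dY); cudaFree(dCb); cudaFree(dCr); cudaFree(dRgb);
    return 0;
}
```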

The Mercury Playback Engine has a significant impact on the user experience, with the most frequently used tools available in real time.  Supposedly most of the decode and playback improvements come from the new code written for native 64-bit execution, with the GPU offload limited to effects processing.  While many editors don't use very many discrete effects in their work, there are some intrinsic playback functions, like scaling frame sizes and adapting frame rates, that are treated as effects and offloaded to the GPU.  This allows content of different frame rates and resolutions to be intercut on the timeline, and it is truly seamless.
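
As a rough illustration of why this kind of work suits the GPU so well, here is a CUDA sketch of a bilinear resize, the sort of per-pixel scaling involved when a clip's frame size doesn't match the sequence.  The single-channel 8-bit plane and the specific resolutions are my own simplifications; a real playback engine works on multi-channel, higher-precision frames, but the data-parallel structure is the same: every output pixel is independent, so thousands of GPU threads can compute them at once.

```cuda
// Illustrative sketch of the per-pixel work a GPU handles when scaling a
// frame: a bilinear resize of an 8-bit grayscale plane from 720p to 1080p.
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

__global__ void bilinear_resize(const unsigned char* src, int srcW, int srcH,
                                unsigned char* dst, int dstW, int dstH)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= dstW || y >= dstH) return;

    // Map the destination pixel back into source coordinates.
    float sx = (x + 0.5f) * srcW / dstW - 0.5f;
    float sy = (y + 0.5f) * srcH / dstH - 0.5f;
    int x0 = max(0, (int)floorf(sx)), y0 = max(0, (int)floorf(sy));
    int x1 = min(srcW - 1, x0 + 1),   y1 = min(srcH - 1, y0 + 1);
    float fx = fminf(fmaxf(sx - x0, 0.0f), 1.0f);
    float fy = fminf(fmaxf(sy - y0, 0.0f), 1.0f);

    // Weighted blend of the four neighboring source pixels.
    float top = src[y0 * srcW + x0] * (1 - fx) + src[y0 * srcW + x1] * fx;
    float bot = src[y1 * srcW + x0] * (1 - fx) + src[y1 * srcW + x1] * fx;
    dst[y * dstW + x] = (unsigned char)(top * (1 - fy) + bot * fy + 0.5f);
}

int main()
{
    const int srcW = 1280, srcH = 720, dstW = 1920, dstH = 1080;
    std::vector<unsigned char> hostSrc(srcW * srcH, 100), hostDst(dstW * dstH);

    unsigned char *dSrc, *dDst;
    cudaMalloc(&dSrc, srcW * srcH);
    cudaMalloc(&dDst, dstW * dstH);
    cudaMemcpy(dSrc, hostSrc.data(), srcW * srcH, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((dstW + block.x - 1) / block.x, (dstH + block.y - 1) / block.y);
    bilinear_resize<<<grid, block>>>(dSrc, srcW, srcH, dDst, dstW, dstH);
    cudaMemcpy(hostDst.data(), dDst, dstW * dstH, cudaMemcpyDeviceToHost);

    printf("upscaled pixel [0,0] = %d\n", hostDst[0]);
    cudaFree(dSrc); cudaFree(dDst);
    return 0;
}
```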

I have occasionally found myself editing in the wrong sequence frame rate without even noticing, since the software makes the conversion on the fly.  Even more frequently, a clip shot at the wrong frame rate on a tapeless camera has almost escaped detection, because the red render bars and playback glitches that used to give it away are gone.  This allows editors to import media from many different sources without prerendering everything to a normalized format.  I used to spend about a quarter of my time at work converting strange source footage into 1080p24 intermediate files, because any footage not matching the timeline format would cause previous versions of Premiere to glitch during playback and occasionally crash.  I would still recommend carefully converting any footage used in a final piece to the correct format for maximum control, but that step can now be put off until the online stage, since it is no longer required for playback and stability.  Since 90% of footage usually ends up on the cutting room floor, putting off these time-consuming conversions until after the creative edit is finished will drastically reduce the amount of footage you end up needing to process.

So this clear increase in performance leads to the question: how far can you push it?  A few months back I processed an ISO noise test in After Effects CS4 for Shane Hurlbut.  We were comparing the image noise produced at twelve different ISO levels on the Canon 5D, and the project involved twelve streams of video with masks, levels, and position adjustments for a tiled view.  I was getting about two frames per second when rendering previews, which seemed reasonable considering the amount of processing involved.  When I saw the list of GPU-accelerated effects in Premiere Pro CS5 and was asked to create a torture test for Adobe to show off at NAB, this jumped to mind.  I recreated the entire project using twelve layers of native DSLR footage, each layer with a motion effect and a four-point garbage matte to create the tiling, plus a color correction applied to exaggerate the noise to a clearly visible level.  The same basic setup that was getting 2fps in AE CS4 played back in real time in Premiere Pro CS5 (on dual Xeon X5365 CPUs and 16GB of RAM with a QuadroFX 4800).  Needless to say, I was quite impressed with the outcome, since the test was deliberately beyond Adobe's ten-layer playback claim and used a complex format to decode and play back.  Clearly GPU acceleration can have a dramatic impact on application performance.

There has been much discussion and debate on tech forums and blogs about the specifics of Premiere Pro's hardware support for CUDA acceleration.  Adobe has severely restricted the number of cards for which it officially supports CUDA-based GPU acceleration, to maintain control over the hardware environments on which its accelerated code is tested, supposedly for stability reasons.  The official list is limited to the QuadroFX 3800, 4800, and 5800, as well as the discontinued GeForce GTX 285, with certain limitations, for those on a lower budget.  There are currently no officially supported mobile GPUs, even though notebook CPUs are usually more in need of a performance boost than desktop chips.  This may be due to the fact that even the newest mobile QuadroFX 3800M is still based on the G92 core from the GeForce 8000 series, but I don't like seeing software artificially limited in regard to performance or hardware support, and this is an example of both.  Don't confuse legitimate limitations with artificial ones: clearly a powerful GPU is necessary for optimal performance in CS5, but there are cards of equal capability that are specifically excluded from the list, supposedly for stability reasons.  Luckily, Adobe has left an option for knowledgeable users to override some of those artificial limitations, and I anticipate seeing them dropped completely in a future update.  I expect a more reasonable requirement, of any NVidia card supporting CUDA 1.1 or 1.3 with at least 785MB of video memory, at some point in the future.
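
For those curious where their own card falls relative to that kind of requirement, the CUDA runtime API makes it easy to check a GPU's compute capability and video memory.  This small sketch simply reports what the hardware offers; the 1.1/1.3 and memory figures above are my own speculation about a future requirement, not anything Adobe actually checks for.

```cuda
// Reports each CUDA device's name, compute capability, and total video memory.
// Compare the output against whatever requirement you care about; nothing here
// reflects Adobe's actual qualification process.
#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA-capable device found.\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        double memMB = prop.totalGlobalMem / (1024.0 * 1024.0);
        printf("Device %d: %s, compute capability %d.%d, %.0f MB video memory\n",
               i, prop.name, prop.major, prop.minor, memMB);
    }
    return 0;
}
```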

While Premiere Pro CS5 is not perfect, it is a complete reversal from the fiasco that was CS4.  It clearly demonstrates the possibilities offered by GPU acceleration, and because it rests solidly on 64-bit, properly multithreaded code, it scales to take advantage of whatever hardware is made available to it.  Since Adobe has made a practice of introducing significant improvements in incremental dot releases, I am looking forward to seeing how else they refine it in the coming months.

FTC Disclosure: I have been on Adobe's beta team for many years, and Adobe provided me a copy of CS5 for this review.  NVidia has provided me with graphics hardware in the past, which I utilized in this review.  My only admitted personal bias is my preference for Windows over OS X, because I like full control over every aspect of my computing experience.  If for some reason that bothers anyone, there are plenty of other sources of information on the internet, but I try to provide unique insight into how each of these tools fits into the larger post-production picture.  Any relevant critique or response is welcome.
