Tag Archives: regular expressions

Automate VFX Sequence Titles

This tip comes by way of George McCarthy, who was our VFX Editor extraordinaire on Mission: Impossible 4. I also created an online EDL to SubCap Converter you can use in lieu of the more manual way described below.

If you’re on a show that has to turn over a sequence to a VFX house, you’ll likely need to export a Quicktime of that sequence with titles over every shot that will be a VFX shot. This reduces confusion between Editorial and the VFX vendor, and is useful not only to label each shot with the shot ID, but also because it’s not always obvious which shots are supposed to have work done to them. Makeup fixes, for example, wouldn’t be immediately obvious when scrubbing through a Quicktime, but if you title the shot then it’s easy for the VFX vendor to match your count sheets to a visual reference.

SubCap Example

Before Avid added Generator clips to the effect palette, you had two options for titling your sequence. One was to manually type in shot names, durations, etc., and save each title individually in a bin. The other was to attempt to use the Autotitler function that existed in Marquee, though almost immediately after Marquee came out, Avid broke the Autotitler in a software update and left it that way for years. In either case, you’d still be left with the task of manually cutting in titles over each shot, which is a tedious and error-prone task.

There are now multiple methods for doing this in a more automated fashion, including one I just learned about that uses the Timecode generator plugin over a subclip, but in this article I’ll look at using the SubCap effect and feeding it a text file converted automatically from an EDL.

Prepping an EDL

There are a couple reasons why an EDL is handy for generating a subtitle file. The first is that you can include locators in the EDL, so you can reuse the locators you’ve already created in your sequence that list each shot’s ID. The second is that by using an EDL instead of a straight locator export, you can get timecode ins and outs for the shot, so that the subtitle is the appropriate length. From this you can also calculate the duration if you’d like to include that in your title.
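As a rough illustration of the duration calculation mentioned above, here is a minimal Python sketch. Python itself, the function names, and the 24 fps non-drop-frame default are all assumptions for the example rather than anything prescribed by Avid or the EDL. It also assumes the record out timecode is exclusive, as is usual in a CMX3600 EDL.

```python
def timecode_to_frames(timecode: str, fps: int = 24) -> int:
    """Convert a non-drop-frame HH:MM:SS:FF timecode to a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def duration_in_frames(tc_in: str, tc_out: str, fps: int = 24) -> int:
    """Frames between a record in and a record out (out point treated as exclusive)."""
    return timecode_to_frames(tc_out, fps) - timecode_to_frames(tc_in, fps)


print(duration_in_frames("04:00:00:00", "04:00:08:00"))  # 192 frames at 24 fps
```

Drop-frame timecode needs extra handling that this sketch deliberately leaves out.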

So the first step is to make sure your sequence is ready: you’ve run the Commit Multicam Edits command, and all of your VFX shots have a locator somewhere on them that lists the correct shot ID. Save your bin and open the sequence in EDL Manager. You do not have to lift out non-VFX shots from your sequence, but you do need to make sure that Locators are turned on in the EDL settings, and that your EDL type is CMX3600.

Make sure Locators are enabled in the EDL settings

 

Export an EDL from the video layer where your VFX locators exist, and then either use the converter I created to do it for you, or bring the EDL into a text editor that allows regular expression Find & Replace (such as TextMate or jEdit). This is the regular expression I use to grab all the right bits from the EDL:

\d{3}[^\n]*([0-9:]{11})\s([0-9:]{11})\s?\n(?!\d{3})(?:.*\r?\n(?!\d{3}))*?\* LOC: [\d:]{11}\s(\w+)[^\S\n]+([^\r\n]+)\r?\n?
Looks scary, I know. But that one line of gibberish looks for any series of lines in an EDL that include an EDL event followed by a Locator comment. When it finds one, it saves four things: the sequence timecode in ($1) and out ($2), and the color ($3) and text ($4) from the locator. With all that information saved, you could choose to handle one color of locator differently from another, or to calculate a duration from the timecodes. Using backreferences, you could fill in your Replace field with $1 $2\n$4\n\n, for example, and that would give you the format you need for a SubCap file. This RegEx won’t get rid of the non-VFX EDL events that you want to ignore, so you’d have to either go through and remove those lines manually or write a RegEx that negates the one above. Don’t forget to add the opening and closing tags, too. A small sample of the final product of a SubCap file looks like this:
<begin subtitles>

04:00:00:00 04:00:08:00
CS0010 (FORMERLY CS1000)

04:00:42:22 04:00:51:00
CS0020

<end subtitles>
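For anyone who would rather script the conversion than run it in a text editor, here is a minimal Python sketch built around the regular expression above. It is not the code behind the online converter. The function name and the sample EDL (clip names, timecodes, locator colors) are made up for illustration, and the layout of the locator comment lines is an assumption based on what the expression expects. Because the script collects only the matches instead of doing a find and replace, events without a locator simply drop out.

```python
import re

# The article's regular expression, unchanged. Groups: 1 = record in,
# 2 = record out, 3 = locator color, 4 = locator text.
EVENT_WITH_LOCATOR = re.compile(
    r"\d{3}[^\n]*([0-9:]{11})\s([0-9:]{11})\s?\n(?!\d{3})"
    r"(?:.*\r?\n(?!\d{3}))*?"
    r"\* LOC: [\d:]{11}\s(\w+)[^\S\n]+([^\r\n]+)\r?\n?"
)


def edl_to_subcap(edl_text: str) -> str:
    """Build SubCap caption text from every EDL event that carries a locator."""
    blocks = []
    for match in EVENT_WITH_LOCATOR.finditer(edl_text):
        tc_in, tc_out, _color, text = match.groups()
        blocks.append(f"{tc_in} {tc_out}\n{text.strip()}\n")
    return "<begin subtitles>\n\n" + "\n".join(blocks) + "\n<end subtitles>\n"


# Hypothetical EDL: event 002 has no locator, so it is skipped.
SAMPLE_EDL = """TITLE: VFX_TURNOVER
FCM: NON-DROP FRAME

001  A001C003 V     C        01:00:10:00 01:00:18:00 04:00:00:00 04:00:08:00
* FROM CLIP NAME: A001C003
* LOC: 04:00:02:00 RED     CS0010 (FORMERLY CS1000)
002  A002C001 V     C        02:10:00:00 02:10:05:00 04:00:08:00 04:00:13:00
* FROM CLIP NAME: A002C001
003  A003C007 V     C        03:00:00:00 03:00:08:02 04:00:42:22 04:00:51:00
* LOC: 04:00:45:00 RED     CS0020
"""

print(edl_to_subcap(SAMPLE_EDL))
```

Run on the hypothetical EDL, this prints the same two captions as the sample above, wrapped in the opening and closing tags.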

Importing into the SubCap effect

Once you’ve got your SubCap text file, throw a SubCap effect on an empty video layer and go to Import Caption Data to bring your titles in. Make your adjustments for appearance (make sure to check out the Global Properties pane as well), and optionally you can save a stylesheet for the future so you only have to make those adjustments once.

SubCap Effect Panel

 

Check Your Work!

This is the last step, and it’s very important. Just because the process is automated doesn’t mean that there wasn’t an error, or that your source EDL was perfect. Check your sequence to make sure it has everything it’s supposed to and nothing extraneous. Even on small shows there can be a lot of hands in the locator jar, and you might find an errant locator buried in a nested clip, or a missed two-cut shot that got separated from its locator. If you need to add a title, it’s easy to do so from the SubCap effect editor.

Timeline with SubCap-Imported Titles

Converting Avid Locators to DVD Studio Pro Chapter Markers

I have two methods for making DVDs for directors, depending on how quickly the DVD is needed and what quality level or features are required. For faster but lower-quality DVDs, I just play out to a standalone DVD Recorder. For higher-quality DVDs I usually export a Quicktime, run it through Compressor, and ultimately bring it into DVD Studio Pro. When I go the DVD Studio Pro route, I also like to take the opportunity to appropriately chapter my DVDs. It’s a pretty quick and easy thing to do, and provides a more meaningful set of chapter points than the automated ones you get on a standalone recorder. My favorite method for quickly chaptering a Quicktime uses Avid locators to provide Compressor and DVD Studio Pro with my chapter points. Below is the process I use, or for the shortcut just use the online tool I created.

TOOL: Locator to Chapter Marker Converter

Converting Avid Locators to Compressor/DVDSP Chapter Markers

In order to convert Avid locators to chapter markers, a little knowledge of regular expressions is required. You also need an advanced text editor that can handle regular expressions (I use the free and multi-platform jEdit). You do not need to add locators to your sequence before you export it. Exporting your sequence and making locators suitable for chaptering can be done in either order. You do need to have exported and processed your locator file before you go into Compressor, though.

Adding Locators to Your Sequence

To prepare your sequence, change the sequence Start time to 00:00:00:00. I normally clear out all the other locators in my sequence as well, and since I always copy a sequence to a new bin before I export it, removing the locators isn’t an issue for me. If you’d like to keep your existing locators, then just add a new video track and put all of your chapter locators there.

Once your pre-existing locators are cleared and/or you’ve got yourself a new video track to work with, go ahead and add a locator at every point you’d like to make into a new chapter on your DVD. Once all your locators are added, go to the Locators window and export them to a text file. If you’ve chosen to add your locators to a new track while keeping existing locators on other tracks, then sort the Locators window by track, select all the locators you just added, and export only those Selected locators to a text file.

Exporting Avid Locators

Converting the Locators Text File to a Chapter Marker File

Once you’ve got the text file with your exported locators, it’s time to open it in your advanced text editor. An exported locator file initially looks like this:

Evan Schiff    00:02:18:09    V4    white    Chapter1
Evan Schiff    00:03:11:11    V4    white    Chapter2
Evan Schiff    00:04:57:18    V4    white    Chapter3
Evan Schiff    00:06:41:16    V4    white    Chapter4
Evan Schiff    00:07:37:06    V4    white    Chapter5
Evan Schiff    00:08:29:05    V4    white    Chapter6
Evan Schiff    00:12:33:04    V4    white    Chapter7
Evan Schiff    00:13:33:23    V4    white    Chapter8
Evan Schiff    00:15:19:11    V4    white    Chapter9
Evan Schiff    00:16:14:15    V4    white    Chapter10
Evan Schiff    00:17:44:20    V4    white    Chapter11

Using a regular expression search/replace, you can instantly reformat your locator file to conform to Compressor and DVDSP’s required marker file format, which looks like the text below:

00:02:18:09 Chapter1
00:03:11:11 Chapter2
00:04:57:18 Chapter3
00:06:41:16 Chapter4
00:07:37:06 Chapter5
00:08:29:05 Chapter6
00:12:33:04 Chapter7
00:13:33:23 Chapter8
00:15:19:11 Chapter9
00:16:14:15 Chapter10
00:17:44:20 Chapter11

If you don’t label your locators, all you’ll see in the marker file is timecode. This is fine: the only required data is the timecode, and DVDSP will automatically add chapter labels if you don’t supply them.

To reformat the file, you would use the following regular expression:

^[\w\s]*(\d{2}\:\d{2}\:\d{2}\:\d{2})\tV\d\t\w+\t(.*)$

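Your text editor’s dialog box would look something like the image below, with the regular expression above in the Find field and $1 $2 (the timecode, a space, then the label) in the Replace field. I usually do a couple Replace & Find clicks just to make sure the pattern is working before I hit Replace All and save the file.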

Regular Expression Search/Replace

N.B. In the Replace dialog box, your text editor may use either $1 or \1 to make back-references.
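The same conversion can be scripted. Below is a minimal Python sketch that applies the regular expression above one line at a time and writes out timecode, a space, and the label, which is the marker format shown earlier. The function name and sample lines are illustrative assumptions, and it assumes the exported locator file is tab-delimited, as the \t in the expression implies. This is not the code behind the online tool.

```python
import re

# The article's expression: skip the user name, capture the timecode,
# skip the track (V plus one digit) and color columns, capture the label.
LOCATOR_LINE = re.compile(r"^[\w\s]*(\d{2}\:\d{2}\:\d{2}\:\d{2})\tV\d\t\w+\t(.*)$")


def locators_to_chapters(locator_text: str) -> str:
    """Convert exported Avid locator lines to 'timecode label' marker lines."""
    chapters = []
    for line in locator_text.splitlines():
        match = LOCATOR_LINE.match(line)
        if match:  # lines that do not look like locators are skipped
            timecode, label = match.groups()
            chapters.append(f"{timecode} {label}".rstrip())
    return "\n".join(chapters) + "\n"


SAMPLE_LOCATORS = (
    "Evan Schiff\t00:02:18:09\tV4\twhite\tChapter1\n"
    "Evan Schiff\t00:03:11:11\tV4\twhite\tChapter2\n"
    "Evan Schiff\t00:04:57:18\tV4\twhite\t\n"  # unlabeled locator
)

print(locators_to_chapters(SAMPLE_LOCATORS))
```

Python writes back-references as \1 and \2 rather than $1 and $2, but this sketch sidesteps that by reading the captured groups directly.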

Importing Markers into Compressor

After you’ve saved your new marker file, open up Compressor and load your Quicktime into it. Make sure the Quicktime is selected in the main window and displaying in the Preview window. To the right of the Preview window’s timeline, you’ll see the marker button, under which you can Import Chapter List. Select your file, and if everything’s been done correctly, you should see markers pop up in the appropriate places in your movie.

Compressor Menu to Import Markers

Successfully Imported Markers

Once your markers and DVD export settings are ready to go, Submit the job. Once the render finishes, you can import it into your DVDSP template file, and as soon as you lay it down onto a track you’ll see that the markers/chapter points appear automatically.

Why can’t I just add chapter points directly in DVD Studio Pro?

You can, but there are several limitations you’ll be imposing upon yourself. First and foremost, if you make your m2v without adding the markers first, you will not always be able to add a chapter point at the exact frame you want to. To get technical for a minute, when you encode an m2v file the frames of your source Quicktime are compressed into three types of mpeg frames. These are I-frames, P-frames, and B-frames (read your Compressor or DVDSP User Guide for more detail). You can only place a chapter marker on an I-frame, and if you don’t tell the encoder where you want an I-frame to be, it will put them wherever it sees fit, and this may or may not be on the frame that you want to make into a chapter point.

Second, the Compressor and DVDSP timelines are not nearly as precise as Avid’s, and don’t give you the visual identification of where clips start and end. I find it much faster to put accurate chapter points on the Avid timeline than in either Compressor or DVDSP.

You can put your chapter points wherever you like in DVDSP if you import the Quicktime directly and let DVDSP make your m2v file, but then you lose all the advantages of using Compressor, not to mention still having to deal with DVDSP’s clumsy timeline.

An Introduction to Regular Expressions

Regular Expressions are one techie level up from your traditional tech tip, but they’re definitely worth the time to learn, even if you only learn the basics. A regular expression is very much like a math formula: you use one when you want to find (and replace) pieces of text that fit a pattern, rather than text you know exactly in advance.

Conceptual Examples

For example, let’s say you have a file that contains a bunch of phone numbers. And let’s say those phone numbers are all written out as “8005551212”, but you want them to look like “(800) 555-1212.” Using a text editor that supports a regular expression Find & Replace, you could easily reformat all of those phone numbers to include parentheses and a dash, without going row by row to manually change them all. Since you know that your phone number is a string of 10 continuous digits, you can tell your text editor to find all instances of 10 numbers in a row, and to insert a ‘(‘ before the first digit, a ‘) ‘ after the third, and a ‘-‘ after the sixth.
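Here is one way that phone number reformat could look in Python, as a small sketch rather than the only approach. The function name is made up for the example. The parentheses in the pattern save the three groups of digits so they can be reinserted, and the \b on either end keeps the pattern from matching part of a longer run of digits.

```python
import re


def format_phone_numbers(text: str) -> str:
    """Rewrite every standalone 10-digit run as (XXX) XXX-XXXX."""
    return re.sub(r"\b(\d{3})(\d{3})(\d{4})\b", r"(\1) \2-\3", text)


print(format_phone_numbers("8005551212"))  # (800) 555-1212
```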

A second example is as follows. You have a database full of VFX shot names and shot durations. You also have a sequence full of VFX shots you need to turn over, and every one of them needs a title added to it stating the shot name and how many frames it is. You can export the information you need from your database, but only as a comma-separated values file (.csv). This would give you results such as:

VFX_001,25
VFX_002,38
VFX_003,119
VFX_004,350
VFX_005,8

To create all those titles, you know that you can use the Autotitler function in Avid Marquee, but the text format it requires is different from CSV, resembling something like this:

VFX_001
25

VFX_002
38

VFX_003
119

VFX_004
350

VFX_005
8

So to reformat your CSV file into the format that Marquee requires, you can tell your text editor to replace every comma in your file with a line break, and to turn every pre-existing line break into a double line break. You can do this with one regular expression that both replaces the comma and adds a second line break, but I sometimes like to break it up into separate steps to keep things simple.

To demonstrate how to do this find/replace, I’m going to double the line break before I replace the comma. This way I can be sure that I don’t add more line breaks than I need. So the first find and replace would look like this:

Find: \n
Replace: \n\n

In regular expressions, “\n” is the notation you can use for line breaks (sometimes “\r” is also used, either instead of or in conjunction with “\n”, but you can google the difference on your own). So what this find/replace does is search for a line break and replace it with two. Then, you can probably guess what to do with the commas:

Find: ,
Replace: \n

This will give you the format you need for the Avid Autotitler.
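The same two steps, in the same order, can be sketched in Python. The function name is an assumption for the example, and it expects the CSV text to use plain \n line breaks.

```python
import re


def csv_to_autotitler(csv_text: str) -> str:
    """Double every line break first, then turn each comma into a line break."""
    doubled = re.sub(r"\n", "\n\n", csv_text)
    return re.sub(r",", "\n", doubled)


print(csv_to_autotitler("VFX_001,25\nVFX_002,38\n"))
```

Doing the line breaks first matters: if the commas were replaced first, the second step would also double the line breaks that used to be commas.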

Lastly, you can also use Regular Expressions in many file renaming utilities (NameMangler is one I use), so if you need to rename a bunch of files in order to conform to a certain pattern, regular expressions can help. One instance where you might use this would be to conform a bunch of irregularly named files in order to put them in sequence for import into an Avid bin.
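To give a flavor of the renaming case, here is a small Python sketch that zero-pads the number just before a file extension so the names sort in sequence. The file names and the function name are hypothetical. It only computes the new names and does not touch any files on disk.

```python
import re


def pad_frame_number(filename: str, width: int = 4) -> str:
    """Zero-pad the digits that sit directly before the file extension."""
    return re.sub(r"(\d+)(?=\.\w+$)", lambda m: m.group(1).zfill(width), filename)


names = ["shot_10.dpx", "shot_9.dpx", "shot_100.dpx"]
print(sorted(pad_frame_number(name) for name in names))
# ['shot_0009.dpx', 'shot_0010.dpx', 'shot_0100.dpx']
```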

Regular Expression “Variables”

What the example above is intended to demonstrate is the concept of searching for a pattern of text, rather than knowing what text you’re searching for in advance.  And in order to search for patterns, you must be able to use placeholders to represent certain characters or groups of characters.

This Regular Expression Reference lists the different placeholders you can use when searching text. The ones you’ll use most often are:

  • \d : Matches any single digit (i.e. 0-9)
  • \w : Matches any single word character, meaning a letter, a digit, or an underscore, but not a space. To match a whole word, follow it with a + (see below)
  • \s : Matches any single whitespace character, including a space, tab, or line break
  • \t : Matches a tab character
  • [ and ] : If you wish to limit the characters you’re searching for, put those characters inside of [ and ]. So for example, [A-Za-z5-8] would match any single letter from A-Z regardless of uppercase or lowercase, as well as any digit from 5 to 8

You will often need to specify how many characters you’re searching for, in which case you’ll need these basic placeholders:

  • ?  : A question mark after a character or character class denotes that you are looking for 0 or 1 instance of that character
  • *  :  An asterisk denotes you are looking for 0 or more of that character
  • +  : A plus sign denotes you are looking for 1 or more of that character
  • { and }  : These braces allow you to say exactly how many characters you want to match. For example, “\d{2}” tells the program you’re searching for a string of exactly two digits. “\d{2,8}” tells the program you’re searching for between 2 and 8 digits, and “\d{2,}” specifies that you’re searching for at least 2 digits.

And lastly, you’ve seen the backslash (\) used a lot here, and that’s worth explaining. In regular expressions, the backslash functions as what’s called an escape character. The rules of regular expressions are a bit complex, and many characters you may want to search for have functional meanings, like the fact that an asterisk (*) tells the program to match 0 or more characters. If you want to search for an asterisk, though, you may need to escape it. And you do that by putting a backslash before the asterisk, like so: \* .  By using the backslash, you are either telling the program to ignore the special meaning that a particular character has, or to match a character that is not easily defined (like \t, which represents a tab character).
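A few quick experiments make these placeholders, quantifiers, and the escape character easier to see. The snippet below uses Python’s re module purely as a convenient test bench, and the variable names and sample strings are made up. Other regex engines should behave the same way for these basic patterns.

```python
import re

# \w+ : one or more word characters, so whole "words" including underscores
words = re.findall(r"\w+", "VFX_001 is 25 frames")
print(words)  # ['VFX_001', 'is', '25', 'frames']

# \d{2,} : runs of at least two digits
long_numbers = re.findall(r"\d{2,}", "8 25 119 350")
print(long_numbers)  # ['25', '119', '350']

# [A-Za-z5-8] : any single letter, or a digit from 5 to 8
limited = re.findall(r"[A-Za-z5-8]", "a1B5c9")
print(limited)  # ['a', 'B', '5', 'c']

# u? : zero or one "u"
spellings = re.findall(r"colou?r", "color colour")
print(spellings)  # ['color', 'colour']

# \s+ : one or more whitespace characters (space, tab, line break)
pieces = re.split(r"\s+", "one \t two\nthree")
print(pieces)  # ['one', 'two', 'three']

# \* : an escaped asterisk matches a literal asterisk
stars = re.findall(r"\*", "* LOC: 04:00:02:00")
print(stars)  # ['*']
```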

Back References

The last concept I want to explain can be tricky to get your head around while you’re still digesting everything else, but it’s a very useful thing to know, and is called a back reference. Let’s take the timecode example below. In this situation, you have a bunch of timecodes without colons (:) separating the hours, minutes, seconds, and frames (i.e. 01020304). You want to insert the colons, but you need a way to tell the program not to throw out the digits that make up the timecodes when replacing the timecode text. So to do that, you have to save those digits during the Find part of the process for use during the Replace part. You do this by enclosing the text you want to save in parentheses, like so: (\d{2})

Then, in your Replace expression, you can tell the program to insert the text it’s saved by including $1, $2, $3, and $4. The first parentheses in your Find expression are referenced by $1, the second by $2, and so on. And when replacing the timecodes, if I put a set of parentheses around every 2 digits, that will allow me to then insert colons between those pairs of digits, thus giving me properly formatted timecode in the form of 01:02:03:04.

Examples

The easiest way, I think, to grasp how you use all these placeholders is to show you some examples, some of which come from this Regular Expressions site.

Email Address:

This will match most email addresses,

\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}\b

and is broken down like this:

  1. \b matches a word boundary, meaning the position between a word character and a non-word character such as a space or the start of a line. It doesn’t consume any text itself.
  2. [A-Za-z0-9._%+-]+ matches any alphanumeric character, regardless of case, as well as the punctuation also enclosed within the brackets. The + sign at the end states that you are looking for 1 or more characters that match this pattern, since most email addresses are more than one character long.
  3. @ simply matches the @ sign in an email address
  4. [A-Za-z0-9.-]+ will match the server name in your email address (i.e. it will match the “gmail” in “me@gmail.com”)
  5. \. will match the dot between your server name and your top-level domain (i.e. it will match the “.” in “me@gmail.com”)
  6. [A-Za-z]{2,4} will match the com, org, net, info, or any other top-level domain that is 2-4 letters long. Longer top-level domains won’t match unless you raise the upper limit.
  7. \b again matches a word boundary, presumably before a space, punctuation, or a line break
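To try the email pattern broken down above, here is a short Python sketch. The function name and the addresses are made up for the example. It also shows one limitation: a top-level domain longer than four letters is not matched.

```python
import re

EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}\b")


def find_emails(text: str) -> list[str]:
    """Return every substring that the email pattern matches."""
    return EMAIL.findall(text)


print(find_emails("Contact me@gmail.com or you@example.org."))
# ['me@gmail.com', 'you@example.org']

print(find_emails("curator@gallery.museum"))
# [] because "museum" is longer than the 2-4 letters the pattern allows
```

Note that the trailing period in the first sentence is left out of the second match, which is the word boundary at work.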

Timecode:

This will match timecode, which I’ve used in the past to reformat a subtitle file from an Excel-exported CSV into a DVD Studio Pro formatted .stl file. Below is my source file, which as you see is missing the “:” in all of the timecodes. The .stl file requirements ask that there be a space on either side of the commas separating the TC and subtitle text, which I did in Excel by concatenating several cells into one column with the appropriate comma spacing.

Source CSV File:
01052021 , 01052328 , …but now I wonder if it is just the fear talking.
01052419 , 01052800 , I'd like to say I'm the son of a famous person.
01052812 , 01053106 , Or at least someone who is politically affiliated.
01053128 , 01053406 , But that is not the truth.
01053427 , 01053619 , So the only reason I can  think of…
01053800 , 01053922 , …is money.
01055425 , 01055914 , -Be reasonable. | -Don't you understand we make the rules here?
01060003 , 01060209 , We will give you exactly what you want.
01060210 , 01060517 , -Make sure of it. | -It has been 19 days!
01060518 , 01060625 , That doesn't matter anymore.
01060626 , 01060815 , It does matter.
01061309 , 01061603 , I already gave you till noon.
01062605 , 01062800 , Don't do this.
01063622 , 01063729 , Wait.

To find and replace the timecodes, I would use these patterns:

Find: (\d{2})(\d{2})(\d{2})(\d{2})
Replace: $1:$2:$3:$4

Breakdown:

  1. (\d{2}) searches for strings of 2 digits, and the fact that the \d{2} is within parentheses means that the program will save the two digits it finds so that I can reinsert them when replacing the text. Since I know my timecode is 8 digits long, I put four of these statements in a row so that I can keep the hours, minutes, seconds, and frames separate.
  2. $1:$2:$3:$4 replaces every 8-digit timecode string the program finds with the first two digits it saved from the parentheses, followed by a colon, followed by the second pair of digits, then a colon, etc. This is a back reference, as mentioned above.
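Here is the same find and replace as a minimal Python sketch. The function name is an assumption for the example, and Python writes the back references as \1 through \4 instead of $1 through $4. Note that the pattern would also rewrite any run of 8 digits inside the subtitle text itself, so it is worth checking the result.

```python
import re


def add_timecode_colons(line: str) -> str:
    """Insert colons into every run of 8 digits: 01020304 becomes 01:02:03:04."""
    return re.sub(r"(\d{2})(\d{2})(\d{2})(\d{2})", r"\1:\2:\3:\4", line)


source = "01060210 , 01060517 , -Make sure of it. | -It has been 19 days!"
print(add_timecode_colons(source))
# 01:06:02:10 , 01:06:05:17 , -Make sure of it. | -It has been 19 days!
```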

Conclusion

I’ll add to this article as new examples and uses arise, but hopefully this and Google will get you started on figuring out all the different ways you can use Regular Expressions. If you’re confused about when to use them, just stop yourself when you find that you’re in a position of having to make a bunch of tedious edits to a text file. It may be that you can save yourself a lot of time and typing by using a Regular Expression Find/Replace.

Additional References:

PHP: Regular Expression Details