Shell Renamer

6 Aug 2002
Shell Renamer is a shell extension that supports batch renaming with regular expression search and replace, and filename swapping.

ShSwap image

Purpose

Remember the good ol' days of DOS, when all file operations were done in the console? I still remember those days. Well, I got sick and tired of opening command prompts to do batch renaming, so I decided to write a shell extension that would do it for me. The extension has two features. The first and foremost is batch renaming: you select the files you would like to rename in Explorer, right-click and select RegExp rename. The second is swapping filenames: if you have exactly two files selected, you can swap their names. Now, how do regular expressions help us rename?

For those unfamiliar, regular expressions have been used for years in UNIX land. They are a powerful tool for pattern matching. You may not know, however, that during a pattern match you can save parts of what you matched. This is how my shell extension operates. You specify a regular expression that matches the filenames and captures the pieces you care about (the name, the extension, etc.), and you specify an output format that is used to build the new names. Sound confusing? Well, in a way it is, but here are some examples to help show its usefulness.

Usage

The shell extension allows batch renaming if two or more files are selected in Explorer. If exactly two files are selected, you can also swap their filenames.
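Swapping two names boils down to three renames through a temporary name. Here is a minimal sketch of that idea, not the extension's actual code: it assumes C++17 std::filesystem in place of MoveFile, and swapFilenames and the .swapN temporary suffix are made-up names for illustration.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: exchange the names of two existing files by moving
// the first one to a temporary name next to it.
// Returns false (after trying to roll back) if any step fails.
bool swapFilenames(const fs::path& a, const fs::path& b)
{
    std::error_code ec;
    if (!fs::exists(a, ec) || !fs::exists(b, ec))
        return false;

    // Pick a temporary name that does not exist yet (assumed naming scheme).
    fs::path temp;
    for (int n = 0; ; ++n) {
        temp = a;
        temp += ".swap" + std::to_string(n);
        if (!fs::exists(temp, ec))
            break;
    }

    fs::rename(a, temp, ec);            // step 1: a -> temp
    if (ec)
        return false;

    fs::rename(b, a, ec);               // step 2: b -> a
    if (ec) {
        std::error_code undo;
        fs::rename(temp, a, undo);      // roll back step 1
        return false;
    }

    fs::rename(temp, b, ec);            // step 3: temp -> b
    if (ec) {
        std::error_code undo;
        fs::rename(a, b, undo);         // roll back step 2
        fs::rename(temp, a, undo);      // roll back step 1
        return false;
    }
    return true;
}
```

The temporary name is needed because neither file can take the other's name while that name is still in use.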

The pattern matching is based on egrep's pattern matching. Since I am using the GNU regular expression library, the syntax can be changed programmatically: there are options to mimic sed, awk, grep, egrep, etc. The shell extension could be expanded to let the user choose which program to mimic.

Describing how regular expression pattern matching works is beyond the scope of this article. I will just assume you know :) If you don't know, then there is plenty of documentation online and I also suggest O'Reilly's UNIX in a Nutshell as a place to start.

Basically, the shell extension works much like Emacs's regular expression search and replace. You match your string (in our case, a filename) and supply a replacement pattern that can include \1 through \9. Each \N stands for whatever the Nth parenthesized group in the regular expression matched. For example, with (.*)\.txt as the regular expression, \1 holds the filename without its extension, so the replacement \1.txt reproduces the original name and \1.bak changes the extension. There is an option to make the search case insensitive. Here are some sample uses.

| Regular expression to match | Replacement format | Effect |
|---|---|---|
| `(.*)\.txt` | `prefix\1.txt` | Renames `*.txt` to `prefix*.txt` |
| `(.*)\.txt` | `\1suffix.txt` | Renames `*.txt` to `*suffix.txt` |
| `CSC200(.*)` | `CSc200\1` | Renames `CSC200*` to `CSc200*` |
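To see what each \N holds for patterns like the ones in the table, here is a small sketch. It is only an approximation: it uses the standard library's std::regex with POSIX extended syntax rather than the GNU library the extension links against, and captureGroups is a hypothetical helper name.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text captured by each group
// (index 0 is the whole match), or an empty vector if the pattern
// is invalid or does not match the file name.
std::vector<std::string> captureGroups(const std::string& pattern,
                                       const std::string& fileName,
                                       bool ignoreCase = false)
{
    std::regex::flag_type flags = std::regex::extended;  // POSIX extended syntax
    if (ignoreCase)
        flags |= std::regex::icase;

    std::vector<std::string> groups;
    try {
        std::regex re(pattern, flags);
        std::smatch m;
        // regex_search, like regexec, accepts a match anywhere in the name
        if (std::regex_search(fileName, m, re))
            for (const auto& sub : m)
                groups.push_back(sub.str());
    } catch (const std::regex_error&) {
        // invalid pattern: treat it as "no match"
    }
    return groups;
}
```

For example, captureGroups("(.*)\\.txt", "notes.txt") yields "notes.txt" for \0 and "notes" for \1, which is the piece the replacement format reuses.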

I primarily use this tool when I am working with MP3s. If I download a live show and I don't like the way the files are named, I can rename them all at once. I also use it to change the case of filenames. Another use is converting a group of .c files into .cpp files. The possibilities are endless!

To use the utility, first register the extension using regsvr32. Then from within Explorer, select the files you want to rename. Right-click, choose RegExp rename or Swap Filenames. When using RegExp rename, a dialog box will appear where you can fill in the regular expression pattern and the replacement format. When you change these fields, the results of your changes will be shown. That way you can see how the files will be renamed before actually renaming them. Each file can be checked or unchecked. An unchecked file will not be renamed. Hit OK to rename all the files. Once you have hit OK, you will see the status of each file's renaming. A column will be added to the listview which shows the result - either success or the failure reason. A new button will appear - Undo. If you don't like the changes, or some of the results failed, you can Undo. Click OK a second time to exit.
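The rename, status and Undo steps described above can be sketched roughly as follows. This is not the dialog's real code: it assumes C++17 std::filesystem instead of MoveFile, and RenameItem, applyRenames and undoRenames are hypothetical names.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical record for one row of the dialog's list.
struct RenameItem {
    fs::path dir;            // directory the file lives in
    std::string oldName;     // current file name
    std::string newName;     // name produced by the replacement format
    bool checked = true;     // unchecked items are skipped
    bool renamed = false;    // set once the rename succeeded
    std::string status;      // "success" or the failure reason
};

// Rename every checked item and record the outcome per file.
void applyRenames(std::vector<RenameItem>& items)
{
    for (auto& item : items) {
        if (!item.checked || item.newName.empty()) {
            item.status = "skipped";
            continue;
        }
        std::error_code ec;
        // Refuse to overwrite an existing file, as MoveFile does.
        if (fs::exists(item.dir / item.newName, ec)) {
            item.status = "target already exists";
            continue;
        }
        fs::rename(item.dir / item.oldName, item.dir / item.newName, ec);
        item.renamed = !ec;
        item.status = ec ? ec.message() : "success";
    }
}

// Undo: move every successfully renamed file back to its old name,
// in reverse order so that chained renames unwind correctly.
void undoRenames(std::vector<RenameItem>& items)
{
    for (auto it = items.rbegin(); it != items.rend(); ++it) {
        if (!it->renamed)
            continue;
        std::error_code ec;
        fs::rename(it->dir / it->newName, it->dir / it->oldName, ec);
        if (!ec) {
            it->renamed = false;
            it->status = "undone";
        } else {
            it->status = ec.message();
        }
    }
}
```

Keeping the old name, the new name and a renamed flag per file is what makes Undo cheap: only the files that were actually renamed are moved back.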

Code

The project is an ATL project that exposes one class, CISHRename. This class implements IShellExtInit and IContextMenu. The earlier code used STL to store the filenames in a vector, but I have changed that to a simple C++ array. The code for Initialize just creates and populates this array with the filenames. It stores the name and the directory of each file separately. It needs to know which directory the file is in, because the regular expression matching works on only the filename, not the complete path. (For instance, ^d.*\.txt would match document.txt, but not c:\documents\document.txt)
HRESULT CISHRename::Initialize(LPCITEMIDLIST pidlFolder,
                               LPDATAOBJECT pDO,HKEY progID)
{
  HRESULT retVal=S_OK;
  char fileName[MAX_PATH];
  UINT iCount;
  UINT i;
  LPTSTR pEndingSlash;

  if (pDO == NULL)
    return E_INVALIDARG;

  // get all the files into our data structures
  STGMEDIUM med;
  FORMATETC fe={CF_HDROP,NULL,DVASPECT_CONTENT,-1,TYMED_HGLOBAL};
  if (FAILED(pDO->GetData(&fe,&med)))
    return E_INVALIDARG;      // no file list, nothing to rename

  // get count so i can size my array
  iCount=DragQueryFile((HDROP)med.hGlobal,0xFFFFFFFF,NULL,0);

  // we have to have 2+ things
  if (iCount < 2)
  {
    ReleaseStgMedium(&med);   // don't leak the medium on the early return
    return E_INVALIDARG;
  }

  m_files = new RENAME[iCount];
  if (m_files)
  {
    m_count = iCount;
    // go through all files, add them
    for (i=0;i<iCount;++i)
    {
      // start with an empty entry that is not renamed, in case the query fails
      m_files[i].dir[0] = 0;
      m_files[i].file[0] = 0;
      m_files[i].renamedFile[0] = 0;
      m_files[i].bRename = false;

      if (DragQueryFile((HDROP)med.hGlobal,i,fileName,MAX_PATH) != 0)
      {
        // parse out directory name/file name and store them separately
        pEndingSlash = _tcsrchr(fileName,'\\');
        if (!pEndingSlash)
        {
          lstrcpy(m_files[i].file,fileName);
        }
        else
        {
          *pEndingSlash = 0; // break it up
          lstrcpy(m_files[i].dir,fileName);
          lstrcpy(m_files[i].file,pEndingSlash+1);
        }

        // rename it by default; renamedFile gets filled in later
        m_files[i].bRename = true;
      }
    }
  }
  else
  {
    m_count = 0;
    retVal = E_OUTOFMEMORY;
  }
  ReleaseStgMedium(&med);    // don't forget to clean up
  return retVal;
}

Because renaming only touches the name, the filename and the directory of each file are stored separately. At first I was using GetFileTitle to pull out the file name, but after a careful rereading of the documentation I found that it will not work if the user hides file extensions for registered file types, because the extension is then missing from the returned title. So I went back to manual parsing.
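The manual parsing amounts to cutting the path at its last backslash. Here is the same idea as a standalone sketch with std::string instead of fixed-size buffers; splitPath is a hypothetical helper, not a function in the project.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: split a full path at its last backslash.
// Returns {directory, file name}; the directory is empty when the
// path contains no backslash.
std::pair<std::string, std::string> splitPath(const std::string& fullPath)
{
    const std::string::size_type slash = fullPath.rfind('\\');
    if (slash == std::string::npos)
        return {"", fullPath};
    return {fullPath.substr(0, slash), fullPath.substr(slash + 1)};
}
```

Given c:\documents\document.txt, this returns c:\documents and document.txt, and only the second part is ever handed to the regular expression.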

In QueryContextMenu we add the menu items. If the count of files is 2, we also must add the Swap Filenames item.

HRESULT CISHRename::QueryContextMenu(HMENU hmenu, UINT indexMenu,
                                     UINT idCmdFirst, UINT idCmdLast,
                                     UINT uFlags)
{
  UINT iCount=1;   // command IDs used; RegExp rename is always added

  // the shell only wants the default item, so add nothing
  if (uFlags & CMF_DEFAULTONLY)
    return MAKE_HRESULT(SEVERITY_SUCCESS,FACILITY_NULL,0);

  // if we have two exactly, make the swap file name option visible
  if (m_count == 2)
  {
    InsertMenu(hmenu,indexMenu++,MF_STRING | MF_BYPOSITION,
               idCmdFirst+1,_T("Swap filenames"));
    ++iCount;
  }

  InsertMenu(hmenu,indexMenu,MF_STRING | MF_BYPOSITION,
             idCmdFirst,_T("RegExp rename"));

  // the code field is the largest command offset used, plus one
  return MAKE_HRESULT(SEVERITY_SUCCESS,FACILITY_NULL,iCount);
}

The code for the renaming is actually quite simple because the regular expression matching is done by the GNU library. The dialog box procedure takes care of all the details. When the user changes the text, the dialog box updates the variables associated with each file (such as the new destination name). Then, when the user hits OK, it performs the MoveFile calls that do the renaming. Here is the code that runs when the user types.

// Updates the list view based on what the user types in the boxes
static void ChangeListView(HWND hWnd,LPDDATA pData)
{
  char regExp[MAX_PATH];
  char rename[MAX_PATH];
  LVITEM item;
  regex_t regt;
  regmatch_t regMatch[10];
  UINT i;
  unsigned int regFlags;
  HWND hList = GetDlgItem(hWnd,IDC_LIST);

  // reset flags
  for (i=0;i<pData->count;++i)
  {
    pData->pRename[i].bRename = false;
    pData->pRename[i].renamedFile[0] = 0;
  }

  // get the dialog text; both boxes must be non-empty
  if (GetDlgItemText(hWnd,IDC_EDITREGEXP,regExp,MAX_PATH) != 0 &&
    GetDlgItemText(hWnd,IDC_EDITRENAME,rename,MAX_PATH) != 0)
  {
    // text was good
    // now compile it, and try to run everything through
    re_syntax_options = RE_SYNTAX_EGREP;
    regFlags = REG_EXTENDED;
    if (pData->bCaseInsensitive)
      regFlags |= REG_ICASE;

    // if it compiles
    if (regcomp(&regt,regExp,regFlags) == 0)
    {
      // now check it for each item
      for (i=0;i<pData->count;++i)
      {
        // mark every group as "not matched" (-1) before matching
        memset(regMatch,-1,sizeof(regMatch));

        // if it matches the file name
        if (regexec(&regt,pData->pRename[i].file,10,regMatch,0) == 0)
        {
          if (RegMoveFile(regMatch,pData->pRename[i].file,rename,
                          pData->pRename[i].renamedFile,MAX_PATH))
            pData->pRename[i].bRename = true;
          else
            pData->pRename[i].renamedFile[0] = 0;
        }
      }
      regfree(&regt);
    }
  }

  // ok now go through each item and fill the list view
  for (i=0;i<pData->count;++i)
  {
    item.mask = LVIF_TEXT;
    item.iItem = i;
    item.iSubItem = 1;
    item.pszText = pData->pRename[i].bRename ? 
                      pData->pRename[i].renamedFile : (LPSTR)"<error>";
    SendMessage(hList,LVM_SETITEMTEXT,i,(LPARAM)&item);
  }
}

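The RegMoveFile function does most of the work. It uses a small state machine (a DFA) to build the formatted string, expanding each \1, \2, etc. Despite its name, it only builds the new name; the actual move happens later, when the user hits OK.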

// This function takes the regmatch_t array from regexec, the filename to be
// renamed and the format string used for renaming.  It parses the format and
// replaces each \0 ... \9 with the text that the corresponding group matched
// (\0 is the whole match).
// Example: with the match for (.*)\.txt on "notes.txt", the format
// "prefix\1.txt" produces "prefixnotes.txt".
bool RegMoveFile(regmatch_t *pMatch,const char *fileName, 
                 const char *outputFormat,char *output, int maxLength)
{
  // a handy macro which won't overrun the buffer.  Overruns
  // are possible because a user can specify a regular expression 
  // and then rename the file as \1\1\1\1\1\1, which could potentially
  // create a gigantic string, overflowing
#define APPENDCHAR(c) if ((--maxLength) > 0) *(s++) = c; else goto exit;

  char *s=output;
  char c;
  int index;
  int length;
  
  // our finite state automaton
  enum STATE {S_START,S_SLASH,S_END};

  STATE state=S_START;
  while (state != S_END)
  {
    c=*(outputFormat++);    // chew up next character
    switch (state)
    {
    case S_START:
      if (c == '\0') state=S_END; // null terminated?  we're done
      else if (c == '\\') state=S_SLASH;  // slash? enter slash-found state
      else APPENDCHAR(c);      // anything else, just append it to s
      break;
    case S_SLASH:
      if (c == '\0')    // terminated with \?  just add to s and quit
      {
        APPENDCHAR('\\');
        state=S_END;
      }
      else if (isdigit((unsigned char)c))      // digit?  expand that group
      {
        index=c-'0';

        // make sure the group took part in the match
        if (pMatch[index].rm_so < 0 || pMatch[index].rm_eo < 0)
          return false;

        // get length of match
        length=pMatch[index].rm_eo - pMatch[index].rm_so;
        
        // make sure we have enough room for it now
        if (length < maxLength)
        {
          // copy into buffer
          strncpy(s,fileName+pMatch[index].rm_so,length);
          // increment s pointer accordingly
          s += length;
          maxLength -= length;
        }
        else
          goto exit;

        state=S_START;
      }
      else // it's just another character, add to s
      {
        APPENDCHAR(c);
        state=S_START;
      }
      break;
    case S_END: // do nothing
      break;
    }
  }
  // null terminate when we come out; if the buffer filled up,
  // the result is truncated rather than overflowed
exit:
  *s=0;
  return true;
}
#undef APPENDCHAR
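For comparison, here is a sketch of the same expansion written with std::string, which removes the manual buffer bookkeeping. It is a simplified variant and not the project's code: expandFormat is a hypothetical name, the groups are passed as already-extracted strings, and a group index beyond the vector stands in for the rm_so < 0 check.

```cpp
#include <cassert>
#include <cctype>
#include <optional>
#include <string>
#include <vector>

// Hypothetical variant of the expansion step. groups[n] holds the text
// captured for \n (index 0 is the whole match). Returns std::nullopt when
// the format refers to a group that is missing.
std::optional<std::string> expandFormat(const std::string& format,
                                        const std::vector<std::string>& groups)
{
    std::string out;
    for (std::size_t i = 0; i < format.size(); ++i) {
        const char c = format[i];
        if (c != '\\') {                      // ordinary character
            out += c;
            continue;
        }
        if (i + 1 == format.size()) {         // trailing backslash: keep it
            out += '\\';
            break;
        }
        const char next = format[++i];
        if (std::isdigit(static_cast<unsigned char>(next))) {
            const std::size_t index = static_cast<std::size_t>(next - '0');
            if (index >= groups.size())
                return std::nullopt;          // no such group
            out += groups[index];
        } else {
            out += next;                      // backslash escapes the character
        }
    }
    return out;
}
```

With the groups "notes.txt" and "notes", the format prefix\1.txt expands to prefixnotes.txt. Because std::string grows as needed, a format like \1\1\1\1\1\1 cannot overflow anything, although the caller would still have to reject names longer than MAX_PATH.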

Final thoughts

The code compiles and runs under Win9x and Windows 2000. I haven't tested NT 4.0, but it should work. It is built as ANSI because the regex library requires ANSI strings.

Along with the source code, I've included compiled copies of the GNU regex library as static libraries, and the regex.h header file. For those interested in the library's source, it can be found at ftp.gnu.org. Many thanks go out to the GNU folk.

Revision History

30 Jun 2002 - Moved image to below downloads

27 Jul 2002 - Updated article after my rewrite

7 Aug 2002 - Updated article to fix a few things I missed

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
