Fixing Illegal Characters in Filenames

ChrisGreaves
PlutoniumLounger
Posts: 11518
Joined: 24 Jan 2010, 23:23
Location: paused.undefined.exposed


Post by ChrisGreaves »

This thread is a brief alert on bad filenames.

This Google Search returns a zillion threads telling me that illegal characters are illegal in Windows file names. (Thanks Guys!)
This thread seems to offer a Python solution, but setting aside time to learn and install Python is beyond me right now.
This thread suggests I educate Mac users, but I don’t fancy educating millions of YouTube folks who upload music tracks as MP4 files.
This thread tells me how to strip illegal characters, but that is not my problem.

My problem is RENAMEing the files once found.

The files are found quite easily; my VBA code points to a folder T:\Music\ and grabs 18,000 FullNames of MP3 files, stores them in an array.
Sadly, the SHELL function (FI.Path) or the DIR statement returns a full name in which the illegal characters have already been replaced: the Unicode hyphen ChrW(8208), for example, comes back as a legal hyphen (Chr(45)). Only when I go to make use of the full name am I told “File does not exist”.
I can loop through my string array(18000) and test the existence of every file in the string array and locate 161 Full names that fail. The trouble is that (a) I am not told which characters have been replaced (-, é, ö, and so on) and (b) I cannot issue a rename command (Name xxx AS yyy) because the original name with illegal characters is, well, illegal.
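One way to see which characters are at fault (point (a) above) is to read the names through a Unicode-aware API and report each character that falls outside plain ASCII. Below is a minimal sketch in Python 3, not VBA. `suspect_chars` and `scan_folder` are hypothetical helper names, and treating every code point above 127 as suspect is a simplifying assumption (é and ö, for instance, may turn out to be harmless on a given system).

```python
import os
import unicodedata


def suspect_chars(name):
    """Return (position, character, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, ord(ch), unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(name)
        if ord(ch) > 127  # assumption: anything beyond ASCII is worth reporting
    ]


def scan_folder(root):
    """Yield (full path, suspect characters) for every file whose name contains any."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            found = suspect_chars(name)
            if found:
                yield os.path.join(folder, name), found


if __name__ == "__main__":
    # T:\Music is the folder named in the post above.
    for path, found in scan_folder("T:\\Music"):
        print(path)
        for pos, ch, code, uname in found:
            print(f"    position {pos}: {ch!r} = ChrW({code}) {uname}")
```

The code point printed for each character is the same number that VBA's ChrW() takes, so the report can be matched against the 161 failing names.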

I can accept that Windows designers could not have known that Apple was pursuing a different course, nor could they have known that Asia existed.

What I can’t accept is that Windows quite clearly maintains the fiction that these files exist (I can manually visit the folder, locate the file “Bach ‐ BWV 289 Wir bitten dich, du ewger Sohn.mp3” and manually F2-edit the file name to switch from the illegal hyphen (or whatever) to a legal hyphen) but there appears to be no mechanism provided to correct these names programmatically.
Doing it all manually in a pre-processor pass (LocateIllegallyNamedFiles) to my mind defeats the whole purpose of computers – that they are good at doing Boring And Repetitive Tasks.

Unless anyone knows of a package or technique that allows me to obtain the real (and sometimes illegal) filename of a file and use that real and illegal name to refer to a file.
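One technique that may fit: Python 3 on Windows passes file names to the operating system as Unicode, so the name it reads from a folder is the real one and can be handed straight back to a rename call. The following is a hedged sketch rather than a finished tool. `REPLACEMENTS`, `legal_name`, and `rename_in_place` are hypothetical names, and the mapping lists only a few look-alike characters as an illustration.

```python
import os

# Illustrative mapping only: Unicode look-alikes -> plain ASCII equivalents.
REPLACEMENTS = {
    "\u2010": "-",  # HYPHEN
    "\u2013": "-",  # EN DASH
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
}


def legal_name(name):
    """Return `name` with every mapped character replaced; other characters are kept."""
    return "".join(REPLACEMENTS.get(ch, ch) for ch in name)


def rename_in_place(folder):
    """Rename files in `folder` whose names change under legal_name().

    Returns the list of (old name, new name) pairs actually renamed.
    """
    # Collect the entries first so the folder is not modified while it is being listed.
    with os.scandir(folder) as it:
        entries = [e for e in it if e.is_file()]

    done = []
    for entry in entries:
        new = legal_name(entry.name)
        if new == entry.name:
            continue
        target = os.path.join(folder, new)
        if os.path.exists(target):
            continue  # never overwrite an existing file; leave it for manual review
        os.rename(entry.path, target)
        done.append((entry.name, new))
    return done
```

Calling `rename_in_place("T:\\Music")` would process one folder; wrapping it in `os.walk` would cover subfolders. Trying it on a copy of a few files first seems prudent.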

Cheers
Chris
We hate change, but love variety.

JoeP
BronzeLounger
Posts: 1583
Joined: 25 Jan 2010, 02:12

Re: Fixing Illegal Characters in Filenames

Post by JoeP »

You can probably do it with PowerShell. I don't know the correct syntax but I'm sure there are many examples if you search.
Joe

HansV
Administrator
Posts: 69449
Joined: 16 Jan 2010, 00:14
Status: Microsoft MVP
Location: Wageningen, The Netherlands

Re: Fixing Illegal Characters in Filenames

Post by HansV »

See if this old discussion helps: Renaming long file name with illegal characters.
Suggestions found there:

- Use \\?\C:\Test instead of C:\Test
- Enclose the 'wrong' filename in quotes.
- And in a discussion linked to: import the files into a 7Zip archive, then export them.
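The first suggestion can be sketched as a small helper that turns an ordinary absolute path into its `\\?\` form. This is an illustration in Python under stated assumptions: `to_extended_path` is a hypothetical name, the input is assumed to be an absolute Windows path, and whether a given VBA or Windows call accepts the prefixed form still has to be tested case by case.

```python
import ntpath


def to_extended_path(path):
    """Prefix an absolute Windows path with \\\\?\\ (or \\\\?\\UNC\\ for network shares)."""
    prefix = "\\\\?\\"
    if path.startswith(prefix):
        return path  # already in extended form
    # Tidy the path first: forward slashes become backslashes, "." and ".." are resolved.
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # \\server\share\... becomes \\?\UNC\server\share\...
        return prefix + "UNC\\" + path[2:]
    return prefix + path


if __name__ == "__main__":
    print(to_extended_path("C:\\Test"))
    print(to_extended_path("T:/Music/album/track.mp3"))
```

The path is tidied before the prefix is added because Windows hands a `\\?\` path to the file system more or less as written.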
Regards,
Hans

StuartR
Administrator
Posts: 11060
Joined: 16 Jan 2010, 15:49
Location: London, Europe

Re: Fixing Illegal Characters in Filenames

Post by StuartR »

You can make changes to a file that has an illegal name by using the old DOS 8.3 format name.
Parse the output of DIR /X at a command prompt to find the 8.3 name to use.
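The parsing step might look like the sketch below, written in Python. The column layout is an assumption (date, time, size, a 12-character short-name column, then the long name); date and time formats vary with regional settings, so the pattern should be checked against real DIR /X output before relying on it. `DIR_X_LINE` and `parse_dir_x_line` are hypothetical names. Short names exist only where 8.3 name generation is enabled on the volume, so the short-name column can be blank.

```python
import re

# Assumed layout of one file line in DIR /X output:
#   date  time  size  SHORTNAME.EXT  long file name
DIR_X_LINE = re.compile(
    r"^(?P<date>\S+)\s+"
    r"(?P<time>\d{1,2}:\d{2}(?:\s?[AP]M)?)\s+"
    r"(?P<size>[\d,.]+)\s"   # file size; <DIR> lines do not match
    r"(?P<short>.{12})\s"    # 8.3 name column, padded with spaces
    r"(?P<long>.+)$"
)


def parse_dir_x_line(line):
    """Return (short name, long name) for a file line, or None for any other line.

    The short name is an empty string when the file has no 8.3 name.
    """
    m = DIR_X_LINE.match(line.rstrip("\r\n"))
    if m is None:
        return None
    return m.group("short").strip(), m.group("long")
```

Header lines, directory entries, and the summary lines at the end return None and can simply be skipped. The long name in captured output may itself be garbled, but the short name is what the rename needs.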
StuartR


jonwallace
5StarLounger
Posts: 1033
Joined: 26 Jan 2010, 11:32
Location: "What a mighty long bridge to such a mighty little old town"

Re: Fixing Illegal Characters in Filenames

Post by jonwallace »

Hi Chris,
To take a lateral view, if you're renaming MP3s, then you should take a look at mp3tag. I've used this for years to rename those tracks I've ripped from my CD collection to a standardised format. The filename from tag function is particularly handy, as is the tag from filename function to go the other way. It does much more than this of course (but not so much more that it's grown out of its socks) and I suggest that you check out the webpage. It is free, but you can donate if you find it useful and want to support the author.
John

“Always trust a microbiologist because they have the best chance of predicting when the world will end”
― Teddie O. Rahube

ChrisGreaves
PlutoniumLounger
Posts: 11518
Joined: 24 Jan 2010, 23:23
Location: paused.undefined.exposed

Re: Fixing Illegal Characters in Filenames

Post by ChrisGreaves »

StuartR wrote: You can make changes to a file that has an illegal name by using the old DOS 8.3 format name.
Parse the output of DIR /X at a command prompt to find the 8.3 name to use.
Stuart, this is BRILLiant!
I have begun adding "DIR /s/w/x" as a pre-processor to my task and will report back with findings and code.
Cheers
Chris
We hate change, but love variety.

ChrisGreaves
PlutoniumLounger
Posts: 11518
Joined: 24 Jan 2010, 23:23
Location: paused.undefined.exposed

Re: Fixing Illegal Characters in Filenames

Post by ChrisGreaves »

jonwallace wrote: To take a lateral view, if you're renaming MP3s, then you should take a look at mp3tag. I've used this for years to rename those tracks I've ripped from my CD collection to a standardised format. The filename from tag function is particularly handy, as is the tag from filename function to go the other way. It does much more than this of course (but not so much more that it's grown out of its socks) and I suggest that you check out the webpage. It is free, but you can donate if you find it useful and want to support the author.
Hi Jon, and thanks for the tip.
I shall take a look at the package.

The web site caused a few hairs to trigger on the back of my neck:
"Replace characters or words Replace strings in tags and filenames (with support for Regular Expressions)."
This sounds good on the surface but my experience with Win7HP/Word2003/VBA tells me that renaming a file is good only if you can refer to the file.

Names with illegal characters are sanitised by Win/Word, so that by the time I have built an array of filenames and started working through them, a test "blnFileExists" will return FALSE (or a program will fail) because, of course, the file on the hard drive still has an illegal character and the sanitised name does not refer to an existing file.
Or worse: the sanitised file name refers to a different file that you have been hanging on to for twenty years ...
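That second risk can be checked before anything is renamed. Below is a minimal sketch in Python, assuming the real names in one folder have already been read. `plan_renames` is a hypothetical name, and the `sanitise` argument stands for whatever replacement rule is in use. Names are compared case-insensitively because Windows file names normally are.

```python
def plan_renames(real_names, sanitise):
    """Split proposed renames into safe ones and clashes.

    real_names: the actual file names found in one folder.
    sanitise:   a function mapping a real name to its cleaned-up form.
    Returns (safe, clashes), each a list of (old name, new name) pairs.
    """
    existing = {name.casefold() for name in real_names}
    claimed = set()  # cleaned-up names already promised to an earlier file
    safe, clashes = [], []
    for old in real_names:
        new = sanitise(old)
        if new == old:
            continue  # nothing to do for this file
        key = new.casefold()
        if key in existing or key in claimed:
            clashes.append((old, new))  # would land on another file's name
        else:
            claimed.add(key)
            safe.append((old, new))
    return safe, clashes
```

Only the pairs in `safe` would be renamed automatically; the pairs in `clashes` are the ones to look at by hand.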
Cheers
Chris
We hate change, but love variety.