Stripping control characters from strings

User avatar
ChrisGreaves
PlutoniumLounger
Posts: 15619
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Stripping control characters from strings

Post by ChrisGreaves »

Code: Select all

Function strStripControls(ByVal strIn As String) As String
    While Asc(Right(strIn, 1)) < 32
        ' Debug.Print Asc(Right(strIn, 1))
        strIn = Left(strIn, Len(strIn) - 1)
    Wend
    strStripControls = strIn
'Sub TESTstrStripControls()
'    Debug.Print "**" & strStripControls(ActiveDocument.Paragraphs(1).Range.Text) & "**"
'End Sub
End Function
I've lost count of the number of "strStripVbCrLf" functions I've written over the last hundred years or so.
This morning it dawned on me that it didn't matter what the character was, what mattered was that I didn't want it.

An interesting philosophical point, for sure.

http://www.asciitable.com/
There's nothing heavier than an empty water bottle

User avatar
ChrisGreaves
PlutoniumLounger
Posts: 15619
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Re: Stripping control characters from strings

Post by ChrisGreaves »

I've lost count of the number of "strStripVbCrLf" functions I've written over the last hundred years or so.
A better version:-

Code: Select all

Function strStripControls(ByVal strIn As String) As String
    Do While Asc(Right(strIn, 1)) < 32
        strIn = Left(strIn, Len(strIn) - 1)
        If Len(strIn) = 0 Then
            Exit Do
        Else
        End If
    Loop
    strStripControls = strIn
'Sub TESTstrStripControls()
'    Debug.Print "**" & strStripControls(ActiveDocument.Paragraphs(1).Range.Text) & "**"
'End Sub
End Function
There's nothing heavier than an empty water bottle

PJ_in_FL
5StarLounger
Posts: 1100
Joined: 21 Jan 2011, 16:51
Location: Florida

Re: Stripping control characters from strings

Post by PJ_in_FL »

Sorry, the function I posted is native to Excel, but I see now the example given in the previous post is written for Word.

I just ran across this function while building a text processing macro in Excel, and I was curious if this would also work, or is this less efficient (i.e. slower) than the functions posted?

Code: Select all

Function strStripControls(ByVal strIn As String) As String
    strStripControls = Excel.WorksheetFunction.Clean(strIn)
End Function
This requires setting a reference to the Microsoft Excel Object Model in VBA to use in Word VBA.
PJ in (usually sunny) FL

User avatar
ChrisGreaves
PlutoniumLounger
Posts: 15619
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Re: Stripping control characters from strings

Post by ChrisGreaves »

PJ_in_FL wrote:I just ran across this function while building a text processing macro in Excel, and I was curious if this would also work, or is this less efficient (i.e. slower) than the functions posted?
Good call PJ.
Another of those "really cool functions, available ONLY in ....".

From This page i see:
CLEAN Removes the first 32 nonprinting characters in the 7-bit ASCII code (values 0 through 31) from text.
So it does the same as the home-grown variety.
I rather suspect that the overhead of loading part of Excel and transferring controls in and out outweighs a While/Wend loop, but I haven't run the tests to prove it.

The balncing factor, for me, comes with my supposed ability to customize the function:
Function strStripControls(ByVal strIn As String, Optional strRetainCharacters) As String
There's nothing heavier than an empty water bottle

User avatar
ChrisGreaves
PlutoniumLounger
Posts: 15619
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Re: Stripping control characters from strings

Post by ChrisGreaves »

PJ_in_FL wrote:... is this less efficient (i.e. slower) than the functions posted?
No contest!
(attached VBA module).
I even handicapped the inline method by a factor of 100, and it still records 1 second vs. 5 seconds.
That is, it is, roughly, 500 times faster.

Code: Select all

Sub test()
    Debug.Print ""
    Dim dtStart As Date
    Dim dtEnd As Date
    Dim lngTimes As Long
    lngTimes = 10000
    
    ''' First test. Note the multiplier of ONE HUNDRED
    dtStart = Now
    Call Test1(Selection.Paragraphs(1).Range.Text, 100 * lngTimes)
    dtEnd = Now
    Debug.Print Format(dtEnd - dtStart, "hh:mm:ss")

    dtStart = Now
    Call Test2(Selection.Paragraphs(1).Range.Text, lngTimes)
    dtEnd = Now
    Debug.Print Format(dtEnd - dtStart, "hh:mm:ss")
    
    ''' Typical results :-
    '    00:00:01
    '    00:00:05

End Sub
You do not have the required permissions to view the files attached to this post.
There's nothing heavier than an empty water bottle

PJ_in_FL
5StarLounger
Posts: 1100
Joined: 21 Jan 2011, 16:51
Location: Florida

Re: Stripping control characters from strings

Post by PJ_in_FL »

Quite a difference! In Excel the difference is closer to 10x.

I initially didn't notice you're routine is to remove the trailing control codes. Clean removes all control codes from a character string.
PJ in (usually sunny) FL

User avatar
ChrisGreaves
PlutoniumLounger
Posts: 15619
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Re: Stripping control characters from strings

Post by ChrisGreaves »

PJ_in_FL wrote:I initially didn't notice you're routine is to remove the trailing control codes.
Quite so.
I have a set of routines to "do things" in strings - strOnly, strAlpha, strDigits, strAlphDigits etc.
There's nothing heavier than an empty water bottle

William
StarLounger
Posts: 79
Joined: 08 Feb 2010, 21:48
Location: Wellington, New Zealand

Re: Stripping control characters from strings

Post by William »

Chris

Word VBA has a CleanString method, though I've never used it.

William

User avatar
ChrisGreaves
PlutoniumLounger
Posts: 15619
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Re: Stripping control characters from strings

Post by ChrisGreaves »

William wrote: ... CleanString method[/url], though I've never used it.
Hi William, and thanks for the link.
CleanString seems to be as custom a function as mine.
I was concerned with stripping only the right-hand end of a string (paragraph and table-cell sentinels, whereas CleanString dives deep into the heart of the string (I have other functions that do that). Also CleanString replaces some characters, deletes others etc.

My original problem was that of dealing with a data string that comes to me from some sort of structure and removing whatever the host structure thought it needed to delimit the string in storage - hence I attack only the right-hand end.
I should add that most of the time I'm not dealing with a Word document as much as an Act6! data field or something that crawled out of a mainframe!
There's nothing heavier than an empty water bottle