split up a big text file

CData
3StarLounger
Posts: 308
Joined: 24 Dec 2015, 16:41

split up a big text file

Post by CData »

greetings, hope all is well. Have a task where I could use your advice.

Have BigFile.txt from another system that writes all its table data into 1 big file.

The file is tab delimited. I need each individual table data set written to its own txt file so they can be imported into Access tables.

BigFile.txt format is uniform where the layout has the table name followed by data rows..so i.e. it is like this:
%R some info
%R some info
%T tblAddress
%R row data
%R row data
%T tblProduct
%R row data
%R row data
%R row data
etc

The row count is random. Each table data set is a different number of fields in the row data.
But with the clear demarcation of %T one should be able to write out the rows beginning just below tblAddress and ending just above the next %T row into a tblAddress.txt

I could first add row numbers to BigFile.txt if that helps, though not sure if it does.
Is there a viable VBA way to do this?

User avatar
Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

Re: split up a big text file

Post by Doc.AElstein »

Hi
One way that a text file is handled in VBA is to import it as a single text string. Then the first thing that is done typically with that text string is that it is Split by the line separator ( typically a vbCr & vbLf ) in order to get a 1 D array where each element is a single line of the text file ... , then those lines may be manipulated, for example by Splitting by the Tab or comer separator to get each data element ........ and so on… and so on…
Here is a couple of recent Lounge post examples:
https://www.eileenslounge.com/viewtopic ... 04#p269104 https://www.eileenslounge.com/viewtopic ... 25#p269105
https://eileenslounge.com/viewtopic.php ... 67#p274367 https://eileenslounge.com/viewtopic.php ... 68#p274368

What you could do instead is import your text file as a long string in the same way but then split by %T
That would give you a 1 D array where each element is a text string of the sub text files that you want.
You could then loop for each element of that array and do what you want to do with each one.

I don't know what the size limitations are to that way I suggested. If you want to try that way I suggested and need help, then give us a sample text file to try

Alan

_._______________________________________________


This would be the basic coding idea I am suggesting.
I can’t test it without a sample file, and I expect it might need some tweaking. For example you might want to start the looping from 2 instead of 1, because the first element returned by the Split will likely be an empty text string above the first %T , ..... , or not, or whatever, as the case may be...
The macro probably has some typos in it as well… The macro is just intended to give you the basic idea example of what I am suggesting

Code: Select all

Sub SplitTextFiles()   '     https://eileenslounge.com/viewtopic.php?f=30&t=35341
Rem 1 Import single long text string of entire text file
Dim FileNum As Long: Let FileNum = FreeFile(1)                                  ' https://msdn.microsoft.com/en-us/vba/language-reference-vba/articles/freefile-function
Dim PathAndFileName As String, TotalFile As String
 Let PathAndFileName = ThisWorkbook.path & "\" & "MyTextFile.txt"   ' CHANGE TO SUIT
Open PathAndFileName For Binary As #FileNum 'Open Route to data. Binary is a fundemental type data input...
TotalFile = Space(LOF(FileNum)) '....and wot recives it has to be a string of exactly the right length
Get #FileNum, , TotalFile
Close #FileNum
Rem 2 Split into sub files
Dim arrFls() As String: Let arrFls() = Split(TotalFile, "%T", -1, vbBinaryCompare)
'2b) loop each text file string
Dim Cnt
    For Cnt = 1 To UBound(arrFls()) + 1                                                              '  +1 is because the  1 D  array starts at indicia  0
    '2b)(i) split by rows
    Dim arrRws() As String: Let arrRws() = Split(arrFls(Cnt - 1), vbCr & vbLf, -1, vbBinaryCompare)  '  -1 is because the  1 D  array starts at indicia  0
    Dim FlNme As String: Let FlNme = arrRws(0) ' First row is  tbl thing
    Dim FileNum2 As Long: Let FileNum2 = FreeFile(0)                                  ' https://msdn.microsoft.com/en-us/vba/language-reference-vba/articles/freefile-function
    Dim PathAndFileName2 As String
     Let PathAndFileName2 = ThisWorkbook.path & "\" & FlNme & ".txt"   ' CHANGE TO SUIT
    '2b)(ii) save each sub text file
     Open PathAndFileName2 For Output As #FileNum2 ' CHANGE TO SUIT  ' Will be made if not there
     Print #FileNum2, arrFls(Cnt - 1) '
     Close #FileNum2
    Next Cnt
End Sub
Last edited by Doc.AElstein on 12 Sep 2020, 05:57, edited 1 time in total.
I am having difficulty logging in with this account just now.
You can find me at DocAElstein also

User avatar
HansV
Administrator
Posts: 78232
Joined: 16 Jan 2010, 00:14
Status: Microsoft MVP
Location: Wageningen, The Netherlands

Re: split up a big text file

Post by HansV »

Here is my take on it:

Code: Select all

Sub SplitFile()
    Dim strFolder As String
    Dim strInFile As String
    Dim strOutFile As String
    Dim intIn As Integer
    Dim intOut As Integer
    Dim strLines As String
    Dim arrLines() As String
    Dim i As Long
    ' Change as needed but keep the trailing \
    strFolder = "C:\Excel\"
    strInFile = "BigFile.txt"
    intIn = FreeFile
    Open strFolder & strInFile For Input As #intIn
    strLines = Input(LOF(intIn), #intIn)
    Close #intIn
    arrLines = Split(strLines, vbCrLf)
    Do
        If Left(arrLines(i), 2) = "%T" Then
            intOut = FreeFile
            strOutFile = Mid(arrLines(i), 4) & ".txt"
            Open strOutFile For Output As #intOut
            Do While Left(arrLines(i + 1), 2) = "%R"
                i = i + 1
                Print #intOut, Mid(arrLines(i), 4)
                If i + 1 > UBound(arrLines) Then Exit Do
            Loop
            Close #intOut
        End If
        i = i + 1
    Loop Until i > UBound(arrLines)
End Sub
Best wishes,
Hans

CData
3StarLounger
Posts: 308
Joined: 24 Dec 2015, 16:41

Re: split up a big text file

Post by CData »

thanks both... will work on it over the weekend (maybe...) and report back....

User avatar
Doc.AElstein
BronzeLounger
Posts: 1499
Joined: 28 Feb 2015, 13:11
Location: Hof, Bayern, Germany

Re: split up a big text file

Post by Doc.AElstein »

CData wrote:
11 Sep 2020, 15:02
..will work on it over the weekend (maybe...) and report back....
A few passing extra notes/remarks for you to consider…
_ I think Hans and my macro use the same basic way, the ..” VBA Open For __ As way …” : https://www.homeandlearn.org/open_a_tex ... n_vba.html
_ The start point is identical in both macros: The entire text file, BigFile.txt , is put in a single string variable, ( which in my macro is in variable TotalFile , ( and in Hans macro is in variable strLines ) )
_ The main difference in the two macros is that
Hans loops through all the lines writing each line out to a text file,
whereas mine chops the big file up into as many ( or one more ) than there are sub files and writes the entire sub files out in one go.
My first macro will also have the "%R " showing in the lines in the produced text files, whereas Hans macro will not show "%R " in the lines in the produced text files because it writes each line out from the 4th character by virtue of this bit
Mid(arrLines(i), 4)
In my macro you can introduce a code line like
Replace(TotalFile, "%R ", "", 1, -1, vbBinaryCompare)
which will remove all those "%R " in one go from the main large single text string of the big text file, ( which in my macro is in variable TotalFile , ( and in Hans macro is in variable strLines ) )

_.____________________

For comparison, below are two versions of mine and Hans macro which produce the same results.
I tested using a text file , BigFile.txt , looking like this
%T tblAddress
%R row a1 data
%R row a2 data
%T tblProduct
%R row b1 data
%R row b2 data
%R row b3 data


Both macros produce these two text files, tblProduct.txt
row b1 data
row b2 data
row b3 data


and tblAddress.txt
row a1 data
row a2 data

Code: Select all

 Sub SplitTextFiles2()   '     https://eileenslounge.com/viewtopic.php?p=274687#p274687
Rem 1 Import single long text string of entire text file
Dim FileNum As Long: Let FileNum = FreeFile(1)                                  ' https://msdn.microsoft.com/en-us/vba/language-reference-vba/articles/freefile-function
Dim PathAndFileName As String, TotalFile As String
 Let PathAndFileName = ThisWorkbook.path & "\" & "BigFile.txt"         ' CHANGE TO SUIT
Open PathAndFileName For Binary As #FileNum 'Open Route to data. Binary is a fundemental type data input...
TotalFile = Space(LOF(FileNum)) '....and wot recives it has to be a string of exactly the right length
Get #FileNum, , TotalFile
Close #FileNum
Rem 2 Split into sub files
 Let TotalFile = Replace(TotalFile, "%R ", "", 1, -1, vbBinaryCompare)
Dim arrFls() As String: Let arrFls() = Split(TotalFile, "%T ", -1, vbBinaryCompare)
'2b) loop each text file string
Dim Cnt
    For Cnt = 2 To UBound(arrFls()) + 1  ' using 2 will ignore the first element that might be empty                 '  +1 is because the  1 D  array starts at indicia  0
    '2b)(i)a split by rows to get first row
    Dim arrRws() As String: Let arrRws() = Split(arrFls(Cnt - 1), vbCr & vbLf, -1, vbBinaryCompare)                  '  -1 is because the  1 D  array starts at indicia  0
    Dim FlNme As String: Let FlNme = arrRws(0) ' First row is  tbl thing
    Dim FileNum2 As Long: Let FileNum2 = FreeFile(0)                                  ' https://msdn.microsoft.com/en-us/vba/language-reference-vba/articles/freefile-function
    Dim PathAndFileName2 As String
     Let PathAndFileName2 = ThisWorkbook.path & "\" & FlNme & ".txt"   ' CHANGE path TO SUIT
    '2b)(ii) save each sub text file
     Open PathAndFileName2 For Output As #FileNum2 ' CHANGE TO SUIT  ' Will be made if not there
     Print #FileNum2, Replace(arrFls(Cnt - 1), FlNme & vbCr & vbLf, "", 1, -1, vbBinaryCompare) ' write out entire text file  ( removing the file name )
     Close #FileNum2
    Next Cnt
End Sub


Sub SplitFile()  '  https://eileenslounge.com/viewtopic.php?p=274672#p274672
Dim strFolder As String, strInFile As String, strOutFile As String
Dim intIn As Integer, intOut As Integer
Dim strLines As String
Dim arrLines() As String
Dim i As Long
strFolder = ThisWorkbook.path & "\" ' CHANGE TO SUIT
strInFile = "BigFile.txt"
intIn = FreeFile
Open strFolder & strInFile For Input As #intIn
strLines = Input(LOF(intIn), #intIn)               '  https://docs.microsoft.com/en-us/office/vba/language/reference/user-interface-help/input-function
Close #intIn
arrLines = Split(strLines, vbCrLf)
Do
    If Left(arrLines(i), 2) = "%T" Then            '  i is 0 initially which is also the first indicia in an array produced by  Split
        intOut = FreeFile
        strOutFile = Mid(arrLines(i), 4) & ".txt"  '  i is 0 initially which is also the first indicia in an array produced by  Split
        Open strOutFile For Output As #intOut
        Do While Left(arrLines(i + 1), 2) = "%R"
            i = i + 1
            Print #intOut, Mid(arrLines(i), 4)
            If i + 1 > UBound(arrLines) Then Exit Do
        Loop
        Close #intOut
    End If
    i = i + 1
Loop Until i > UBound(arrLines)
End Sub 

( Edit P.S. You have touched on this way of importing a text file into VBA as a text string, and using Split on it , here : https://eileenslounge.com/viewtopic.php ... 62#p248062 )
You do not have the required permissions to view the files attached to this post.
Last edited by Doc.AElstein on 12 Sep 2020, 20:09, edited 2 times in total.
I am having difficulty logging in with this account just now.
You can find me at DocAElstein also

CData
3StarLounger
Posts: 308
Joined: 24 Dec 2015, 16:41

Re: split up a big text file

Post by CData »

quick thanks from my phone , appreciate the further explanation as I want to learn.... will get on the machine and do it right soon and report back further....

LisaGreen
5StarLounger
Posts: 964
Joined: 08 Nov 2012, 17:54

Re: split up a big text file

Post by LisaGreen »

I'm pretty sure... without looking it up... that the string max is 65k.

Lisa

CData
3StarLounger
Posts: 308
Joined: 24 Dec 2015, 16:41

Re: split up a big text file

Post by CData »

greetings - so I shouldn't be working on a Sunday but's raining and there's no good sports on TV so....
put both code sets into their own Access button; entered the actual path/file info - - and ran...
both had the exact same result: nothing!!
so.....
first looked at Han's code as it was familiar and approachable to me as I've worked with similar styled code stacks...
turns out 1 line needed tweak: within the Do clause
Open strOutFile For Output As #intOut needed the path and changed to:
Open strFolder & strOutFile For Output As #intOut

..... and voila - as the french say - -

so Doc I did look at your code stack though briefly and nothing jumped out like it did in Hans - and since a bird in hand is better than more code in the bush I didn't take things farther - sorry. Am saving your code stack but didn't have the ambition to break down by line today to resolve....your style/syntax is less familiar to me and I'm probably looking past something obvious...

much thanks for the input - and the clear explanation - very appreciated as always to all volunteers....

ps Lisa's input gave me pause - so I opened BigFile in Word and the Character count with spaces is 271k - - so we are ok on this front.