Tuesday, May 22, 2012

PowerShell: Split a text file into multiple files

To import computer entries into SCCM, I had a fairly large CSV file with 35,000 items in it. I wanted to take a phased approach and found a script that splits the file into parts with a fixed number of lines each. Script Center repository: http://gallery.technet.microsoft.com/scriptcenter/PowerShell-Split-large-log-6f2c4da0

I tweaked the script a little, so that the only required inputs are the file name and the number of lines per file.

$linecount = 0
$filenumber = 1

$sourcefilename = Read-Host "What is the full path and name of the log file to split? (e.g. D:\mylogfiles\mylog.txt) "
$destinationfolderpath = Split-Path $sourcefilename -parent

$srcfile = Get-Item $sourcefilename
$filebasename = $srcfile.BaseName
$fileext = $srcfile.Extension

# Count the lines in the source file
$sourcelinecount = (Get-Content $sourcefilename | Measure-Object).Count

Write-Host "Your current file is $sourcelinecount lines long"

$destinationfilesize = Read-Host "How many lines will be in each new split file? "

$maxsize = [int]$destinationfilesize
 
Write-Host "File is $sourcefilename - destination is $destinationfolderpath - new file line count will be $destinationfilesize"

Write-Host "Writing part: $destinationfolderpath\$filebasename`_part$filenumber$fileext"
$content = get-content $sourcefilename | % {
  # Append the current line to the current part file
  Add-Content $destinationfolderpath\$filebasename`_part$filenumber$fileext "$_"
  $linecount++
  # Once the part file is full, move on to the next part
  If ($linecount -eq $maxsize) {
    $filenumber++
    $linecount = 0
    Write-Host "Writing part: $destinationfolderpath\$filebasename`_part$filenumber$fileext"
  }
}
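
For unattended runs, the same split can be wrapped in a function that takes the two inputs as parameters instead of prompting for them. The following is only a minimal sketch, not the Script Center script: the function name Split-TextFile and its parameters -Path and -LinesPerFile are made up for illustration. It relies on the -ReadCount parameter of Get-Content, which passes the lines down the pipeline in batches, so each batch can be written as one part file.

```powershell
# Hypothetical helper: the name Split-TextFile and its parameters are illustrative only.
function Split-TextFile {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [Parameter(Mandatory = $true)][int]$LinesPerFile
    )

    $srcfile    = Get-Item -LiteralPath $Path
    $filenumber = 1

    # -ReadCount sends the lines in batches of up to $LinesPerFile lines,
    # so every batch becomes exactly one part file.
    Get-Content -LiteralPath $srcfile.FullName -ReadCount $LinesPerFile | ForEach-Object {
        $partname = '{0}_part{1}{2}' -f $srcfile.BaseName, $filenumber, $srcfile.Extension
        $partpath = Join-Path $srcfile.DirectoryName $partname

        # Set-Content overwrites an existing part file instead of appending to it
        Set-Content -LiteralPath $partpath -Value $_
        Write-Host "Wrote part: $partpath"
        $filenumber++
    }
}
```

A call such as Split-TextFile -Path D:\mylogfiles\mylog.txt -LinesPerFile 5000 would write mylog_part1.txt, mylog_part2.txt and so on next to the source file. Because each part is written in one go, this variant is likely to be faster on large files than appending line by line, at the cost of holding one batch in memory.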

5 comments:

Damien Bras said...
This comment has been removed by the author.
Michiel Wouters said...

Sure, you can add a check on $filenumber and $linecount and save the original first line to a variable.

Then write that variable to each subsequent target file when the linecount is 0.

I think this code will do. I haven't tested it though. Let me know if it works.

Replace:
$content = get-content $sourcefilename | % {

With:
$content = get-content $sourcefilename | % {
if ($filenumber -eq 1) {
# This is the first file
if ($linecount = 0) {
# This is the first line of the first file, thus the first line of the source file
$headerline = $_
}
} else {
# This is not the first file
if ($linecount -eq 0) {
# This is a new file, write headerline
Add-Content $destinationfolderpath\$filebasename`_part$filenumber$fileext "$headerline"
}
}

JBA.windycity said...

The original script works perfectly. Thank you so much for making this available. I also wanted to have the header at the top of each file, so I tried your suggestion.

Looks like there may be a typo -
This is what you gave:
$content = get-content $sourcefilename | % {
if ($filenumber -eq 1) {
# This is the first file
if ($linecount = 0) {

I think this should be
$content = get-content $sourcefilename | % {
if ($filenumber -eq 1) {
# This is the first file
if ($linecount -eq 0) {
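
For reference, the header suggestion with the -eq correction applied can be put together as a self-contained function. This is only a sketch under assumptions: the function name Split-TextFileWithHeader, its parameters and the $firstline flag are made up for illustration, and it has only been reasoned through on small inputs.

```powershell
# Hypothetical helper: the name, the parameters and $firstline are illustrative only.
function Split-TextFileWithHeader {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [Parameter(Mandatory = $true)][int]$LinesPerFile
    )

    $srcfile    = Get-Item -LiteralPath $Path
    $linecount  = 0
    $filenumber = 1
    $headerline = $null
    $firstline  = $true

    Get-Content -LiteralPath $srcfile.FullName | ForEach-Object {
        $partname = '{0}_part{1}{2}' -f $srcfile.BaseName, $filenumber, $srcfile.Extension
        $partpath = Join-Path $srcfile.DirectoryName $partname

        if ($firstline) {
            # First line of the source file: remember it as the header
            $headerline = $_
            $firstline  = $false
        } elseif ($linecount -eq 0) {
            # First line of a later part: write the saved header before the data
            Add-Content -LiteralPath $partpath -Value $headerline
        }

        # Append the current line, then roll over to the next part when it is full
        Add-Content -LiteralPath $partpath -Value $_
        $linecount++
        if ($linecount -eq $LinesPerFile) {
            $filenumber++
            $linecount = 0
        }
    }
}
```

As in the suggestion, the header counts as one of the lines in the first part, while every later part gets the header on top of its full number of data lines. Add-Content appends, so part files left over from an earlier run should be removed first.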

dukebd06 said...

Say there are 100 lines in the file and you want it broken into 2 files. How do you get it to write line 1 to log1, line 2 to log2, line 3 to log1, line 4 to log2 and so on?

Michiel Wouters said...

@dukebd06
This is rather easy, so I posted it as a new script: http://michielw.blogspot.nl/2014/01/powershell-divide-text-file-lines.html
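
For readers who only need the idea, alternating lines over a fixed number of files comes down to a modulo on the line index. The sketch below is not the linked script; the function name Split-TextFileRoundRobin and its parameters are made up for illustration.

```powershell
# Hypothetical helper: the name and parameters are illustrative only.
function Split-TextFileRoundRobin {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [Parameter(Mandatory = $true)][int]$FileCount
    )

    $srcfile   = Get-Item -LiteralPath $Path
    $lineindex = 0

    Get-Content -LiteralPath $srcfile.FullName | ForEach-Object {
        # Lines 1, 2, 3, ... go to parts 1, 2, ..., $FileCount, then back to part 1
        $filenumber = ($lineindex % $FileCount) + 1
        $partname   = '{0}_part{1}{2}' -f $srcfile.BaseName, $filenumber, $srcfile.Extension
        Add-Content -LiteralPath (Join-Path $srcfile.DirectoryName $partname) -Value $_
        $lineindex++
    }
}
```

With -FileCount 2, the odd lines end up in the part1 file and the even lines in the part2 file.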
