
Example of fast reading and writing of files

(all versions)

Sunday, July 2, 2023

Background

Code

Parameters

Main file access code



Background


When processing very large files, speed matters.

It is advisable to work directly with the System.IO classes "FileStream", "StreamReader", and "StreamWriter".

In this use case, a conversion program was needed to turn files that were not fully valid CSV into correct CSV format.


Code


Parameters

/// <param name="sourceFileName">Required: The source file name.</param>

/// <param name="targetFileName">Required: The target file name.</param>

/// <param name="fileEncoding">Required: The general file encoding.</param>

/// <param name="targetFileEndOfLineCharacter">Required: The character which indicates the end of line.</param>

/// <param name="fieldSeparationCharacter">Required: The character which indicates a field separation.</param>

/// <param name="fieldQuotationCharacter">Required: The character which is used for field quotations. Will be removed for <paramref name="targetFileName"/> by this method.</param>

/// <param name="requiredMaxFieldCount">Required: The number of fields expected by downstream processing. Each line of <paramref name="targetFileName"/> is written with exactly this many data columns: extra source fields are dropped and missing ones are left empty.</param>

string sourceFileName;
string targetFileName;
System.Text.Encoding fileEncoding;
string targetFileEndOfLineCharacter;
string fieldSeparationCharacter;
string fieldQuotationCharacter;
int requiredMaxFieldCount;
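
To make the roles of the last three parameters concrete, the following sketch applies them to a single line held in memory. The class name `LineNormalizer` and the method `NormalizeLine` are hypothetical and not part of the original code. The sketch assumes that the separator and the quotation value are single, non-empty characters.

```csharp
using System;
using System.Text;

public static class LineNormalizer
{
    // Hypothetical helper (not from the original article): removes the
    // quotation character, splits on the separator, and emits exactly
    // requiredMaxFieldCount columns.
    public static string NormalizeLine(
        string sourceLine,
        string fieldSeparationCharacter,
        string fieldQuotationCharacter,
        int requiredMaxFieldCount)
    {
        // Strip every occurrence of the quotation character.
        var unquoted = sourceLine.Replace(fieldQuotationCharacter, string.Empty);

        // Split on each character contained in the separator string.
        var columns = unquoted.Split(fieldSeparationCharacter.ToCharArray());

        var result = new StringBuilder();
        for (int i = 0; i < requiredMaxFieldCount; i++)
        {
            if (i > 0) result.Append(fieldSeparationCharacter);

            // Columns beyond the source field count stay empty.
            if (i < columns.Length) result.Append(columns[i]);
        }

        return result.ToString();
    }
}
```

With these assumptions, `LineNormalizer.NormalizeLine("\"a\";b;c;d", ";", "\"", 3)` yields `a;b;c` (the fourth field is dropped), and `LineNormalizer.NormalizeLine("a;b", ";", "\"", 4)` yields `a;b;;` (two empty columns are appended).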

Main file access code

var bufferSize = 64 * 1024; // TODO: Can be adjusted to your needs!

using ( var sourceFileStream = new System.IO.FileStream( sourceFileName , System.IO.FileMode.Open , System.IO.FileAccess.Read , System.IO.FileShare.None , bufferSize , System.IO.FileOptions.SequentialScan ) )
using ( var sourceFileReader = new System.IO.StreamReader( sourceFileStream , fileEncoding ) )
// Note: FileMode.CreateNew throws if the target file already exists. FileMode.Append is available for appending to existing files.
// FileOptions.WriteThrough bypasses intermediate caches, which can reduce write throughput; use FileOptions.None if that guarantee is not needed.
using ( var targetFileStream = new System.IO.FileStream( targetFileName , System.IO.FileMode.CreateNew , System.IO.FileAccess.Write , System.IO.FileShare.None , bufferSize , System.IO.FileOptions.WriteThrough ) )
using ( var targetFileWriter = new System.IO.StreamWriter( targetFileStream , fileEncoding ) )
{
  string sourceLine;

  // ReadLine() returns null once the end of the file is reached.
  while ( ( sourceLine = sourceFileReader.ReadLine() ) != null )
  {
    // TODO: Please note that this approach is only suitable in this special use case for our project!
    // It depends on your own project how to convert the source data!

    // Remove the quotation character, then split on each character of the separator string.
    var textColumns = sourceLine.Replace( fieldQuotationCharacter , string.Empty ).Split( fieldSeparationCharacter.ToCharArray() );

    // Write exactly requiredMaxFieldCount columns; missing source fields stay empty.
    for ( int i = 0 ; i < requiredMaxFieldCount ; i++ )
    {
      if ( i > 0 ) targetFileWriter.Write( fieldSeparationCharacter );

      if ( i < textColumns.Length )
      {
        targetFileWriter.Write( textColumns[i] );
      }
    }

    targetFileWriter.Write( targetFileEndOfLineCharacter );
  }
}
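
The parameters and the file access code above can be combined into a single method. The following is a minimal sketch of one way to do that, not the author's original method: the class name `CsvFileConverter`, the method name `ConvertFile`, and the optional `bufferSize` parameter are assumptions made for illustration.

```csharp
using System.IO;
using System.Text;

public static class CsvFileConverter
{
    // Hypothetical wrapper around the code shown above; the names and the
    // optional bufferSize parameter are assumptions for this sketch.
    public static void ConvertFile(
        string sourceFileName,
        string targetFileName,
        Encoding fileEncoding,
        string targetFileEndOfLineCharacter,
        string fieldSeparationCharacter,
        string fieldQuotationCharacter,
        int requiredMaxFieldCount,
        int bufferSize = 64 * 1024)
    {
        // Convert the separator once instead of once per line.
        var separators = fieldSeparationCharacter.ToCharArray();

        using (var sourceFileStream = new FileStream(sourceFileName, FileMode.Open, FileAccess.Read, FileShare.None, bufferSize, FileOptions.SequentialScan))
        using (var sourceFileReader = new StreamReader(sourceFileStream, fileEncoding))
        // FileMode.CreateNew throws an IOException if the target already exists.
        using (var targetFileStream = new FileStream(targetFileName, FileMode.CreateNew, FileAccess.Write, FileShare.None, bufferSize, FileOptions.WriteThrough))
        using (var targetFileWriter = new StreamWriter(targetFileStream, fileEncoding))
        {
            string sourceLine;

            // ReadLine() returns null at the end of the file.
            while ((sourceLine = sourceFileReader.ReadLine()) != null)
            {
                var textColumns = sourceLine
                    .Replace(fieldQuotationCharacter, string.Empty)
                    .Split(separators);

                // Write exactly requiredMaxFieldCount columns per line.
                for (int i = 0; i < requiredMaxFieldCount; i++)
                {
                    if (i > 0) targetFileWriter.Write(fieldSeparationCharacter);
                    if (i < textColumns.Length) targetFileWriter.Write(textColumns[i]);
                }

                targetFileWriter.Write(targetFileEndOfLineCharacter);
            }
        }
    }
}
```

A call might look like `CsvFileConverter.ConvertFile("in.txt", "out.csv", new UTF8Encoding(false), "\n", ";", "\"", 3);`, where the file names are placeholders. Because the target is opened with `FileMode.CreateNew`, the call fails if `out.csv` already exists, which protects earlier results from being overwritten silently.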
