Un moyen efficace de lire un gros fichier TXT délimité?
-
15-11-2019 - |
Question
J'ai un fichier TXT délimité par onglet avec 500K enregistrements.J'utilise le code ci-dessous pour lire les données sur DataSet.Avec 50k, cela fonctionne bien mais 500k il donne «l'exception du type» System.outofMemoryException 'a été lancé. "
Quel est le moyen le plus efficace de lire de grandes données délimitées? Ou comment résoudre ce problème?S'il vous plaît donnez-moi un exemple
public DataSet DataToDataSet(string fullpath, string file)
{
string sql = "SELECT * FROM " + file; // Read all the data
OleDbConnection connection = new OleDbConnection // Connection
("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fullpath + ";"
+ "Extended Properties=\"text;HDR=YES;FMT=Delimited\"");
OleDbDataAdapter ole = new OleDbDataAdapter(sql, connection); // Load the data into the adapter
DataSet dataset = new DataSet(); // To hold the data
ole.Fill(dataset); // Fill the dataset with the data from the adapter
connection.Close(); // Close the connection
connection.Dispose(); // Dispose of the connection
ole.Dispose(); // Get rid of the adapter
return dataset;
}
La solution
Use a stream approach with TextFieldParser
- this way you will not load the whole file into memory in one go.
Autres conseils
You really want to enumerate the source file and process each line at a time. I use the following
public static IEnumerable<string> EnumerateLines(this FileInfo file)
{
using (var stream = File.Open(file.FullName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (var reader = new StreamReader(stream))
{
string line;
while ((line = reader.ReadLine()) != null)
{
yield return line;
}
}
}
Then for each line you can split it using tabs and process each line at a time. This keeps memory down really low for the parsing, you only use memory if the application needs it.
Have you tried the TextReader?
using (TextReader tr = File.OpenText(YourFile))
{
string strLine = string.Empty;
string[] arrColumns = null;
while ((strLine = tr.ReadLine()) != null)
{
arrColumns = strLine .Split('\t');
// Start Fill Your DataSet or Whatever you wanna do with your data
}
tr.Close();
}
I found FileHelpers
The FileHelpers are a free and easy to use .NET library to import/export data from fixed length or delimited records in files, strings or streams.
Maybe it can help.