Implementing a multithreaded program
-
I'm editing my question because I've moved on from the algorithm stage to writing the code.
I'm not arguing.
What I'd like to know is how to process the files properly and collect my statistics across all of them.
The text files look roughly like this:

    user   adres   trafik  data
    User1  Yandex  110     26.06
    User2  Yandex  600     23.07
    User3  Google  700     12.08
    User1  Yahoo   800     28.08
    User3  Google  100     13.09
    User2  Yandex  120     14.09
    User1  Google  140     27.09
    User3  Yahoo   100     23.10
    User2  Google  150     16.11
    User1  Yandex  160     17.11
    User3  Yahoo   110     24.11
    User2  Google  700     25.11
    User1  Yandex  900     18.12
I've found and set up a statistical procedure, but I need a clue on how to move on.
The lab assignment is as follows (language: C#):
Assume that a directory on disk holds a large number of proxy server log files. Write a program that computes Internet traffic statistics. As output, the program should produce three text documents: user statistics, domain statistics, and date statistics. As the statistic, use the total amount of traffic consumed, respectively, by each user over all days, for each domain accessed, and on each date.
In the program, each file should be processed separately in a parallel section of code. After all files have been processed, the per-file results should be combined into an overall summary.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;
    using System.Text;
    using System.IO;
    using System.Diagnostics.Eventing.Reader;
    using System.Threading.Tasks;

    namespace Лабораторная_2
    {
        class countFiles
        {
            ///string archiveDirectory = @"D:\logfiles";
            string[] filelist = Directory.GetFiles(@"D:\logfiles", "*.txt");
            public string user { get; set; }
            public string adres { get; set; }
            public string trafik { get; set; }
            public string data { get; set; }

            void parsingfiles(string line)
            {
                foreach (string file_to_read in filelist)
                {
                }
                string[] parts = line.Split('\t');
                user = parts[0];
                adres = parts[1];
                trafik = parts[2];
                data = parts[3];
            }

            public void ReadFile(string filename)
            {
                using (StreamReader sr = new StreamReader(filename))
                {
                    string line;
                    while ((line = sr.ReadLine()) != null)
                    {
                        parsingfiles(line);
                    }
                }
            }
        }

        class StatLog : countFiles
        {
            public Dictionary<String, UInt64> userstat;
            public Dictionary<String, UInt64> adrestat;
            public Dictionary<int, UInt64> trafikstat;

            public StatLog() // constructor
            {
                userstat = new Dictionary<String, UInt64>();
                adrestat = new Dictionary<String, UInt64>();
                trafikstat = new Dictionary<int, UInt64>();
            }

            public StatLog createStat(StatLog info)
            {
                UInt64 value;
                foreach (var item in info.userstat)
                {
                    userstat[item.Key] = (userstat.TryGetValue(item.Key, out value) ? value : 0) + item.Value;
                }
                foreach (var item in info.adrestat)
                {
                    adrestat[item.Key.Trim()] = (adrestat.TryGetValue(item.Key.Trim(), out value) ? value : 0) + item.Value;
                }
                foreach (var item in info.trafikstat)
                {
                    trafikstat[item.Key] = (trafikstat.TryGetValue(item.Key, out value) ? value : 0) + item.Value;
                }
                return this;
            }
        }

        class injectfile : StatLog
        {
            static Queue<String> m_workFiles = new Queue<String>(); // queue of files from the directory
            static System.Collections.Generic.List<StatLog> m_threadResult; // per-thread results
            static bool m_iscomplete = false; // input-complete flag
            static readonly object m_locker = new object(); // lock guarding the file queue

            public void injected(String filelist) // load files into the queue for processing
            {
                if (!Directory.Exists(filelist))
                {
                    return;
                }
                lock (m_locker)
                {
                    foreach (var x in Directory.EnumerateFiles(filelist))
                    {
                        m_workFiles.Enqueue(x);
                    }
                }
                m_iscomplete = true; // set the completion flag
            }

            static StatLog processFile(String file) // process one file
            {
                if (!File.Exists(file)) // check that the file exists
                {
                    return new StatLog();
                }
                StreamReader sr = System.IO.File.OpenText(file);
                String line;
                StatLog sc = new StatLog();
                while ((line = sr.ReadLine()) != null)
                {
                    var e = countFiles.parsingfile(line);
                }
                return sc;
            }

            static public void threadFunc(int id)
            {
                String value;
                StatLog localStat = m_threadResult[id];
                value = m_workFiles.Dequeue(); // take the first file off the queue
                var fileStat = processFile(value); // process the file
                localStat.createStat(fileStat); // merge the statistics
            }
        }

        class Program
        {
            static void Main(string[] args)
            {
                int threadCount = 7; // number of threads
                string[] filelist = Directory.GetFiles(@"D:\logfiles", "*.txt");
                System.Threading.Thread[] threads = new System.Threading.Thread[threadCount];
                injectfile inF = new injectfile();
                inF.injected(filelist);
                foreach (var t in threads)
                {
                    inF.createStat();
                }

                // create the stream
                StreamWriter sw = new StreamWriter(@"D:\userstat.txt");
                // write to the file
                for (int i = 0; i < .Count; i++)
                {
                    ///sw.WriteLine(Txt_Struct[i].user);
                }
                sw.Close();

                StreamWriter sw1 = new StreamWriter(@"C:\adrestat.txt");
                // write to the file
                for (int i = 0; i < .Count; i++)
                {
                    //sw1.WriteLine(Txt_Struct[i].adres);
                }
                sw1.Close();

                StreamWriter sw2 = new StreamWriter(@"D:\trafikstat.txt");
                // write to the file
                for (int i = 0; i < .Count; i++)
                {
                    /// sw2.WriteLine(Txt_Struct[i].trafik);
                }
                sw2.Close();
            }
        }
    }
-
You've already found your files, so there's nothing more to do on that front. What remains is to parse these files and aggregate the data. After that you can output the processed data, store it, or do whatever you like with it. The files all have the same format, if I understand correctly, so you can run the same code in several threads.
There are two ways to split the work. The aggregation for each file can be done inside the worker thread, with the finished result handed to the main thread at the end and merged there. Alternatively, the worker can just parse each log line and pass the extracted fields to the parent thread, which does the aggregating. The difference is that in the first case we use more memory but don't get in each other's way passing data between threads, while in the second case it is the opposite.
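A minimal sketch of the first option, assuming each worker fills its own container and the main thread merges the containers once the workers are done. `TrafficStats` and its member names are hypothetical, not taken from the question's code; the merge uses the same `TryGetValue` pattern as the question's `createStat`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for the three totals the assignment asks for.
public sealed class TrafficStats
{
    public Dictionary<string, ulong> ByUser { get; } = new Dictionary<string, ulong>();
    public Dictionary<string, ulong> ByDomain { get; } = new Dictionary<string, ulong>();
    public Dictionary<string, ulong> ByDate { get; } = new Dictionary<string, ulong>();

    // Called by a worker for every parsed log line of its own file.
    public void Add(string user, string domain, ulong traffic, string date)
    {
        AddTo(ByUser, user, traffic);
        AddTo(ByDomain, domain, traffic);
        AddTo(ByDate, date, traffic);
    }

    // Called by the main thread to fold one worker's result into the summary.
    public void Merge(TrafficStats other)
    {
        foreach (var kv in other.ByUser) AddTo(ByUser, kv.Key, kv.Value);
        foreach (var kv in other.ByDomain) AddTo(ByDomain, kv.Key, kv.Value);
        foreach (var kv in other.ByDate) AddTo(ByDate, kv.Key, kv.Value);
    }

    private static void AddTo(Dictionary<string, ulong> totals, string key, ulong amount)
    {
        // TryGetValue leaves 'current' at 0 when the key is not present yet.
        totals.TryGetValue(key, out ulong current);
        totals[key] = current + amount;
    }
}
```

Because every worker owns its own `TrafficStats`, `Add` needs no locking; only `Merge` touches the shared summary, and it runs after the workers have finished (or under a lock if it is called from the workers themselves).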
A bunch of threads needs some kind of manager. In the simplest version, it starts as many threads as there are log files and collects their data until all the threads have finished. That said, the whole thing will still be held back by the disk. In fact, this seemingly classic example of parallelization may not show any marked reduction in the run time. In any case, look at the specific task and don't parallelize when it brings no performance gain.
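One way such a manager could look, with a substitution worth naming plainly: instead of starting one thread per log file, this sketch uses `Parallel.ForEach` with a cap on concurrency, which keeps the number of simultaneous disk readers under control. `LogManager`, `ProcessFile`, and `ProcessAll` are hypothetical names, the four-column tab- or space-separated format is assumed from the sample in the question, and only per-user totals are shown to keep it short (domain and date totals would follow the same pattern).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class LogManager
{
    private static readonly char[] Separators = { ' ', '\t' };

    // Aggregates one file into a private dictionary; touches no shared state.
    public static Dictionary<string, ulong> ProcessFile(string path)
    {
        var totals = new Dictionary<string, ulong>();
        foreach (string line in File.ReadLines(path))
        {
            string[] parts = line.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length != 4 || !ulong.TryParse(parts[2], out ulong traffic))
                continue; // header, blank, or malformed line
            totals.TryGetValue(parts[0], out ulong current);
            totals[parts[0]] = current + traffic;
        }
        return totals;
    }

    // The "manager": runs ProcessFile on up to maxThreads files at a time
    // and merges each per-file result into the summary under a lock.
    public static Dictionary<string, ulong> ProcessAll(IEnumerable<string> files, int maxThreads)
    {
        var summary = new Dictionary<string, ulong>();
        var gate = new object();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        Parallel.ForEach(files, options, file =>
        {
            Dictionary<string, ulong> local = ProcessFile(file);
            lock (gate) // one short critical section per file, not per line
            {
                foreach (var kv in local)
                {
                    summary.TryGetValue(kv.Key, out ulong current);
                    summary[kv.Key] = current + kv.Value;
                }
            }
        });
        return summary;
    }
}
```

Calling `LogManager.ProcessAll(Directory.GetFiles(dir, "*.txt"), 4)` and then the same call with `maxThreads` set to 1 is a quick way to check whether the parallel version actually gains anything on a given disk.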
The second thing that slows the task down is parsing the log line. The simplest and most convenient approach is to write a regular expression and get everything handed to you on a platter. But it is not the most efficient one. On a similar task of mine with 31 gigabytes of logs, regular expressions took 51 minutes, while Split and Substring took 8.
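A sketch of the Split-based parsing, assuming the four-column layout from the question's sample (user, domain, traffic, date) separated by tabs or spaces. `LogEntry` and `LogParser` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Globalization;

// Hypothetical record for one parsed log line.
public readonly struct LogEntry
{
    public LogEntry(string user, string domain, ulong traffic, string date)
    {
        User = user;
        Domain = domain;
        Traffic = traffic;
        Date = date;
    }

    public string User { get; }
    public string Domain { get; }
    public ulong Traffic { get; }
    public string Date { get; }
}

public static class LogParser
{
    private static readonly char[] Separators = { ' ', '\t' };

    // Returns false for the header row, blank lines, and malformed lines
    // instead of throwing, so one bad line does not abort a whole file.
    public static bool TryParse(string line, out LogEntry entry)
    {
        entry = default;
        if (string.IsNullOrWhiteSpace(line)) return false;

        string[] parts = line.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length != 4) return false;

        // NumberStyles.None accepts digits only, so the header word "trafik" is rejected.
        if (!ulong.TryParse(parts[2], NumberStyles.None, CultureInfo.InvariantCulture, out ulong traffic))
            return false;

        entry = new LogEntry(parts[0], parts[1], traffic, parts[3]);
        return true;
    }
}
```

The date is kept as a string here because the sample shows only day and month; whether it should be parsed into a real date depends on how the date statistics are meant to be keyed.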
As for multithreading in C#, you have MSDN at hand and a sea of examples.
And the most important thing, since you've been given this assignment: write your own multithreaded code, play with it, and then tell your instructor in which cases it isn't worth it. You'll earn your credit fair and square.