Implementing a multithreaded program



  • I'm editing my question because I've moved on from the algorithm stage to writing the code.

    Fair enough, I'm not arguing.
    What I'd like to know is how to go through the files properly and record my statistics across all of them.
    The text files contain roughly the following:

    user    adres     trafik  data
    User1   Yandex    110     26.06
    User2   Yandex    600     23.07
    User3   Google    700     12.08
    User1   Yahoo     800     28.08
    User3   Google    100     13.09
    User2   Yandex    120     14.09
    User1   Google    140     27.09
    User3   Yahoo     100     23.10
    User2   Google    150     16.11
    User1   Yandex    160     17.11
    User3   Yahoo     110     24.11
    User2   Google    700     25.11
    User1   Yandex    900     18.12
    

    I've found and adapted a procedure for merging the statistics, but I need a hint on how to proceed.

    This is the lab assignment we were given (language: C#):

    Assuming that some directory on disk holds a large number of proxy server log files, write a program that computes Internet traffic statistics. The program should output three text documents: user statistics, domain statistics, and date statistics. As the statistic, use the total amount of traffic consumed, respectively, by each user over all days, in requests to each domain, and on each day.
    When developing the program, each file must be processed separately, in a parallel section of the code. After all files have been processed, the results for each of them must be combined into an overall summary.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;
    using System.Text;
    using System.IO;
    using System.Diagnostics.Eventing.Reader;
    using System.Threading.Tasks;
    

    namespace Лабораторная_2
    {

    class countFiles
    {      
        ///string archiveDirectory = @"D:\logfiles";
        string[] filelist = Directory.GetFiles(@"D:\logfiles", "*.txt");
        public string user 
        { get; set; }
        public string adres
        { get; set; }
        public string trafik
        { get; set; }
        public string data
        { get; set; }
    
        void parsingfiles(string line)
        {
            foreach (string file_to_read in filelist)
            {
    
            }
            string[] parts = line.Split('\t');
            user = parts[0];
            adres = parts[1];
            trafik = parts[2];
            data = parts[3];
        }
    
        public void ReadFile(string filename)
        {
            using (StreamReader sr = new StreamReader(filename))
            {
                string line;
                while ((line = sr.ReadLine()) != null)
                {
                    parsingfiles(line);
                }
            }
        }  
    

    }

    class StatLog:countFiles
    {
       public Dictionary<String, UInt64> userstat;
       public Dictionary<String, UInt64> adrestat;
       public Dictionary<int, UInt64> trafikstat;
    
       public StatLog() // constructor
       {
           userstat = new Dictionary<String, UInt64>();
           adrestat = new Dictionary<String, UInt64>();
           trafikstat = new Dictionary<int, UInt64>();
       }
    
     public StatLog createStat(StatLog info)//
     {
         UInt64 value;
         foreach (var item in info.userstat)
            {              
                userstat[item.Key] = (userstat.TryGetValue(item.Key, out value) ? value : 0) + item.Value;
            }
            foreach (var item in info.adrestat)
            {
                adrestat[item.Key.Trim()] = (adrestat.TryGetValue(item.Key.Trim(), out value) ? value : 0) + item.Value;
            }
            foreach (var item in info.trafikstat)
            {
                trafikstat[item.Key] = (trafikstat.TryGetValue(item.Key, out value) ? value : 0) + item.Value;
            }
            return this;
     }  
    }
    
    class injectfile : StatLog
    {
        static Queue<String> m_workFiles = new Queue<String>(); // queue of files from the directory
        static System.Collections.Generic.List<StatLog> m_threadResult; // per-thread results
        static bool m_iscomplete = false; // flag: all input has been queued
        static readonly object m_locker = new object(); // lock guarding the queue of files
    
        public void injected(String filelist) // put the directory's files into the queue for processing
        {
            if (!Directory.Exists(filelist))
            {
                return;
            }
            lock (m_locker)
            {
                foreach (var x in Directory.EnumerateFiles(filelist))
                {
                    m_workFiles.Enqueue(x);
                }
            }
            m_iscomplete = true; // set the completion flag
        }
    
        static StatLog processFile(String file) // process a single file
        {
            if (!File.Exists(file)) // check that the file exists in the directory
            {
                return new StatLog();
            }
            StreamReader sr = System.IO.File.OpenText(file);
            String line;
            StatLog sc = new StatLog();//
            while ((line = sr.ReadLine()) != null)
            {
                var e = countFiles.parsingfile(line);               
            }
            return sc;
        }
    
    
        static public void threadFunc(int id)
        {
            String value;
            StatLog localStat = m_threadResult[id];
                value = m_workFiles.Dequeue(); // take the first file from the queue, removing it from the queue
                var fileStat = processFile(value); // process the file
                localStat.createStat(fileStat); // merge the statistics
        }       
    }
    
    class Program
    {
            static void Main(string[] args)
        {
    
           int threadCount = 7; // set the number of threads
           string[] filelist = Directory.GetFiles(@"D:\logfiles", "*.txt");
           System.Threading.Thread[] threads = new System.Threading.Thread[threadCount];          
           injectfile inF=new injectfile();
                inF.injected(filelist);
                 foreach (var t in threads)
                 {
                     inF.createStat();
            }
            // Create the output stream
            StreamWriter sw = new StreamWriter(@"D:\userstat.txt");
            // Write to the file
            for (int i = 0; i < .Count; i++)
            {
                ///sw.WriteLine(Txt_Struct[i].user);
            }
            sw.Close();
    
    
            StreamWriter sw1 = new StreamWriter(@"C:\adrestat.txt");
            // Write to the file
            for (int i = 0; i < .Count; i++)
            {
                //sw1.WriteLine(Txt_Struct[i].adres);
    
            }
            sw1.Close();
    
            StreamWriter sw2 = new StreamWriter(@"D:\trafikstat.txt");
            // Write to the file
            for (int i = 0; i < .Count; i++)
            {
               /// sw2.WriteLine(Txt_Struct[i].trafik);             
            }
            sw2.Close();
        }
    }
    

    }



  • You've already found your files, so there is nothing more to do on that front. What remains is to parse those files and aggregate the data. After that you can print the processed data, store it, or do whatever you like with it. The files all have the same format, if I understand correctly, so you can run the same code in several threads.
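
    For the output step mentioned above, a minimal sketch might look like the following. It assumes the aggregated totals already sit in a dictionary keyed by user, domain, or date; the `StatWriter` class and its `Write` method are hypothetical names, not part of the question's code.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;

    // Hypothetical helper: writes one "key<TAB>total" line per dictionary entry.
    static class StatWriter
    {
        public static void Write(string path, IDictionary<string, ulong> totals)
        {
            // "using" closes the file even if an exception is thrown while writing.
            using (var writer = new StreamWriter(path))
            {
                // Sort by key so the output order does not depend on dictionary internals.
                foreach (var pair in totals.OrderBy(p => p.Key, StringComparer.Ordinal))
                {
                    writer.WriteLine(pair.Key + "\t" + pair.Value);
                }
            }
        }
    }
    ```

    Calling it three times, once per dictionary, would produce the three documents the assignment asks for.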

    The aggregation for each file can be done inside the worker thread, with the finished result handed to the main thread at the end and merged there. Alternatively, the thread can parse each log line and pass the extracted data to the parent thread right away. The difference is that in the first case we use more memory but the threads don't jostle each other while passing data around, and in the second case it is the other way round.
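
    The first variant (aggregate inside the thread, merge in the main thread) could be sketched roughly as below. This is only an illustration under assumptions: the lines are taken to have the four whitespace-separated columns from the question's sample, only the per-user totals are computed, and `PerFileAggregation`, `AggregateFile` and `AggregateAll` are made-up names.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Threading.Tasks;

    static class PerFileAggregation
    {
        // Builds a private dictionary (user -> total traffic) for one file.
        // No shared state is touched here, so no locking is needed.
        public static Dictionary<string, ulong> AggregateFile(string path)
        {
            var totals = new Dictionary<string, ulong>();
            foreach (string line in File.ReadLines(path))
            {
                // Assumption: columns are user, adres, trafik, data, separated by whitespace.
                string[] parts = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
                if (parts.Length < 4 || !ulong.TryParse(parts[2], out ulong traffic))
                    continue; // skips the header row and malformed lines

                totals.TryGetValue(parts[0], out ulong sum); // sum stays 0 if the key is new
                totals[parts[0]] = sum + traffic;
            }
            return totals;
        }

        // Starts one task per file, waits for all of them, then merges on the calling thread.
        public static Dictionary<string, ulong> AggregateAll(IEnumerable<string> paths)
        {
            Task<Dictionary<string, ulong>>[] tasks =
                paths.Select(p => Task.Run(() => AggregateFile(p))).ToArray();
            Task.WaitAll(tasks);

            var merged = new Dictionary<string, ulong>();
            foreach (var task in tasks)
            {
                foreach (var pair in task.Result)
                {
                    merged.TryGetValue(pair.Key, out ulong sum);
                    merged[pair.Key] = sum + pair.Value;
                }
            }
            return merged;
        }
    }
    ```

    The merge runs only after every task has finished, so the shared dictionary is never written from two threads at once.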

    A group of threads needs some kind of manager. In the simplest version, it starts as many threads as there are logs and collects data until all of the threads have finished. Done that way, it will be slow because of the disk. In fact, this seemingly classic example of parallelization may not show any noticeable reduction in the running time. In any case, you need to look at the specific task and not parallelize where it brings no performance gain.
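
    One way to let the framework play the manager, while capping the number of files read at the same time, is `Parallel.ForEach` with `ParallelOptions.MaxDegreeOfParallelism` and per-worker state. The sketch below is an assumption-laden illustration rather than a tuned solution: `BoundedManager`, `TrafficByDate` and the `maxWorkers` parameter are hypothetical, the date column is kept as a plain string, and the right worker count has to be found by measuring on the actual disk.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Threading.Tasks;

    static class BoundedManager
    {
        public static Dictionary<string, ulong> TrafficByDate(IEnumerable<string> paths, int maxWorkers)
        {
            var merged = new Dictionary<string, ulong>();
            var mergeLock = new object();
            // At most maxWorkers files are processed concurrently.
            var options = new ParallelOptions { MaxDegreeOfParallelism = maxWorkers };

            Parallel.ForEach(
                paths,
                options,
                () => new Dictionary<string, ulong>(), // private dictionary for each worker
                (path, loopState, local) =>
                {
                    foreach (string line in File.ReadLines(path))
                    {
                        // Assumption: columns are user, adres, trafik, data, separated by whitespace.
                        string[] parts = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
                        if (parts.Length < 4 || !ulong.TryParse(parts[2], out ulong traffic))
                            continue; // header row or malformed line

                        local.TryGetValue(parts[3], out ulong sum);
                        local[parts[3]] = sum + traffic;
                    }
                    return local; // handed to the next file this worker picks up
                },
                local =>
                {
                    // Runs once per worker when it is done: the only place that takes the lock.
                    lock (mergeLock)
                    {
                        foreach (var pair in local)
                        {
                            merged.TryGetValue(pair.Key, out ulong sum);
                            merged[pair.Key] = sum + pair.Value;
                        }
                    }
                });

            return merged;
        }
    }
    ```

    Running it with `maxWorkers` set to 1 gives a sequential baseline to compare against, which is the quickest way to see whether parallelism helps at all on a given disk.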

    The second thing that slows the task down is parsing the log line. The simplest and most convenient approach is to throw a regular expression at it and get everything served on a plate, but that is not the most efficient way. In a similar task of mine, 31 gigabytes took 51 minutes with a regular expression and 8 minutes with Split and Substring.
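
    To make the comparison concrete, here are the two parsing styles side by side for the line format from the question. Both are sketches under the assumption of four whitespace-separated fields; `LogRecord`, `LineParsers` and the regular expression itself are illustrative and not taken from the original code. No timing is claimed for this snippet; measure on your own logs.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    // Hypothetical record type for one parsed log line.
    sealed class LogRecord
    {
        public string User;
        public string Address;
        public ulong Traffic;
        public string Date;
    }

    static class LineParsers
    {
        // Compiled once and reused for every line.
        static readonly Regex LinePattern =
            new Regex(@"^\s*(\S+)\s+(\S+)\s+([0-9]+)\s+(\S+)\s*$", RegexOptions.Compiled);

        // Convenient: the pattern validates and extracts in one step. Returns null on a mismatch.
        public static LogRecord ParseWithRegex(string line)
        {
            Match m = LinePattern.Match(line);
            if (!m.Success || !ulong.TryParse(m.Groups[3].Value, out ulong traffic))
                return null;
            return new LogRecord
            {
                User = m.Groups[1].Value,
                Address = m.Groups[2].Value,
                Traffic = traffic,
                Date = m.Groups[4].Value
            };
        }

        // Plain string splitting: less machinery per line. Returns null on a mismatch.
        public static LogRecord ParseWithSplit(string line)
        {
            string[] parts = line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length != 4 || !ulong.TryParse(parts[2], out ulong traffic))
                return null;
            return new LogRecord { User = parts[0], Address = parts[1], Traffic = traffic, Date = parts[3] };
        }
    }
    ```

    Both methods reject the header row, because `trafik` is not a number, so either can be dropped into the per-file loop unchanged.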

    As for multithreading in C#, you have MSDN at hand and a sea of examples.

    And most importantly, since this is your assignment: write your own multithreaded code, experiment with it, and then tell your instructor in which cases it doesn't help. You'll earn your automatic pass fair and square.

