Multiprocessing hangs with a large number of processes



  • Hello. I have code like this:

    import time
    from multiprocessing import Process


    def actions(host, i):
        while True:
            time.sleep(1)


    def main():
        hosts = []
        with open('host.txt') as hosts_file:
            for line in hosts_file:
                hosts.append(line.strip())

        for i, host in enumerate(hosts):
            thread = Process(target=actions, args=(host, i))
            thread.start()


    if __name__ == '__main__':
        try:
            main()
        except KeyboardInterrupt:
            pass

    The code reads a list of hosts from the hosts.txt file and starts a separate process (multiprocessing) for each host; each process just sleeps in an infinite loop. The problem is that even with the actions function containing no real work (as in the example above), some time after startup the program hangs the entire OS (Ubuntu 16.04). It looks like all the RAM gets eaten up, and there is 32 GB of it. Please suggest how to optimize this code.



  • Optimization number zero is to clarify the task and perhaps find a different approach: it may well be possible to avoid running 3,000 parallel processing tasks at all.

    The first optimization is to move from processes to threads. As I understand it, the task is I/O-bound, and a thread uses noticeably less memory than a process (a thread-based sketch is shown below).

    The second optimization: if 3,000 threads are still too many and memory is still running out, you can try making the application asynchronous; look toward https://docs.python.org/3/library/asyncio.html (an asyncio sketch is also shown below).
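
    A minimal sketch of what the thread-based variant might look like, assuming the per-host work is the same placeholder sleep loop as in the question (the real I/O work would go into actions):

    import time
    import threading


    def actions(host, i):
        # Placeholder for the real per-host work, as in the question
        while True:
            time.sleep(1)


    def main():
        with open('host.txt') as hosts_file:
            hosts = [line.strip() for line in hosts_file]

        for i, host in enumerate(hosts):
            # One thread per host instead of one process per host
            threading.Thread(target=actions, args=(host, i)).start()


    if __name__ == '__main__':
        try:
            main()
        except KeyboardInterrupt:
            pass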
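
    And a rough asyncio sketch of the same skeleton (written against the event-loop API available in Python 3.5, which Ubuntu 16.04 ships with; to actually benefit, the real per-host work must use non-blocking I/O):

    import asyncio


    async def actions(host, i):
        # Placeholder: real work should use non-blocking I/O (e.g. asyncio streams)
        while True:
            await asyncio.sleep(1)


    async def main():
        with open('host.txt') as hosts_file:
            hosts = [line.strip() for line in hosts_file]

        # One lightweight coroutine per host instead of a process or thread per host
        await asyncio.gather(*(actions(host, i) for i, host in enumerate(hosts)))


    if __name__ == '__main__':
        loop = asyncio.get_event_loop()
        try:
            loop.run_until_complete(main())
        except KeyboardInterrupt:
            pass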



