company_banner

Tips and tricks from my Telegram-channel @pythonetc, March 2019

    image

    It is a new selection of tips and tricks about Python and programming from my Telegram-channel @pythonetc.

    Previous publications.

    0_0


    0_0 is a totally valid Python expression.

    Sorting a list with None


    Sorting a list with None values can be challenging:

    In [1]: data = [
       ...:     dict(a=1),
       ...:     None,
       ...:     dict(a=-3),
       ...:     dict(a=2),
       ...:     None,
       ...: ]
    
    In [2]: sorted(data, key=lambda x: x['a'])
    ...
    TypeError: 'NoneType' object is not subscriptable

    You may try to remove Nones and put them back after sorting (to the end or the beginning of the list depending on your task):

    In [3]: sorted(
       ...:     (d for d in data if d is not None),
       ...:     key=lambda x: x['a']
       ...: ) + [
       ...:     d for d in data if d is None
       ...: ]
    Out[3]: [{'a': -3}, {'a': 1}, {'a': 2}, None, None]

    That's a mouthful. The better solution is to use more complex key:

    In [4]: sorted(data, key=lambda x: float('inf') if x is None else x['a'])
    Out[4]: [{'a': -3}, {'a': 1}, {'a': 2}, None, None]

    For types where no infinity is available you can sort tuples instead:

    In [5]: sorted(data, key=lambda x: (1, None) if x is None else (0, x['a']))
    Out[5]: [{'a': -3}, {'a': 1}, {'a': 2}, None, None]
    

    Calling random.seed()


    When you fork your process, the random seed you are using is copying across processes. That may lead to processes producing the same «random» result.

    To avoid this, you have to manually call random.seed() in every process.

    However, that is not the case if you are using the multiprocessing module, it is doing exactly that for you.

    Here is an example:

    import multiprocessing              
    import random                       
    import os                           
    import sys                          
                                        
    def test(a):                        
        print(random.choice(a), end=' ')
                                        
    a = [1, 2, 3, 4, 5]                 
                                        
    for _ in range(5):                  
        test(a)                         
    print()                             
                                        
                                        
    for _ in range(5):                  
        p = multiprocessing.Process(    
            target=test, args=(a,)      
        )                               
        p.start()                       
        p.join()                        
    print()                             
                                        
    for _ in range(5):                  
        pid = os.fork()                 
        if pid == 0:                    
            test(a)                     
            sys.exit()                  
        else:                           
            os.wait()                   
    print()

    The result is something like:

    4 4 4 5 5
    1 4 1 3 3
    2 2 2 2 2

    Moreover, if you are using Python 3.7 or newer, os.fork does the same as well, thanks to the new at_fork hook.

    The output of the above code for Python 3.7 is:

    1 2 2 1 5
    4 4 4 5 5
    2 4 1 3 1
    

    Adding to 0


    It looks like sum([a, b, c]) is equivalent for a + b + c, while in fact it's 0 + a + b + c. That means that it can't work with types that don't support adding to 0:

    class MyInt:
        def __init__(self, value):
            self.value = value
        def __add__(self, other):
            return type(self)(self.value + other.value)
        def __radd__(self, other):
            return self + other
        def __repr__(self):
            class_name = type(self).__name__
            return f'{class_name}({self.value})'
    In : sum([MyInt(1), MyInt(2)])
    ...
    AttributeError: 'int' object has no attribute 'value'

    To fix that you can provide custom start element that is used instead of 0:

    In : sum([MyInt(1), MyInt(2)], MyInt(0))
    Out: MyInt(3)

    sum is well-optimized for summation of float and int types but can handle any other custom type. However, it refuses to sum bytes, bytearray and str since join is well-optimized for this operation:

    In : sum(['a', 'b'], '')
    ...
    TypeError: sum() can't sum strings [use ''.join(seq) instead]
    In : ints = [x for x in range(10_000)]
    In : my_ints = [Int(x) for x in ints]
    In : %timeit sum(ints)
    68.3 µs ± 142 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    In : %timeit sum(my_ints, Int(0))
    5.81 ms ± 20.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)



    Index completions in Jupyter Notebook


    You can customize index completions in Jupyter notebook by providing the _ipython_key_completions_ method. This way you can control what is displayed when you press tab after something like d["x:



    Note that the method doesn't get the looked up string as an argument.
    Mail.ru Group
    975.15
    Строим Интернет
    Share post

    Comments 0

    Only users with full accounts can post comments. Log in, please.