Improve Performance of Comparing two Numpy Arrays


I had a code challenge for a class I'm taking, in which we built a NN algorithm. I got it to work, but I used really basic methods to solve it. There are two 1D NumPy arrays of equal length, each holding values from 0 to 2. They represent labels from the train and test data (what exactly they stand for doesn't matter here). The output is a confusion matrix that shows which entries received the right predictions and which received the wrong ones.



This code is correct; I just feel I took the lazy way out by working with lists and then turning those lists into an ndarray. Does anyone have tips on using NumPy for this? Anything clever?



import numpy as np

x = [0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
     2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 2, 0, 0, 0, 0, 0, 1, 0]
y = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
     2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

testy = np.array(x)
testy_fit = np.array(y)

row_no = [0,0,0]
row_dh = [0,0,0]
row_sl = [0,0,0]

# Code for the first row - NO
for i in range(len(testy)):
    if testy.item(i) == 0 and testy_fit.item(i) == 0:
        row_no[0] += 1
    elif testy.item(i) == 0 and testy_fit.item(i) == 1:
        row_no[1] += 1
    elif testy.item(i) == 0 and testy_fit.item(i) == 2:
        row_no[2] += 1

# Code for the second row - DH
for i in range(len(testy)):
    if testy.item(i) == 1 and testy_fit.item(i) == 0:
        row_dh[0] += 1
    elif testy.item(i) == 1 and testy_fit.item(i) == 1:
        row_dh[1] += 1
    elif testy.item(i) == 1 and testy_fit.item(i) == 2:
        row_dh[2] += 1

# Code for the third row - SL
for i in range(len(testy)):
    if testy.item(i) == 2 and testy_fit.item(i) == 0:
        row_sl[0] += 1
    elif testy.item(i) == 2 and testy_fit.item(i) == 1:
        row_sl[1] += 1
    elif testy.item(i) == 2 and testy_fit.item(i) == 2:
        row_sl[2] += 1

confusion = np.array([row_no,row_dh,row_sl])

print(confusion)



The printed result is correct, as follows:



[[16 10  0]
 [ 2 10  0]
 [ 2  0 22]]






python numpy
















asked May 5 at 23:36









broepke





migrated from stackoverflow.com May 5 at 23:52


This question came from our site for professional and enthusiast programmers.


















  • Good thing this got an answer on SO before it was moved. Performance questions for numpy are routine on SO. – hpaulj, May 6 at 0:15






















2 Answers
    This can be implemented concisely by using numpy.add.at:



    In [2]: c = np.zeros((3, 3), dtype=int)

    In [3]: np.add.at(c, (x, y), 1)

    In [4]: c
    Out[4]:
    array([[16, 10,  0],
           [ 2, 10,  0],
           [ 2,  0, 22]])
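
The session above can be sketched as a self-contained script. This is only an illustration under stated assumptions: the helper names count_pairs and count_pairs_bincount are made up here and do not come from the answer, x is copied from the question, and y is written compactly but holds the same values as the question's y. np.add.at is used rather than c[x, y] += 1 because in-place addition through fancy indexing is buffered, so repeated index pairs would be counted only once. The np.bincount variant is an alternative the answer does not mention, included only as a cross-check.

```python
import numpy as np

# x is copied from the question; y has the same values as the question's y
# (20 ones, then 22 twos, then 20 zeros), written compactly.
x = [0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
     2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 2, 0, 0, 0, 0, 0, 1, 0]
y = [1] * 20 + [2] * 22 + [0] * 20


def count_pairs(rows, cols, n_classes=3):
    """Count how often each (row, col) label pair occurs (hypothetical helper)."""
    counts = np.zeros((n_classes, n_classes), dtype=int)
    # Unbuffered in-place add: repeated index pairs accumulate.
    np.add.at(counts, (np.asarray(rows), np.asarray(cols)), 1)
    return counts


def count_pairs_bincount(rows, cols, n_classes=3):
    """Same counts via a flattened index (an alternative, not from the answer)."""
    flat = np.asarray(rows) * n_classes + np.asarray(cols)
    return np.bincount(flat, minlength=n_classes ** 2).reshape(n_classes, n_classes)


# Buffered fancy indexing, for contrast: every occupied cell is incremented once.
buffered = np.zeros((3, 3), dtype=int)
buffered[x, y] += 1

c = count_pairs(x, y)
print(c)
print(np.array_equal(c, count_pairs_bincount(x, y)))
print(buffered.max())
```

The first print should show the same matrix as the question's loop-based code; the buffered version is there only to show why a plain `+=` is not a drop-in replacement.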
















    answered May 5 at 23:41







    Warren Weckesser


















    • Oh my! I thought there would be something better, but I didn't think it would be 1 line of code! Wow. So glad I asked, and thank you! – broepke, May 6 at 2:04






    • Rule #1 of numpy: if you want to do something, check the docs first for a one-line solution. – Oscar Smith, May 6 at 5:39
















        Setting aside for now that there is a (much) better NumPy solution to this, as explained in the answer by @WarrenWeckesser, here is a short review of your actual code.




        • testy.item(i) is a very unusual way to say testy[i]. It returns a plain Python scalar instead of a NumPy scalar, which can make single-element access somewhat faster, but indexing is the idiom most readers expect.


        • Don't repeat yourself. You test e.g. if testy.item(i) == 0 three times, each time with a different second condition. Just nest them in an if block:



          for i in range(len(testy)):
              if testy[i] == 0:
                  if testy_fit[i] == 0:
                      row_no[0] += 1
                  elif testy_fit[i] == 1:
                      row_no[1] += 1
                  elif testy_fit[i] == 2:
                      row_no[2] += 1



        • Loop like a native. Don't iterate over the indices of iterables, iterate over the iterable(s)! You can also use the fact that the value encodes the position you want to increment:



          for test, fit in zip(testy, testy_fit):
              if test == 0 and fit in {0, 1, 2}:
                  row_no[fit] += 1



        • You can even use the fact that the first value encodes the list you want to use and iterate only once. Or even better, make it a list of lists right away:



          n = 3
          confusion_matrix = [[0] * n for _ in range(n)]
          for test, fit in zip(testy, testy_fit):
              confusion_matrix[test][fit] += 1

          print(np.array(confusion_matrix))



        • Don't put everything into the global namespace, where it runs whenever you interact with the script at all. Put your code into functions, document them with a docstring, and execute them under an if __name__ == "__main__": guard, which allows you to import from this script in another script without your code running:



          def confusion_matrix(x, y):
              """Return the confusion matrix for two vectors `x` and `y`.
              x and y must only have integer values from 0 to n - 1 and 0 to m - 1,
              respectively, where the result has n rows and m columns.
              """
              n, m = np.max(x) + 1, np.max(y) + 1
              matrix = [[0] * m for _ in range(n)]
              for a, b in zip(x, y):
                  matrix[a][b] += 1
              return matrix

          if __name__ == "__main__":
              x = ...
              y = ...
              print(np.array(confusion_matrix(x, y)))


        Once you have come this far, you can swap the implementation of this function for the faster NumPy one without changing anything else (except that it then directly returns a numpy.ndarray instead of a list of lists).
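
One way that swap might look is sketched below. It keeps the confusion_matrix name and signature from the function above and fills the matrix with np.add.at, as in the other answer. It assumes non-empty inputs of equal length that hold non-negative integer labels. The small x and y in the guard are made-up illustration data, not the question's.

```python
import numpy as np


def confusion_matrix(x, y):
    """Return the confusion matrix for two label vectors `x` and `y`.

    Assumes non-empty inputs of equal length containing non-negative
    integers; rows follow `x`, columns follow `y`.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    n, m = x.max() + 1, y.max() + 1
    matrix = np.zeros((n, m), dtype=int)
    # Unbuffered add, so repeated (x, y) pairs are each counted.
    np.add.at(matrix, (x, y), 1)
    return matrix


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    x = [0, 0, 1, 2, 2, 2]
    y = [0, 1, 1, 2, 2, 0]
    print(confusion_matrix(x, y))
```

Callers that still wrap the result in np.array(...) keep working, since np.array accepts an existing array.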

















        answered May 6 at 6:58









        Graipher


























