Why does a pandas join on MultiIndexes with differently ordered levels cause a stack overflow?


When I join two pandas DataFrames whose MultiIndex levels are not in the same order, Python crashes with error code 0xC00000FD. It took me a while to find the bug, and when I found it I was even more confused. Why does this happen, and how can I get better at spotting it?

Consider the following code:

import pandas as pd

df = pd.DataFrame.from_records([
    ("foo", 1, .1),
    ("foo", 2, .2),
    ("bar", 1, .3),
    ("bar", 2, .4)
], columns=["Level1", "Level2", "Value"])


df2 = df.set_index(["Level1","Level2"])
df3 = df.set_index(["Level2","Level1"])

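# Same level order on both sides: works.
combination = pd.merge(left=df2, right=df2, left_index=True, right_index=True)
print("ok")
# Level order differs (Level1, Level2 vs. Level2, Level1): the interpreter crashes here.
combination2 = pd.merge(left=df2, right=df3, left_index=True, right_index=True)
print("fail")  # never reached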

It gives the following output:

  Level1  Level2  Value
0    foo       1    0.1
1    foo       2    0.2
               Value
Level1 Level2       
foo    1         0.1
       2         0.2
               Value
Level2 Level1       
1      foo       0.1
2      foo       0.2
ok
Process finished with exit code -1073741571 (0xC00000FD)

The problem arose when I had a table, made some pivots/melts, and wanted to go back to the original MultiIndex and join my results. At first I did not understand the error message. From this page I learned that the error code means a stack overflow. From this PR to pandas I learned that the recursion is somehow related to calculating group indices. With this information I was able to hunt down my bug and create a minimal reproducing example.
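For reference, the negative exit code in the output and the hexadecimal code are the same 32-bit value; 0xC00000FD is the Windows status code for a stack overflow. A minimal sketch of the conversion (the variable names are only illustrative):

```python
# Exit code as reported in the output above (signed 32-bit integer).
exit_code = -1073741571

# Reinterpret it as an unsigned 32-bit value to get the Windows status code.
ntstatus = exit_code & 0xFFFFFFFF

print(hex(ntstatus))  # 0xc00000fd
```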

Now to my questions:

  • Why does this happen? Shouldn't pandas give a more informative error? My intuition says that a mismatch in level order should be easy to detect.
  • Is there any way to programmatically identify these errors, instead of ending up with a stack overflow, uninformative errors and a broken debugger?
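One possible guard for the second point is sketched below. It assumes both frames are meant to share the same set of MultiIndex level names, and it compares `index.names` before merging. The helpers `check_index_levels` and `align_index_levels` are hypothetical names made up for this sketch, not part of pandas; `DataFrame.reorder_levels` is the existing pandas method used to fix the order.

```python
import pandas as pd

df = pd.DataFrame.from_records([
    ("foo", 1, .1),
    ("foo", 2, .2),
    ("bar", 1, .3),
    ("bar", 2, .4)
], columns=["Level1", "Level2", "Value"])

df2 = df.set_index(["Level1", "Level2"])
df3 = df.set_index(["Level2", "Level1"])


def check_index_levels(left, right):
    """Hypothetical helper: raise if the index level names differ in content or order."""
    left_names = list(left.index.names)
    right_names = list(right.index.names)
    if left_names != right_names:
        raise ValueError(
            f"index levels do not match: {left_names} vs {right_names}"
        )


def align_index_levels(left, right):
    """Hypothetical helper: return `right` with its levels in the order used by `left`."""
    if set(left.index.names) != set(right.index.names):
        raise ValueError("index level names differ, cannot reorder")
    return right.reorder_levels(list(left.index.names))


# The mismatch is reported as an ordinary exception before any merge is attempted.
try:
    check_index_levels(df2, df3)
except ValueError as err:
    print(err)

# After reordering, both frames use (Level1, Level2) and the check passes.
df3_aligned = align_index_levels(df2, df3)
check_index_levels(df2, df3_aligned)
combination2 = pd.merge(left=df2, right=df3_aligned, left_index=True, right_index=True)
print(combination2)
```

This only works around the crash by making the level order identical on both sides; it does not explain why the unaligned merge recurses.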

EDIT: The error as stated occurs with the latest Python and pandas versions to date, pandas 0.24.2 and Python 3.7.3.

python
pandas
dataframe
recursion
join
asked on Stack Overflow May 28, 2019 by LudvigH • edited May 28, 2019 by LudvigH



User contributions licensed under CC BY-SA 3.0